Dear Pig Users,

My name is Aravind Srinivasan, and I am the Product Manager for Pig at Yahoo. The Pig team would love to get your feedback on the proposal below. We are trying to find out whether this enhancement would break backward compatibility for your systems and, if so, how you weigh the cost against the benefit. Please drop me an e-mail ([email protected]) if you have an opinion on this.

Summary: Currently, if PARALLEL is not specified, the default value is 1, which most of the time is not what users want and has caused problems on clusters in the past. The proposal is to use a very basic heuristic based on the input size to set a better default. This can be an issue for users who expect just a single part file in the output, since a default greater than 1 could produce more than one part file.

Jira for your reference: https://issues.apache.org/jira/browse/PIG-1249

Thanks,
Aravind
