set default_parallelism construct does not set the number of reducers correctly
-------------------------------------------------------------------------------

                 Key: PIG-1144
                 URL: https://issues.apache.org/jira/browse/PIG-1144
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.7.0
         Environment: Hadoop 20 cluster with multi-node installation
            Reporter: Viraj Bhat
             Fix For: 0.7.0


Hi all,

I have a Pig script where I set the parallelism using the following set construct: "set default_parallel 100". I modified "MRPrinter.java" to print out the parallelism of each MapReduce operator:

{code}
...
public void visitMROp(MapReduceOper mr) {
    mStream.println("MapReduce node " + mr.getOperatorKey().toString()
            + " Parallelism " + mr.getRequestedParallelism());
}
...
{code}

When I run an explain on the script, I see that the last job, which does the actual sort, runs as a single-reducer job. This can be corrected by adding a PARALLEL clause to the ORDER BY. Attaching the script and the explain output.

Viraj

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
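
A minimal Pig Latin sketch of the reported behavior and the workaround (the input path, aliases, and field names below are hypothetical, not from the attached script):

{code}
set default_parallel 100;

A = LOAD 'input' AS (name:chararray, cnt:int);
B = GROUP A BY name;                               -- honors default_parallel (100 reducers)
C = FOREACH B GENERATE group, SUM(A.cnt) AS total;
-- Bug: without an explicit PARALLEL clause, this sort job
-- runs with a single reducer despite default_parallel.
D = ORDER C BY total DESC PARALLEL 100;            -- workaround: set parallelism explicitly
STORE D INTO 'output';
{code}
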
Key: PIG-1144 URL: https://issues.apache.org/jira/browse/PIG-1144 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.7.0 Environment: Hadoop 20 cluster with multi-node installation Reporter: Viraj Bhat Fix For: 0.7.0 Hi all, I have a Pig script where I set the parallelism using the following set construct: "set default_parallel 100" . I modified the "MRPrinter.java" to printout the parallelism {code} ... public void visitMROp(MapReduceOper mr) mStream.println("MapReduce node " + mr.getOperatorKey().toString() + " Parallelism " + mr.getRequestedParallelism()); ... {code} When I run an explain on the script, I see that the last job which does the actual sort, runs as a single reducer job. This can be corrected, by adding the PARALLEL keyword in front of the ORDER BY. Attaching the script and the explain output Viraj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.