set default_parallelism construct does not set the number of reducers correctly
-------------------------------------------------------------------------------

                 Key: PIG-1144
                 URL: https://issues.apache.org/jira/browse/PIG-1144
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.7.0
         Environment: Hadoop 20 cluster with multi-node installation
            Reporter: Viraj Bhat
             Fix For: 0.7.0


Hi all,
 I have a Pig script where I set the parallelism using the following set 
construct: "set default_parallel 100". I modified "MRPrinter.java" to 
print out the parallelism of each MapReduce node:
{code}
...
public void visitMROp(MapReduceOper mr) {
    mStream.println("MapReduce node " + mr.getOperatorKey().toString() +
        " Parallelism " + mr.getRequestedParallelism());
}
...
{code}

When I run an explain on the script, I see that the last job, which performs 
the actual sort, runs with a single reducer. This can be worked around by 
adding the PARALLEL clause to the ORDER BY statement.
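To illustrate the behavior and the workaround, here is a minimal sketch of such a script (the aliases, schema, and file paths are hypothetical, not taken from the attached script):

{code}
-- Expected: "set default_parallel 100" should apply to every reduce phase,
-- including the final sort, but the ORDER BY job runs with a single reducer.
set default_parallel 100;
A = LOAD 'input' AS (name:chararray, cnt:int);
B = ORDER A BY cnt;              -- sort job ignores default_parallel
STORE B INTO 'output';

-- Workaround: state the parallelism explicitly on the ORDER BY.
C = ORDER A BY cnt PARALLEL 100; -- sort job now gets 100 reducers
STORE C INTO 'output_parallel';
{code}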

Attaching the script and the explain output

Viraj
