Rohini Palaniswamy created PIG-3878:
---------------------------------------

             Summary: Improve parallelism of union and join
                 Key: PIG-3878
                 URL: https://issues.apache.org/jira/browse/PIG-3878
             Project: Pig
          Issue Type: Sub-task
            Reporter: Rohini Palaniswamy
            Assignee: Rohini Palaniswamy
             Fix For: tez-branch


Currently, if the user does not specify a PARALLEL clause, parallelism defaults 
to 1, which is bad for performance. MR does not have this issue because, for 
each job, the number of mappers is determined by the input splits and the 
number of reducers by InputSizeReducerEstimator. Automatic reducer parallelism 
(ARP) for Tez in general will be handled in separate JIRAs. As a quick 
workaround for joins and unions until ARP is in place and better estimation is 
done, the parallelism of the reduce task can be set to the sum of the join 
tasks. 
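
A minimal sketch of the proposed workaround, not the actual Pig or Tez code. 
It reads "sum of join tasks" as the sum of the task counts of the inputs 
feeding the join or union, which is an assumption. The function name 
estimate_parallelism and its parameters are hypothetical.

```python
from typing import Optional, Sequence


def estimate_parallelism(requested: Optional[int],
                         input_task_counts: Sequence[int]) -> int:
    """Hypothetical helper: pick the parallelism for a join/union reduce task.

    requested         -- value of the PARALLEL clause, or None if not given
    input_task_counts -- number of tasks in each input feeding the join/union
    """
    # An explicit PARALLEL clause always wins.
    if requested is not None and requested > 0:
        return requested
    # Workaround: use the sum of the input task counts instead of
    # falling back to 1. Never return less than 1.
    return max(1, sum(input_task_counts))


# No PARALLEL clause, two inputs with 4 and 6 tasks.
print(estimate_parallelism(None, [4, 6]))
# Explicit PARALLEL 3 overrides the estimate.
print(estimate_parallelism(3, [4, 6]))
```

With no PARALLEL clause the first call uses the summed input tasks rather 
than 1. The second call keeps the user's explicit value.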



--
This message was sent by Atlassian JIRA
(v6.2#6252)
