[ https://issues.apache.org/jira/browse/PIG-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649439#action_12649439 ]

Christopher Olston commented on PIG-539:
----------------------------------------

A custom slicer is a pretty heavyweight thing for users. It would be nice to 
control map parallelism via the PARALLEL keyword when the default slicer is 
in use. Advanced users can do fancier things with a custom slicer.

Pig is about "making hadoop easy", remember.

> unable to control parallelism of Map tasks
> ------------------------------------------
>
>                 Key: PIG-539
>                 URL: https://issues.apache.org/jira/browse/PIG-539
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>         Environment: local execution + hadoop execution
>            Reporter: Christopher Olston
>
> I put "PARALLEL 1" following *every* statement in my pig script, and it still 
> executes maps with more than 1 parallel task. This is a major problem because 
> for one of my operations I need to have a serialized (non-parallel) map.
> Probably the semantics of parallelism should be as follows:
>  1. group pig operators into map/reduce stages
>  2. for each stage, take the minimum of the "Parallel" directives given by 
> the user for statements executed as part of that stage
> (We'll have to decide on a rule for statements that use the combiner, which 
> execute partially on the map side and partially on the reduce side ...)
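
The two-step rule proposed in the description above can be sketched as 
follows. This is only an illustration of the proposed semantics, not Pig's 
actual planner code: the function name, the stage names, and the plan layout 
are all hypothetical, and it assumes step 1 (grouping operators into stages) 
has already been done. The combiner case is left undecided, as in the 
description.

```python
def stage_parallelism(stages, default=None):
    """Return, per stage, the minimum PARALLEL value given by the user.

    `stages` maps a (hypothetical) stage name to the PARALLEL values of the
    statements grouped into that stage; None means the statement carried no
    PARALLEL directive. A stage with no directives at all gets `default`.
    """
    result = {}
    for stage_name, directives in stages.items():
        explicit = [p for p in directives if p is not None]
        result[stage_name] = min(explicit) if explicit else default
    return result


# Hypothetical plan, already grouped into map/reduce stages (step 1).
plan = {
    "map_1": [1, 4, None],   # one statement asks for PARALLEL 1
    "reduce_1": [8, 2],
    "map_2": [None],         # no directive given for this stage
}

levels = stage_parallelism(plan)
print(levels)
```

Under this rule a single "PARALLEL 1" anywhere in a stage is enough to 
serialize that stage, which is the behavior the report asks for.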

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
