[ https://issues.apache.org/jira/browse/PIG-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649442#action_12649442 ]
Olga Natkovich commented on PIG-539:
------------------------------------

Chris, I don't disagree with you. I am just saying that at this point, this is not a bug but a feature that breaks backward compatibility. If you feel strongly that this is something we need to do, please post your proposal to the pig-user list so that others can comment on it. Also, there might be other ways, not just using parallel, that might address your problem.

> unable to control parallelism of Map tasks
> ------------------------------------------
>
> Key: PIG-539
> URL: https://issues.apache.org/jira/browse/PIG-539
> Project: Pig
> Issue Type: Bug
> Components: impl
> Environment: local execution + hadoop execution
> Reporter: Christopher Olston
>
> I put "PARALLEL 1" following *every* statement in my pig script, and it still
> executes maps with more than 1 parallel task. This is a major problem because
> for one of my operations I need to have a serialized (non-parallel) map.
> Probably the semantics of parallelism should be as follows:
> 1. group pig operators into map/reduce stages
> 2. for each stage, take the minimum of the "Parallel" directives given by
> the user for statements executed as part of that stage
> (We'll have to decide on a rule for statements that use the combiner, which
> execute partially on the map side and partially on the reduce side ...)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
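
For readers following the thread, the semantics proposed in the quoted issue description can be sketched roughly as follows. This is only an illustration of the "minimum per stage" rule, not Pig's implementation: the function name `stage_parallelism`, the stage names, and the input layout (statements already grouped into stages, each carrying an optional PARALLEL value) are all assumptions made for the example. The open question about combiner statements is not addressed here.

```python
from typing import Dict, List, Optional


def stage_parallelism(
    stages: Dict[str, List[Optional[int]]],
    default: Optional[int] = None,
) -> Dict[str, Optional[int]]:
    """Step 2 of the proposal: for each stage, take the minimum of the
    PARALLEL values the user gave for statements in that stage.

    `stages` maps a (hypothetical) stage name to the PARALLEL values of
    its statements; None means the statement carried no directive.
    """
    result: Dict[str, Optional[int]] = {}
    for name, hints in stages.items():
        given = [h for h in hints if h is not None]
        # With no directive in the stage, fall back to the default.
        result[name] = min(given) if given else default
    return result


# Hypothetical script, already grouped into stages (step 1 of the proposal).
stages = {
    "map_1": [1, 4],        # two statements: PARALLEL 1 and PARALLEL 4
    "reduce_1": [8, None],  # one statement with PARALLEL 8, one without
    "map_2": [None],        # no directive at all
}
plan = stage_parallelism(stages)
print(plan)
```

Under this rule a single "PARALLEL 1" anywhere in a stage would serialize the whole stage, which is the behavior the reporter asks for on the map side.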