[jira] Commented: (PIG-539) unable to control parallelism of Map tasks

Olga Natkovich (JIRA) Thu, 20 Nov 2008 10:41:08 -0800

    [ https://issues.apache.org/jira/browse/PIG-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649442#action_12649442 ]


Olga Natkovich commented on PIG-539:
------------------------------------

Chris,

I don't disagree with you. I am just saying that, at this point, this is not 
a bug but a feature that breaks backward compatibility. If you feel strongly 
that this is something we need to do, please post your proposal to the 
pig-user list so that others can comment on it. Also, there might be other 
ways, not just using parallel, that would address your problem.

> unable to control parallelism of Map tasks
> ------------------------------------------
>
>                 Key: PIG-539
>                 URL: https://issues.apache.org/jira/browse/PIG-539
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>         Environment: local execution + hadoop execution
>            Reporter: Christopher Olston
>
> I put "PARALLEL 1" following *every* statement in my pig script, and it still 
> executes maps with more than 1 parallel task. This is a major problem because 
> for one of my operations I need to have a serialized (non-parallel) map.
> Probably the semantics of parallelism should be as follows:
>  1. group pig operators into map/reduce stages
>  2. for each stage, take the minimum of the "Parallel" directives given by 
> the user for statements executed as part of that stage
> (We'll have to decide on a rule for statements that use the combiner, which 
> execute partially on the map side and partially on the reduce side ...)
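
The rule proposed in the description above (group operators into stages, then 
take the minimum "Parallel" value within each stage) can be sketched roughly 
as follows. This is only an illustration of the proposal, not how Pig 
computes parallelism: the function name, the stage labels, and the use of 
None for "no directive given" are all assumptions, and the open question 
about combiner statements is not handled.

```python
from typing import Dict, List, Optional


def stage_parallelism(
    stages: Dict[str, List[Optional[int]]],
) -> Dict[str, Optional[int]]:
    """Hypothetical helper: pick one parallelism value per map/reduce stage.

    `stages` maps a stage label to the PARALLEL values of the statements
    that were grouped into that stage (step 1 of the proposal). None means
    the statement carried no PARALLEL directive.
    """
    result: Dict[str, Optional[int]] = {}
    for stage, hints in stages.items():
        # Ignore statements with no directive.
        given = [h for h in hints if h is not None]
        # Step 2 of the proposal: the smallest directive in the stage wins.
        # With no directives at all, leave the choice to the system (None).
        result[stage] = min(given) if given else None
    return result


# Made-up plan: three stages with assorted directives.
plan = {
    "map-1": [1, None, 4],
    "reduce-1": [10, 2],
    "map-2": [None],
}
limits = stage_parallelism(plan)
```

Under this rule a single "PARALLEL 1" anywhere in a stage would serialize 
that whole stage, which is the behavior the reporter asks for on the map side.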

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
