[ https://issues.apache.org/jira/browse/PIG-16?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764990#action_12764990 ]
Jeff Zhang commented on PIG-16:
-------------------------------

It looks like this item has already been fixed in PIG-895.

> setting parallel from grunt via set command
> -------------------------------------------
>
>                 Key: PIG-16
>                 URL: https://issues.apache.org/jira/browse/PIG-16
>             Project: Pig
>          Issue Type: Improvement
>          Components: grunt
>            Reporter: Olga Natkovich
>            Priority: Minor
>
> I'd like to propose a different model which uses the grunt "set" option
> and/or a command line option which sets reduce parallelism to be true and
> automatic.
> set reduce_parallelism TRUE
> set reduce_parallelism FALSE [Default - BTW, why is this the default?]
> This way I won't have to update my script every single time I try playing
> with -D"hod=-m N", parallelism for reduce statements will default,
> appropriately, to 2*(N-1).
> Alternatively, could I just specify PARALLEL with no value or PARALLEL
> DEFAULT; And any time I needed to force reduce to be single job, I could
> write PARALLEL 1.
> Basically, this whole thing tripped me up for a long time and I just haven't
> understood if there is a really good reason to not make parallelism.
> I guess it might be if you have aggregation functions that do not
> parallelize.
> If this is the case, then it seems to me that this should be detectable
> automagically based on whether the function is a vanilla EvalFunction or if
> it is an AlgebraicFunction.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.