[ 
https://issues.apache.org/jira/browse/PIG-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326829#comment-14326829
 ] 

Rohini Palaniswamy commented on PIG-4424:
-----------------------------------------

This use case comes up often. I had a discussion with [~daijy] last year. 
The plan was to allow "set" commands anywhere in a Pig script and make them 
apply to all the lines following them. 

For example:
set pig.maxCombinedSplitSize 1073741824
A = LOAD 'input';
B = GROUP A by $0;
....
set pig.maxCombinedSplitSize 134217728
F = ORDER E by $1;

This should be easier for the user, but the code changes will be slightly more 
involved. Different parts of the plan will need different settings, and those 
settings need to be carried over correctly from the logical -> physical -> 
mapreduce/tez plan, through the different optimizers, to the execution engine. 
Any other suggestions for making this simple for the user, and for the 
implementation?
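
To make the intended semantics concrete, here is a minimal Python sketch of 
"a set applies to all the lines following it". It is only an illustration of 
the scoping rule, not Pig code: the function name annotate_statements is 
hypothetical, the parsing is deliberately naive (one statement per line, 
"set key value" only), and the intermediate statements elided in the example 
above are left out.

```python
def annotate_statements(script):
    """Pair each statement with a snapshot of the settings in effect for it.

    Hypothetical helper, not part of Pig. Assumes one statement per line and
    'set' commands of the form: set <key> <value>
    """
    settings = {}      # settings accumulated so far, in script order
    annotated = []     # list of (statement, settings snapshot) pairs
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        parts = line.rstrip(";").split()
        if parts[0].lower() == "set" and len(parts) == 3:
            # A 'set' changes the settings for every statement after it.
            settings[parts[1]] = parts[2]
        else:
            # Copy the dict so later 'set' commands do not alter this snapshot.
            annotated.append((line, dict(settings)))
    return annotated


SCRIPT = """
set pig.maxCombinedSplitSize 1073741824
A = LOAD 'input';
B = GROUP A by $0;
set pig.maxCombinedSplitSize 134217728
F = ORDER B by $1;
"""

annotated = annotate_statements(SCRIPT)
for statement, snapshot in annotated:
    print(statement, snapshot)
```

In a real implementation the snapshot would presumably be attached to the 
logical operators and carried through plan translation, which is where the 
extra work lies.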

> Different configurations for different stages of script
> -------------------------------------------------------
>
>                 Key: PIG-4424
>                 URL: https://issues.apache.org/jira/browse/PIG-4424
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Rohini Palaniswamy
>
> From a user:
> I have a pig script which runs multiple map reduce jobs. (Ex: 'group by' and 
> 'order by' which will be executed as 2 different map reduce jobs)
> Is there a way to specify different map reduce configuration options for 
> different stages instead of specifying them for the whole script (Ex: 
> different values for mapred.min.split.size for different stages)?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
