[ https://issues.apache.org/jira/browse/PIG-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721570#action_12721570 ]
Olga Natkovich commented on PIG-856:
------------------------------------

Hi Milind,

yes, these are very good points. I was hoping that we could set the flag only for jobs that produce temporary results, but I did not state this clearly in the bug. I am also considering a replication factor of 1, as I agree it should yield much better performance gains. My plan is to run a test on a large query (join + order by) with replication factors of 1, 2, and the default, and see what the performance differences are.

> PERFORMANCE: reduce number of replicas
> --------------------------------------
>
>                 Key: PIG-856
>                 URL: https://issues.apache.org/jira/browse/PIG-856
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.3.0
>            Reporter: Olga Natkovich
>
> Currently Pig uses the default number of replicas between MR jobs. Currently,
> the number is 3. Given the temp nature of the data, we should never need more
> than 2 and should explicitly set it to improve performance and to be nicer
> to the name node.
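
As a rough illustration of why the replica count matters for temporary data, the sketch below estimates how many bytes get written when only the intermediate outputs of a job chain use a reduced replication factor. It is a back-of-the-envelope model, not Pig or Hadoop code: the function names, the job sizes, and the three-job plan are all assumptions made up for the example. The only facts taken from the issue are the default of 3 and the candidate factors 1 and 2.

```python
# Hypothetical sketch: none of these names come from Pig or Hadoop.
DEFAULT_REPLICATION = 3  # the cluster default mentioned in the issue


def replication_for(is_temporary, temp_replication=2, default=DEFAULT_REPLICATION):
    """Return the replication factor to use for one job's output.

    Only temporary (between-job) outputs get the reduced factor;
    final results keep the default.
    """
    return temp_replication if is_temporary else default


def total_bytes_written(jobs, temp_replication):
    """Sum the bytes written across all replicas for a chain of jobs.

    `jobs` is a list of (output_bytes, is_temporary) pairs.
    """
    return sum(
        size * replication_for(is_temp, temp_replication)
        for size, is_temp in jobs
    )


# Made-up plan: two intermediate outputs followed by one final output.
GB = 1024 ** 3
plan = [(10 * GB, True), (4 * GB, True), (1 * GB, False)]

for factor in (1, 2, DEFAULT_REPLICATION):
    written = total_bytes_written(plan, factor)
    print(f"temp replication {factor}: {written // GB} GB written")
```

The model only counts write volume. It says nothing about the risk of losing an intermediate file when the factor is 1, which is the trade-off the planned test would have to weigh.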