[ https://issues.apache.org/jira/browse/MAPREDUCE-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849320#action_12849320 ]
Pradeep Kamath commented on MAPREDUCE-1620:
-------------------------------------------

Looking at MAPREDUCE-1486, that issue seems to deal with a similar problem in map tasks. This issue concerns the Configuration on the client side (during getSplits()) and the Configuration in the map task (in createRecordReader()), so I think it is different.

> Hadoop should serialize the Configuration after the call to getSplits() to the
> backend such that any changes to the Configuration in getSplits() are
> serialized to the backend
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1620
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1620
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Pradeep Kamath
>
> While working on the next Pig release, we discovered that in 0.20.1 and
> 0.20.2, when the new API is used, the Hadoop code makes a copy of the
> Configuration and hands that copy to the getSplits() call. Any changes to the
> Configuration made in getSplits() are made on that copy. However, the original
> Configuration is the one that gets serialized to the backend, so changes made
> to the Configuration in a getSplits() implementation are not serialized to the
> backend. In a framework like Pig, there are use cases for writing information
> into the Configuration during getSplits(); it would be helpful if Hadoop
> ensured that these changes get serialized to the backend.
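
The copy-versus-original behaviour described in the issue can be illustrated with a small self-contained sketch. This is not Hadoop code: a plain Map stands in for the Configuration, and ConfigCopyDemo, submitWithCopy, submitAfterSplits and the "demo.*" keys are hypothetical names chosen for illustration. The second method shows only one possible way to get the requested behaviour, not the fix that was actually applied.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the behaviour described in MAPREDUCE-1620.
// A plain Map plays the role of Hadoop's Configuration; none of these
// names are Hadoop APIs.
public class ConfigCopyDemo {

    // Hypothetical stand-in for a getSplits() implementation that
    // records extra information in the configuration it is handed.
    static void getSplits(Map<String, String> conf) {
        conf.put("demo.split.info", "written-in-getSplits");
    }

    // Reported behaviour: getSplits() receives a copy, while the
    // original is what gets "serialized" (here: snapshotted).
    static Map<String, String> submitWithCopy(Map<String, String> original) {
        Map<String, String> copy = new HashMap<>(original);
        getSplits(copy);                 // the change lands on the copy only
        return new HashMap<>(original);  // snapshot of the untouched original
    }

    // Requested behaviour (one possible approach): take the snapshot
    // after getSplits() has run on the same object.
    static Map<String, String> submitAfterSplits(Map<String, String> original) {
        getSplits(original);
        return new HashMap<>(original);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("demo.input", "in");
        // prints "false": the change made in getSplits() was lost
        System.out.println(submitWithCopy(conf).containsKey("demo.split.info"));
        // prints "true": the change made in getSplits() is preserved
        System.out.println(submitAfterSplits(conf).containsKey("demo.split.info"));
    }
}
```

In the first path the value written during getSplits() never reaches the snapshot that would be shipped to the tasks, which is the situation the issue reports for createRecordReader() on the backend.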