[ https://issues.apache.org/jira/browse/MAPREDUCE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated MAPREDUCE-4868:
-------------------------------------
    Fix Version/s:     (was: 2.3.0)
                       2.4.0

> Allow multiple iteration for map
> --------------------------------
>
>                 Key: MAPREDUCE-4868
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4868
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 3.0.0, 2.0.3-alpha
>            Reporter: Jerry Chen
>             Fix For: 3.0.0, 2.4.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently, the Mapper class allows advanced users to override the
> "public void run(Context context)" method for more control over the
> execution of the mapper, but the Context interface limits the operations
> over the data, which are the foundation of that "more control".
> One use case: while working on a Hive optimization problem, I want to make
> two passes over the input data instead of using another job or task (which
> may slow down the whole process). Each pass does the same thing but with
> different parameters.
> This is a new paradigm of Map Reduce usage and can be achieved easily by
> extending the Context interface a little to give more control over the
> data, such as resetting the input.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
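
To illustrate the request in the issue description, here is a minimal sketch of a two-pass run loop. It does not use Hadoop itself: `ResettableInput`, `ListInput`, and in particular `reset()` are hypothetical names invented for this example, standing in for the input-facing part of `Mapper.Context` plus the proposed extension. The real Context interface has no such reset method, and the issue does not specify what the final API would look like.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TwoPassMapSketch {

    // Hypothetical stand-in for the input side of Mapper.Context.
    // reset() is the proposed addition; it is NOT part of the real Hadoop API.
    interface ResettableInput<V> {
        boolean nextValue();
        V getCurrentValue();
        void reset();
    }

    // In-memory input, used only to make the sketch runnable.
    static final class ListInput<V> implements ResettableInput<V> {
        private final List<V> values;
        private int pos = -1;

        ListInput(List<V> values) {
            this.values = values;
        }

        @Override
        public boolean nextValue() {
            if (pos + 1 >= values.size()) {
                return false;
            }
            pos++;
            return true;
        }

        @Override
        public V getCurrentValue() {
            return values.get(pos);
        }

        @Override
        public void reset() {
            pos = -1; // rewind to before the first record
        }
    }

    // Mirrors the shape of an overridden Mapper.run(): one full pass over
    // the input per parameter, rewinding the input between passes.
    static List<Integer> run(ResettableInput<Integer> input, int[] params) {
        List<Integer> out = new ArrayList<>();
        for (int p = 0; p < params.length; p++) {
            if (p > 0) {
                input.reset();
            }
            while (input.nextValue()) {
                // Stand-in for map(): same logic, different parameter per pass.
                out.add(input.getCurrentValue() * params[p]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> out = run(new ListInput<>(Arrays.asList(1, 2, 3)), new int[] {10, 100});
        System.out.println(out); // prints [10, 20, 30, 100, 200, 300]
    }
}
```

In a real job the rewind would have to reopen or seek the underlying record reader for the split, which is the part the current Context interface does not expose.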