Hi Anna,

Just to second Bejoy's comments: that's an approach I used successfully on a project a year or two ago. Plan on a day or two to get the port fully working and tested on your cluster. Once you start porting CombineFileInputFormat, you'll probably find that you also need to port additional classes that it depends on. (I'm sorry that I no longer have access to my port of the code, so I can't just hand it over.)
Also, make sure that whatever version you port from includes the fix for the infinite loop bug. Here are two old JIRAs that tracked patches for it:

https://issues.apache.org/jira/browse/MAPREDUCE-2185
https://issues.apache.org/jira/browse/MAPREDUCE-2862

Thank you,
--Chris

On Thu, Sep 27, 2012 at 1:53 PM, Bejoy Ks <[email protected]> wrote:
> Hi Anna
>
> One option I can think of is getting the CombineFileInputFormat from the
> latest release, adding it as a custom input format in your application code,
> and shipping it with your MapReduce application jar. This is similar to how
> you'd implement an input format of your own and use it with MapReduce.
>
> Regards
> Bejoy KS
