Thank you, Bejoy and Chris! That's a fabulous idea, and I will definitely use it. I also really appreciate the tips to make the port go a little smoother.
On Thu, Sep 27, 2012 at 5:39 PM, Chris Nauroth <[email protected]> wrote:

> Hi Anna,
>
> Just to second Bejoy's comments, that's an approach that I used
> successfully on a project a year or two ago. Plan on a day or two to get
> the port fully working and tested on your cluster. Once you start porting
> in CombineFileInputFormat, you'll probably find that you need to start
> porting in additional classes that it depends on. (I'm sorry that I don't
> have access to my port of the code anymore, so I can't just hand it over.)
>
> Also, make sure that whatever version you port from includes the fix for
> the infinite loop bug. Here are 2 old JIRAs that tracked patches to fix
> the infinite loop:
>
> https://issues.apache.org/jira/browse/MAPREDUCE-2185
>
> https://issues.apache.org/jira/browse/MAPREDUCE-2862
>
> Thank you,
> --Chris
>
> On Thu, Sep 27, 2012 at 1:53 PM, Bejoy Ks <[email protected]> wrote:
>
>> Hi Anna
>>
>> One option I can think of is to take the CombineFileInputFormat from the
>> latest release, add it as a custom input format in your application code,
>> and ship it with your MapReduce application jar. This is similar to how
>> you would implement an input format of your own and use it with MapReduce.
>>
>> Regards
>> Bejoy KS
