Hey Manoj,

I find the asker name here quite strange, although it is the same question, ha:
http://stackoverflow.com/questions/10380200/how-to-use-combinefileinputformat-in-hadoop

Anyhow, here's one example:
http://blog.yetitrails.com/2011/04/dealing-with-lots-of-small-files-in.html

On Thu, Jul 12, 2012 at 8:33 PM, Manoj Babu <manoj...@gmail.com> wrote:
> Gentles,
>
> I want to use the CombineFileInputFormat of Hadoop 0.20.0 / 0.20.2 such that
> it processes 1 file per record and also doesn't compromise on data-locality
> (which it normally takes care of).
>
> It is mentioned in Tom White's Hadoop Definitive Guide but he has not shown
> how to do it. Instead, he moves on to Sequence Files.
>
> I am pretty confused on what is the meaning of processed variable in a
> record reader. Any code example would be of tremendous help.
>
> Thanks in advance..
>
> Cheers!
> Manoj.

--
Harsh J
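
On the "processed" variable: in a whole-file style record reader it is usually just a boolean that remembers whether the single record (the entire file) has already been handed out, so the first call to next() returns true and every later call returns false. Below is a minimal sketch of that idea in plain Java, with no Hadoop dependency. The class and method names (WholeFileReader, next, getCurrentValue, getProgress) are hypothetical and only mimic the shape of a record reader; this is an illustration under those assumptions, not the code from the book or from the linked posts.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical plain-Java stand-in for a whole-file record reader.
// It mimics the next()/getProgress() shape of a record reader but
// does not use or depend on any Hadoop class.
class WholeFileReader {
    private final Path file;
    private byte[] value;              // contents of the single record
    private boolean processed = false; // true once the one record has been handed out

    WholeFileReader(Path file) {
        this.file = file;
    }

    // Returns true exactly once, after loading the entire file as one record.
    boolean next() throws IOException {
        if (processed) {
            return false; // the only record was already emitted
        }
        value = Files.readAllBytes(file);
        processed = true;
        return true;
    }

    byte[] getCurrentValue() {
        return value;
    }

    // With a single record, progress is either nothing or everything.
    float getProgress() {
        return processed ? 1.0f : 0.0f;
    }
}

class WholeFileReaderDemo {
    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("small-file", ".txt");
        Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));

        WholeFileReader reader = new WholeFileReader(tmp);
        int records = 0;
        while (reader.next()) {
            records++; // the loop body runs once per file
        }
        System.out.println("records read: " + records);
        Files.delete(tmp);
    }
}
```

In a real job the same flag would sit inside a per-file reader, with one such reader created for each file packed into the combined split, so each file still comes out as exactly one record.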