Curious: it seems like you could aggregate the results in the mapper in a local 
variable or a list of strings. Is there a way to know that your mapper has just 
read the LAST line of an input split? 

I.e., if so, you could implement your entire solution in your mapper without 
needing a new input format?

Is there a "cleanup" or "finalize" method in mappers that is run at the end of 
a whole stream read, to support this sort of chunked, in-memory map/reduce 
operation?
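
To make the idea concrete, here is a rough sketch of the pattern I have in mind. 
It is plain Java and does NOT use the Hadoop classes: SketchMapper, 
BufferingMapper and runOver are hypothetical names made up for illustration. 
As far as I know, the newer org.apache.hadoop.mapreduce.Mapper has a 
cleanup(Context) method that is called once after the last record, and the 
sketch assumes a hook of that shape. It also assumes a whole file fits in 
memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WholeSplitSketch {

    // Hypothetical stand-in for a mapper lifecycle (not the Hadoop API):
    // setup once, map once per record, cleanup once after the last record.
    abstract static class SketchMapper {
        protected void setup(List<String> out) {}
        protected abstract void map(long offset, String line, List<String> out);
        protected void cleanup(List<String> out) {}

        final void run(List<String> lines, List<String> out) {
            setup(out);
            long offset = 0;
            for (String line : lines) {
                map(offset, line, out);
                offset += line.length() + 1; // +1 for the newline
            }
            cleanup(out);
        }
    }

    // Buffers every line and emits nothing until the input is exhausted.
    static class BufferingMapper extends SketchMapper {
        private final List<String> buffer = new ArrayList<>();

        @Override
        protected void map(long offset, String line, List<String> out) {
            buffer.add(line); // aggregate locally, no output yet
        }

        @Override
        protected void cleanup(List<String> out) {
            // The whole input has been read: process it as one unit.
            if (!buffer.isEmpty()) {
                out.add(String.join("\n", buffer));
            }
        }
    }

    // Hypothetical helper: feed the lines of one unsplit file to the mapper.
    static List<String> runOver(List<String> lines) {
        List<String> out = new ArrayList<>();
        new BufferingMapper().run(lines, out);
        return out;
    }

    public static void main(String[] args) {
        List<String> out = runOver(Arrays.asList("line one", "line two", "line three"));
        System.out.println(out.size() + " record(s) emitted");
        System.out.println(out.get(0));
    }
}
```

With isSplitable() returning false, each file should go to a single mapper, so 
buffering in map() and emitting in the cleanup hook would give whole-file 
processing even though the records still arrive line by line.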

Jay Vyas 
MMSB
UCHC

On Apr 23, 2012, at 6:40 AM, Dan Drew <wirefr...@googlemail.com> wrote:

> I require each input file to be processed by each mapper as a whole.
> 
> I subclass o.a.h.mapreduce.lib.input.TextInputFormat and override
> isSplitable() to invariably return false.
> 
> The job is configured to use this subclass as the input format class via
> setInputFormatClass(). The job runs without error, yet the logs reveal
> files are still processed line by line by the mappers.
> 
> Any help would be greatly appreciated,
> Thanks
