Can I set mappers to use overlapping record ranges from a sequence file?

Rob Podolski Tue, 29 Nov 2011 22:29:31 -0800

Hi

I have a computation to run over a large input: a single large sequence file.  
Ideally I would like to set a specific number of mappers and assign each one 
a specific range of records in the input sequence file.  For 
various reasons, the record ranges that I would pass to each mapper 
would overlap (e.g. mapper 1 gets records 1 - 1000, mapper 2 gets records 
700 - 2000, etc.).
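
As a rough illustration of the kind of partitioning described above, the 
sketch below computes overlapping record ranges.  It assumes every mapper gets 
a window of the same size with a fixed overlap, which is simpler than the 
uneven example ranges given above.  The function name and its parameters are 
hypothetical, and nothing here uses the Hadoop API; it only shows the ranges 
that would have to be handed to the mappers.

```python
def overlapping_ranges(total_records, window, overlap):
    """Return inclusive, 1-based (start, end) record ranges.

    Each range covers up to `window` records and shares `overlap`
    records with the range before it. All three parameters are
    illustrative assumptions, not Hadoop settings.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")

    step = window - overlap  # how far each new range advances
    ranges = []
    start = 1
    while start <= total_records:
        end = min(start + window - 1, total_records)
        ranges.append((start, end))
        if end == total_records:
            break  # the last record is covered, so stop
        start += step
    return ranges


if __name__ == "__main__":
    # 2000 records, 1000 per mapper, 300 shared with the previous mapper
    print(overlapping_ranges(2000, 1000, 300))
    # prints [(1, 1000), (701, 1700), (1401, 2000)]
```

The number of ranges, and so the number of mappers, follows from the window 
and overlap rather than being chosen directly.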


Is it possible to do this? If so, how would I go about it?  InputFormat does not 
seem to cater for this.  Perhaps Hadoop is not the right 'parallel' 
framework for this.  

Thanks.
