Hi,

if you know the block size, you can calculate the offsets of your records and write a custom record reader class that seeks to the first full record in its split.
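
Here is a minimal sketch of such a reader for the new (org.apache.hadoop.mapreduce) API. It assumes 180-byte records, uses the record's file offset as the key and a BytesWritable as the value, and aligns each split's start to the next multiple of 180 so that a record straddling two splits is read only by the split it starts in. The class name and the key/value choices are just illustrative:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class FixedLengthRecordReader extends RecordReader<LongWritable, BytesWritable> {

    private static final int RECORD_LENGTH = 180;

    private FSDataInputStream in;
    private long start;   // first byte this reader is responsible for
    private long end;     // first byte past the split
    private long pos;     // current position in the file
    private final LongWritable key = new LongWritable();
    private final BytesWritable value = new BytesWritable();

    @Override
    public void initialize(InputSplit genericSplit, TaskAttemptContext context)
            throws IOException {
        FileSplit split = (FileSplit) genericSplit;
        Configuration conf = context.getConfiguration();
        Path file = split.getPath();
        FileSystem fs = file.getFileSystem(conf);
        in = fs.open(file);

        // Skip forward to the next record boundary; the record that straddles
        // the split start belongs to the previous split, which reads past its end.
        start = split.getStart();
        long remainder = start % RECORD_LENGTH;
        if (remainder != 0) {
            start += RECORD_LENGTH - remainder;
        }
        end = split.getStart() + split.getLength();
        pos = start;
        in.seek(start);
    }

    @Override
    public boolean nextKeyValue() throws IOException {
        // A record is ours if it *starts* before the split end; we may read
        // past `end` to finish the last record of this split.
        if (pos >= end) {
            return false;
        }
        byte[] record = new byte[RECORD_LENGTH];
        in.readFully(record);   // throws EOFException on a truncated trailing record
        key.set(pos);
        value.set(record, 0, RECORD_LENGTH);
        pos += RECORD_LENGTH;
        return true;
    }

    @Override
    public LongWritable getCurrentKey() { return key; }

    @Override
    public BytesWritable getCurrentValue() { return value; }

    @Override
    public float getProgress() {
        if (end == start) {
            return 1.0f;
        }
        return Math.min(1.0f, (pos - start) / (float) (end - start));
    }

    @Override
    public void close() throws IOException {
        if (in != null) {
            in.close();
        }
    }
}

You would return this reader from createRecordReader() in a small FileInputFormat<LongWritable, BytesWritable> subclass. With the boundary alignment above, the default split computation is fine, so you don't have to force the split size to be a multiple of 180.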

Kai

On 05.07.2012, at 22:54, MJ Sam wrote:

> Hi,
> 
> The input of my map reduce is a binary file with no record begin or
> end markers. The only thing is that each record is a fixed 180 bytes
> in size in the binary file. How do I make Hadoop properly find the
> records in the splits when a record overlaps two splits? I was thinking
> of making the split size a multiple of 180, but was wondering if
> there is anything else that I can do. Please note that my files are
> not sequence files, just a custom binary format.
> 

-- 
Kai Voigt
k...@123.org
