Hi,

if you know the block size, you can calculate the offsets of your records and write a custom RecordReader class that seeks to the first record boundary in each split.
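Something along these lines could serve as a starting point. This is an untested sketch against the new org.apache.hadoop.mapreduce API; the class name is made up, the key/value types are just one possible choice, and it assumes the file consists only of whole 180-byte records.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Sketch: emits (byte offset, 180-byte record) pairs for fixed-length records.
public class FixedLengthBinaryRecordReader extends RecordReader<LongWritable, BytesWritable> {

    private static final int RECORD_LENGTH = 180;

    private FSDataInputStream in;
    private long start;   // first byte this reader is responsible for
    private long end;     // first byte past this reader's range
    private long pos;     // offset of the next record to read
    private final LongWritable key = new LongWritable();
    private final BytesWritable value = new BytesWritable();

    @Override
    public void initialize(InputSplit genericSplit, TaskAttemptContext context)
            throws IOException, InterruptedException {
        FileSplit split = (FileSplit) genericSplit;
        Configuration conf = context.getConfiguration();
        Path file = split.getPath();
        FileSystem fs = file.getFileSystem(conf);
        in = fs.open(file);

        start = split.getStart();
        end = start + split.getLength();

        // Skip the tail of a record that started in the previous split;
        // that record is handled by the previous split's reader.
        long remainder = start % RECORD_LENGTH;
        if (remainder != 0) {
            start += RECORD_LENGTH - remainder;
        }
        pos = start;
    }

    @Override
    public boolean nextKeyValue() throws IOException {
        // A record belongs to this split if it *starts* before 'end',
        // even if its last bytes spill over into the next split.
        if (pos >= end) {
            return false;
        }
        byte[] record = new byte[RECORD_LENGTH];
        in.readFully(pos, record);            // positioned read, crosses the split boundary if needed
        key.set(pos);                         // key = byte offset of the record in the file
        value.set(record, 0, RECORD_LENGTH);
        pos += RECORD_LENGTH;
        return true;
    }

    @Override
    public LongWritable getCurrentKey() { return key; }

    @Override
    public BytesWritable getCurrentValue() { return value; }

    @Override
    public float getProgress() {
        return (end == start) ? 1.0f : Math.min(1.0f, (pos - start) / (float) (end - start));
    }

    @Override
    public void close() throws IOException {
        if (in != null) {
            in.close();
        }
    }
}

You would return this reader from a small FileInputFormat subclass in createRecordReader(). With this approach you do not have to force the split size to be a multiple of 180: each reader aligns itself to the next record boundary and reads the last record of its range across the split boundary.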
Kai

On 05.07.2012 at 22:54, MJ Sam wrote:

> Hi,
>
> The input of my map reduce job is a binary file with no record begin or
> end markers. The only thing is that each record has a fixed size of 180
> bytes in the binary file. How do I make Hadoop properly find the records
> in the splits when a record overlaps two splits? I was thinking of making
> the split size a multiple of 180, but I was wondering whether there is
> anything else I can do. Please note that my files are not sequence files,
> just a custom binary format.

--
Kai Voigt
k...@123.org