Hi,
The input to my MapReduce job is a binary file with no record begin or end markers. The only structure is that every record has a fixed size of 180 bytes. How do I make Hadoop find the record boundaries correctly when a record straddles two splits? I was thinking of making the split size a multiple of 180, but was wondering if there is anything else I can do. Please note that my files are not SequenceFiles, just a custom binary format.
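
To make the idea concrete, here is a rough sketch of the arithmetic I have in mind. It is plain Java and does not touch any Hadoop API; the class name RecordAlign and both method names are made up for illustration, and 64 MB is only an example split size.

```java
// Hypothetical helper: only the 180-byte record size comes from my data,
// everything else here is an illustrative assumption.
class RecordAlign {
    static final int RECORD_SIZE = 180;

    // Round a desired split size down to a multiple of the record size,
    // so that no record crosses a split boundary.
    static long alignedSplitSize(long desiredBytes) {
        long aligned = (desiredBytes / RECORD_SIZE) * RECORD_SIZE;
        return Math.max(RECORD_SIZE, aligned);
    }

    // If splits are NOT aligned, a reader could instead skip forward to the
    // first record boundary at or after the start of its split.
    static long firstRecordOffset(long splitStart) {
        long remainder = splitStart % RECORD_SIZE;
        return remainder == 0 ? splitStart : splitStart + (RECORD_SIZE - remainder);
    }

    public static void main(String[] args) {
        long desired = 64L * 1024 * 1024; // example split size of 64 MB
        System.out.println(alignedSplitSize(desired));
        System.out.println(firstRecordOffset(desired));
    }
}
```

The first method is the "multiple of 180" approach. The second is the alternative I am unsure about: leave the split size alone and have each reader start at the first record boundary inside its split and read past the end of the split to finish its last record.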