Re: SequenceFile with one very large value

2011-12-05 Thread Praveen Sripati
>> SequenceFiles place sync markers (similar to what 'newlines' mean in text files) after a bunch of records, and that is the reason why your record does not split when read. Sync is placed after every N records and is used for moving from an arbitrary location in a file to a start of the next re

Re: SequenceFile with one very large value

2011-12-04 Thread Harsh J
Florin, Based on the SequenceFileInputFormat's splitting, you should see just one task reading the record. SequenceFiles place sync markers (similar to what 'newlines' mean in text files) after a bunch of records, and that is the reason why your record does not split when read. Also worth thinki
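
To illustrate the sync-marker point above, here is a minimal toy sketch. It is not Hadoop code and not the real SequenceFile on-disk format. The names (`SYNC`, `SYNC_INTERVAL`, `write_records`, `read_split`, `split_ranges`), the 16-byte marker value, the byte-based interval, and the 100-byte split size are all assumptions made for the illustration. The idea it tries to show: a reader that starts at an arbitrary offset skips forward to the next sync marker, and keeps reading until it reaches a marker at or past the end of its split, so one very large record is read whole by a single task.

```python
import struct

# Hypothetical 16-byte marker and interval, chosen only for this toy example.
SYNC = bytes(range(0xF0, 0x100))
SYNC_INTERVAL = 64  # emit a marker once this many bytes follow the last one


def write_records(records):
    """Serialize records as [4-byte big-endian length][payload].

    A sync marker is written before a record whenever at least
    SYNC_INTERVAL bytes have been written since the previous marker.
    """
    out = bytearray()
    last_sync = 0
    for payload in records:
        if len(out) - last_sync >= SYNC_INTERVAL:
            out += SYNC
            last_sync = len(out)
        out += struct.pack(">I", len(payload)) + payload
    return bytes(out)


def read_split(data, start, end):
    """Return the records owned by the byte range [start, end).

    A split owns everything from the first sync marker at or after `start`
    (or from offset 0 for the first split) up to the first sync marker
    at or after `end`. Records are never cut at the split boundary.
    """
    if start == 0:
        pos = 0
    else:
        idx = data.find(SYNC, start)
        if idx == -1 or idx >= end:
            return []  # no marker inside this split: nothing to read here
        pos = idx + len(SYNC)

    records = []
    while pos < len(data):
        if data.startswith(SYNC, pos):
            if pos >= end:
                break  # the next split starts reading from this marker
            pos += len(SYNC)
            continue
        (length,) = struct.unpack_from(">I", data, pos)
        pos += 4
        records.append(data[pos:pos + length])
        pos += length
    return records


def split_ranges(size, split_size):
    """Cut [0, size) into consecutive byte ranges of at most split_size."""
    return [(s, min(s + split_size, size)) for s in range(0, size, split_size)]


huge = b"HUGE" * 250  # one 1000-byte value, far larger than a split
records = [b"a" * 10, b"b" * 10, huge, b"c" * 10, b"d" * 10]
data = write_records(records)
per_split = [read_split(data, s, e) for s, e in split_ranges(len(data), 100)]

for (s, e), recs in zip(split_ranges(len(data), 100), per_split):
    print(f"split [{s}, {e}): {[len(r) for r in recs]}")
```

In this toy run, the splits that fall entirely inside the large value contain no marker and return nothing, while the split in which the value starts reads it in full. The payloads are plain ASCII, so they cannot collide with the marker bytes; a real format has to handle that case differently.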