On Fri, Dec 18, 2009 at 5:20 PM, Cao Kang <[email protected]> wrote:
> Is there any example of how a sequence file can be read and split in Hadoop?
> Many thanks!

That should be fairly easy. The following code reads all entries in a
sequence file:

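        // Needs org.apache.hadoop.io.SequenceFile and org.apache.hadoop.io.Writable.
        // Here path is a Path and config is a Configuration.
        SequenceFile.Reader reader =
            new SequenceFile.Reader(path.getFileSystem(config), path, config);

        try {
            // Create holder objects of the key and value types stored in the file.
            Writable key = (Writable) reader.getKeyClass().newInstance();
            Writable value = (Writable) reader.getValueClass().newInstance();

            // next() fills key and value in place and returns false at end of file.
            while (reader.next(key, value)) {
                System.out.println(key + "\t" + value);
            }
        } finally {
            // Close the reader even if reading fails part-way through.
            reader.close();
        }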

To split the file, add some logic inside the loop that assigns each
entry to a partition, and write the entry out using a
SequenceFile.Writer for that partition.
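
One possible way to do the partitioning, as a rough sketch rather than
the only approach: hash each key into one of N buckets, using the same
formula as Hadoop's HashPartitioner. The example below is plain Java so
it runs without Hadoop on the classpath; the class name PartitionSketch,
the method names and the sample String entries are made up for
illustration. In the real job, each bucket would correspond to one
SequenceFile.Writer, and adding to a bucket would become a call to that
writer's append(key, value).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionSketch {

    // Same formula as Hadoop's HashPartitioner: clear the sign bit so the
    // result is never negative, then take the remainder.
    static int partitionFor(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Distributes the entries over numPartitions buckets. With Hadoop, each
    // bucket would be a SequenceFile.Writer and add() would be append().
    static List<List<Map.Entry<String, String>>> split(
            Map<String, String> entries, int numPartitions) {
        List<List<Map.Entry<String, String>>> buckets = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            buckets.add(new ArrayList<>());
        }
        for (Map.Entry<String, String> entry : entries.entrySet()) {
            int target = partitionFor(entry.getKey(), numPartitions);
            buckets.get(target).add(entry);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the entries read above.
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("alpha", "1");
        entries.put("beta", "2");
        entries.put("gamma", "3");
        entries.put("delta", "4");

        List<List<Map.Entry<String, String>>> buckets = split(entries, 3);
        for (int i = 0; i < buckets.size(); i++) {
            System.out.println("partition " + i + ": " + buckets.get(i));
        }
    }
}
```

Hashing sends equal keys to the same output file. If you only want
pieces of roughly equal size, a simple counter (entry number modulo N)
would do as well.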

-- 
Cheers,
-Ives
