Hi,

> There is lots of SequenceFile in HDFS, how can I merge them into one
> SequenceFile?

The simplest way to do that is to run a MapReduce job configured with
- input format = sequence file
- map = identity mapper
- reduce = identity reducer
- output format = sequence file
and
 job.setNumReduceTasks(1)
so that the single reducer writes everything into one output file.
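
For illustration, the settings above could be wired together roughly as follows. This is only a sketch: it assumes the org.apache.hadoop.mapreduce API of Hadoop 2.x or later, where the base Mapper and Reducer classes pass records through unchanged. The class name MergeSequenceFiles, the Text key/value types and the two path arguments are assumptions made for the example; use the key and value classes your files actually contain.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical driver class; the name is chosen for this example only.
public class MergeSequenceFiles {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "merge sequence files");
        job.setJarByClass(MergeSequenceFiles.class);

        // Read and write sequence files.
        job.setInputFormatClass(SequenceFileInputFormat.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);

        // The base Mapper and Reducer emit every record unchanged,
        // so they act as the identity mapper and identity reducer.
        job.setMapperClass(Mapper.class);
        job.setReducerClass(Reducer.class);

        // Assumption: keys and values are Text. Replace with the real types.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // One reducer means one output file in the output directory.
        job.setNumReduceTasks(1);

        // args[0] = directory holding the input files, args[1] = output directory.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Because every record passes through the shuffle, the merged file ends up sorted by key rather than in the original record order, and the single reducer has to handle the entire data set on its own.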

However, I think this is a useless thing to do.
Sequence files are only really useful inside a Hadoop cluster, where
they serve as input for later jobs.
And having multiple files actually helps Hadoop scale out, because the
files can be processed in parallel.

So my question to you: Why do you want that?



-- 
Best regards / Met vriendelijke groeten,

Niels Basjes
