Hi Hadoop users and developers,

I have a use case where I need to produce a single large SequenceFile, about 1 TB in size. Each datanode has only 200 GB of storage, but I have 30 datanodes.

The problem is that generating one SequenceFile means a single-reducer job, and no single reducer can hold 1 TB of data during the reduce phase, even with aggressive compression. Whichever datanode runs that lone reducer will run out of space. Any comments and help are appreciated.

Jerry
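For context, the job setup I'm describing is roughly the following sketch (the mapper/reducer classes and paths are placeholders, not my actual code) — the point is the `setNumReduceTasks(1)` call that a single output file forces:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class SingleSequenceFileJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "build single sequence file");
        job.setJarByClass(SingleSequenceFileJob.class);

        // Placeholder classes -- the real job's map/reduce logic differs.
        // job.setMapperClass(MyMapper.class);
        // job.setReducerClass(MyReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // One output file requires one reducer -- this is what pushes
        // ~1 TB of merged map output through a single node's local disk.
        job.setNumReduceTasks(1);

        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```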
