Re: How to store each record in a separate file

Thomas Kappler Thu, 13 Oct 2011 00:03:59 -0700

On Thu, Oct 13, 2011 at 07:56, Ayon Sinha <[email protected]> wrote:
> Hi Kiranprasad,
> What is your use case? Are you sure you have picked the right tool for the
> job? Pig/Hadoop is meant for massive datasets, which means millions and
> billions of rows. In your case that would lead to millions and billions of
> files, which Hadoop doesn't like anyway.


I have also found that MultiStorage runs a reducer for each partition,
i.e., each separate file. This will be OK for a small number of
partitions (locations in Kiran's case), but will break down for larger
numbers.

I ended up letting Pig group the records and writing a script that
splits the Pig output into one file per group.
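
The script itself is not included in this message. As a rough sketch of
what such a splitter might look like, assuming the Pig output is
tab-separated text with the group key in the first column and that all
records of a group are adjacent (the function names, the ".txt" suffix,
and the column layout are all hypothetical):

```python
import itertools
import os
import re


def safe_name(key):
    # Replace characters that are awkward in file names (hypothetical policy).
    # Note: two different keys can map to the same name and share a file.
    return re.sub(r"[^A-Za-z0-9._-]", "_", key) or "_empty"


def split_by_group(lines, out_dir, key_index=0, sep="\t"):
    """Write each run of lines sharing a key to out_dir/<key>.txt.

    Returns a dict mapping each key to the number of lines written for it.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = {}
    # Pair every line with its key so groupby can detect runs of equal keys.
    keyed = ((line.rstrip("\n").split(sep)[key_index], line) for line in lines)
    for key, run in itertools.groupby(keyed, key=lambda kv: kv[0]):
        path = os.path.join(out_dir, safe_name(key) + ".txt")
        # Append mode: a key that reappears later in the input extends its
        # file instead of overwriting it. Start from an empty out_dir.
        with open(path, "a", encoding="utf-8") as out:
            count = 0
            for _, line in run:
                out.write(line if line.endswith("\n") else line + "\n")
                count += 1
        written[key] = written.get(key, 0) + count
    return written
```

Only one output file is open at a time, so the number of groups is not
limited by the open-file limit. One possible way to use it is to pass
sys.stdin as "lines" and pipe the concatenated part files into the
script.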

-- Thomas
