I don't understand why my purpose is not clear. The previous e-mails
explain it very clearly. I want to split a single 500MB text file in HDFS into
multiple files using Pig Latin. Is it possible? E.g.,

A = LOAD 'myfile.txt' USING PigStorage() AS (t);
-- here it should create multiple files with a specific size
STORE A INTO 'multiplefiles' USING PigStorage();




On 10 June 2013 07:29, Bertrand Dechoux <[email protected]> wrote:

> The purpose is not really clear. But if you are looking for how to specify
> multiple reducer tasks, it is well explained in the documentation.
> http://pig.apache.org/docs/r0.11.1/perf.html#parallel
>
> You will get one file per reducer. It is up to you to specify the right
> number, but be careful not to fall into the small files problem in the
> end.
> http://blog.cloudera.com/blog/2009/02/the-small-files-problem/
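>
> As a rough sketch of the "one file per reducer" point (the paths are the
> ones used in this thread; 8 is only an assumed value, about 500MB / 64MB
> rounded up): ORDER BY forces a reduce phase, so its PARALLEL clause sets
> how many part files are written. Sizes will be approximate, not exact, and
> the lines come out sorted rather than in their original order.
>
> ```pig
> -- Hypothetical example: adjust the paths and the reducer count.
> A = LOAD 'myfile.txt' USING PigStorage() AS (t:chararray);
> -- ORDER BY runs on reducers; PARALLEL 8 asks for 8 of them.
> B = ORDER A BY t PARALLEL 8;
> -- Each reducer writes its own part file under 'multiplefiles'.
> STORE B INTO 'multiplefiles' USING PigStorage();
> ```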
>
> If you have a specific question about HDFS itself or Pig optimisation, you
> should provide more explanation.
> (64MB is the default block size for HDFS)
>
> Regards
>
> Bertrand
>
>
> On Mon, Jun 10, 2013 at 6:53 AM, Pedro Sá da Costa <[email protected]> wrote:
>
> > I said 64MB, but it can be 128MB, or 5KB. The number doesn't matter. I
> > just want to extract data and put it into several files of a specific size.
> > Basically, I am doing a cat of a big text file, and I want to split the
> > content into multiple files with a fixed size.
> >
> >
> > On 7 June 2013 10:14, Johnny Zhang <[email protected]> wrote:
> >
> > > Pedro, you can try Piggybank MultiStorage, which splits results into
> > > different dirs/files by a specific index attribute. But I'm not sure how
> > > it can make sure the file size is 64MB. Why 64MB specifically? What's the
> > > connection between your data and 64MB?
> > >
> > > Johnny
> > >
> > >
> > > On Fri, Jun 7, 2013 at 12:56 AM, Pedro Sá da Costa <[email protected]> wrote:
> > >
> > > > I am using the instruction:
> > > >
> > > > store A into 'result-australia-0' using PigStorage('\t');
> > > >
> > > > to store the data in HDFS. But the problem is that this creates 1
> > > > file of 500MB. Instead, I want to save several 64MB files. How do I
> > > > do this?
> > > >
> > > > --
> > > > Best regards,
> > > >
> > >
> >
> >
> >
> > --
> > Best regards,
> >
>
>
>
> --
> Bertrand Dechoux
>



-- 
Best regards,
