You can:

1. load all data,
2. use STRSPLIT (http://pig.apache.org/docs/r0.9.2/func.html#strsplit) to
split your values into a tuple,
3. convert your tuples into a bag (I used a UDF in Python instead of TOBAG),
4. flatten your bag (http://pig.apache.org/docs/r0.9.2/basic.html#flatten)

I don't know if it's the best way, but it works.
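
The steps above can be sketched in plain Python to show what each one does
to a record. This is only an illustration, not the UDF I actually used: the
helper names are made up, a bag is modelled as a list of one-field tuples,
and the key is assumed to be separated from the values by a single tab.

```python
# Plain-Python sketch of the four steps. All helper names are hypothetical;
# a Pig "bag" is modelled here as a list of one-field tuples.

def strip_key(line):
    """Drop everything up to and including the first tab (the MR key)."""
    _key, sep, values = line.partition("\t")
    # If there is no tab, assume the line holds only values.
    return values if sep else line

def split_values(values, delimiter=","):
    """Step 2: split the value string into a single tuple of fields."""
    return tuple(values.split(delimiter))

def tuple_to_bag(fields):
    """Step 3: wrap each field in its own tuple to form a bag."""
    return [(field,) for field in fields]

def flatten(bags):
    """Step 4: un-nest the bags so that each value becomes its own row."""
    return [row for bag in bags for row in bag]

lines = ["ABC\t1,2,3,4"]  # step 1: the loaded records
bags = [tuple_to_bag(split_values(strip_key(line))) for line in lines]
rows = flatten(bags)
```

In Pig only step 3 should need custom code; the rest maps onto the load,
STRSPLIT and FLATTEN.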

[]s,
Sisso

2012/3/9 Prashant Kommireddi <[email protected]>

> I'm not sure there is a built-in way in Pig to ignore keys. Maybe you can
> load the data as comma delimited and, in a FOREACH statement, strip
> everything up to and including the tab from the first field. You can use
> TOKENIZE or SUBSTRING to achieve that.
>
> Maybe there is a better way I'm not aware of.
>
> Sent from my iPhone
>
> On Mar 8, 2012, at 7:53 PM, Mohit Anchlia <[email protected]> wrote:
>
> > I have something like:
> >
> > ABC    1,2,3,4
> >
> > I think it's tab delimited, with ABC being the key and 1,2,3,4 as the
> > values.
> >
> > I need to ignore ABC and then load with PigStorage(',') to parse comma
> > separated into separate fields. Is there an easy way to do this?
> >
> > On Thu, Mar 8, 2012 at 6:20 PM, Prashant Kommireddi <[email protected]
> >wrote:
> >
> >> How are you loading it in Pig? Can you just ignore the first field (key)
> >> with a positional reference? What is the key-value delimiter used in your
> >> MR job?
> >>
> >> On Thu, Mar 8, 2012 at 2:56 PM, Mohit Anchlia <[email protected]
> >>> wrote:
> >>
> >>> I am trying to process the output of a map-reduce job, which has the
> >>> key in it. Is there a way I can ignore the key when I load data from
> >>> that file? When I load the data into an alias I don't want the key in
> >>> that alias.
> >>>
> >>
>
