Re: JSONToTuple for pig UDF

John Hui Tue, 19 Apr 2011 12:11:56 -0700

Really cool.  Let me take a look when I have some "downtime".  If that's
the case, Xavier's parser is much better than mine.


Who wants to take the lead in adding this to the piggybank?  I am sure it
would make a very useful "storage" utility.

John

On Tue, Apr 19, 2011 at 3:09 PM, Xavier Stevens <[email protected]> wrote:

> Hey John,
>
> If you take a look at mine, it looks explicitly for Lists and converts
> them to DataBags. I ran into that issue with our data. That said, I won't
> make any claims that it'll work for all data.
>
> Cheers,
>
> -Xavier
>
> On 4/19/11 12:02 PM, John Hui wrote:
> > I'll post my solution in a few hours =)
> >
> > On Tue, Apr 19, 2011 at 3:02 PM, John Hui <[email protected]> wrote:
> >
> >> I don't think one parser will work for all solutions.  It really
> >> depends on your data, since there might be a list within a list.
> >>
> >> But pick any one as a starting point and customize it for your own
> >> JSON data format.
> >>
> >>
> >> On Tue, Apr 19, 2011 at 3:00 PM, Alan Gates <[email protected]> wrote:
> >>
> >>> On Apr 19, 2011, at 11:44 AM, Daniel Eklund wrote:
> >>>
> >>>  <snip>
> >>>> A quick question about the UDFs registered at the top of a Pig
> >>>> script:
> >>>>
> >>>> Does
> >>>> REGISTER myJar.jar
> >>>> distribute the jar across HDFS (like a Hadoop job jar) so that the
> >>>> distribution of the code to the cluster nodes is transparent?
> >>>> In other words, do we NOT have to distribute myJar.jar to each node
> >>>> on the cluster?
> >>>>
> >>> Pig takes care of getting myJar.jar to the task nodes; you do not
> >>> have to worry about it.
> >>>
> >>> Alan.
> >>>
> >>>
>
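
For anyone picking one of these parsers as a starting point, the list-to-bag
conversion Xavier describes, including the "list within a list" case John
raises, could be sketched roughly as below. This is not Xavier's or John's
actual code. It is a plain-Python illustration that assumes a bag can be
modelled as a list of one-field tuples and a map as a dict; the names
to_pig and json_to_tuple are hypothetical.

```python
import json


def to_pig(value):
    """Recursively map parsed JSON onto Pig-like shapes (illustrative only)."""
    if isinstance(value, dict):
        # JSON object -> map; keys become strings, values are converted in turn.
        return {str(key): to_pig(val) for key, val in value.items()}
    if isinstance(value, list):
        # JSON array -> bag, modelled here as a list of one-field tuples.
        # A nested array becomes a tuple whose single field is itself a bag.
        return [(to_pig(item),) for item in value]
    # Scalars (str, int, float, bool, None) pass through unchanged.
    return value


def json_to_tuple(line):
    """Parse one JSON document and wrap the converted value in a tuple."""
    return (to_pig(json.loads(line)),)


if __name__ == "__main__":
    record = '{"id": 1, "tags": ["a", "b"], "grid": [[1, 2], [3]]}'
    print(json_to_tuple(record))
```

As the thread notes, no single conversion will suit every data set, so the
handling of nested arrays and mixed-type arrays is the part most likely to
need customizing.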
