I don't think one parser will work for all solutions. It really depends on your data, since there might be a list within a list.
But pick any one as a starting point and customize it for your own JSON data format.

On Tue, Apr 19, 2011 at 3:00 PM, Alan Gates <[email protected]> wrote:
>
> On Apr 19, 2011, at 11:44 AM, Daniel Eklund wrote:
>
> <snip>
>>
>> A quick question about the UDF's registered at the top of a pig script:
>>
>> does
>> REGISTER myJar.jar
>> distribute the jar across HDFS (like a Hadoop job jar) so that the
>> distribution of the code to the cluster nodes is transparent?
>> In other words, do we NOT have to distribute myJar.jar to each node on the
>> cluster.
>>
>
> Pig takes care of getting myJar.jar to the task nodes; you do not have to
> worry about it.
>
> Alan.
>
>
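
To illustrate the list-within-a-list point, here is a rough sketch in plain Python (not a Pig UDF) of the kind of customization I mean. The record layout, the "tags" field, and the flatten helper are all made up for the example, so adapt them to whatever your own data looks like.

```python
import json


def flatten(value):
    """Yield the scalar leaves of arbitrarily nested JSON lists, in order."""
    if isinstance(value, list):
        for item in value:
            # Recurse so that a list within a list is unpacked too.
            yield from flatten(item)
    else:
        yield value


# Hypothetical record: "tags" holds lists nested at different depths.
record = json.loads('{"id": 1, "tags": ["a", ["b", "c"], [["d"]]]}')
flat_tags = list(flatten(record["tags"]))
print(flat_tags)  # ['a', 'b', 'c', 'd']
```

Whether you want to flatten like this or keep the nesting (for example as a bag of bags in Pig) depends on your data, which is why a single generic parser rarely fits every case.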
