Alex,

It's a hack (sort of), but here's how I always do it, since parsing JSON in Java will put you in an insane asylum:
Write a map-only wukong script that parses the JSON the way you want it. See the example here:

http://thedatachef.blogspot.com/2011/01/processing-json-records-with-hadoop-and.html

Then use the STREAM operator to stream your raw records (load them as chararrays first) through your wukong script. It's not perfect, but it gets the job done.

--jacob
@thedatachef

On Sat, 2011-01-29 at 12:12 +0000, Alex McLintock wrote:
> I wonder if discussion of the Piggybank and other User Defined Fields is
> best done here (since it is *using* Pig) or on the Development list (because
> it is enhancing Pig).
>
> I'm trying to load some Json into pig using the PigJsonLoader.java UDF which
> Kim Vogt posted about back in September. (It isn't in Piggybank AFAICS)
> https://gist.github.com/601331
>
> The class works for me - mostly....
>
> This works when the Json is just a single level
>
> {"field1": "value1", "field2": "value2", "field3": "value3"}
>
> But doesn't seem to work when the json is nested
>
> {"field1": "value1", "field2": "value2", {"field4": "value4", "field5":
> "value5", "field6": "value6"}, "field3": "value3"}
>
> Has anyone got this working? I can't see how the existing code deals with
> this.
> parseStringToTuple only creates a single Map. There is no recursion I can
> see.
>
> Any suggestions?
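
For anyone who wants something concrete, here is a rough sketch of the map-only streaming approach from the reply at the top. It is not the wukong script from the linked post: it is a plain Python stand-in that does the same kind of job (read one JSON record per line on stdin, write one tab-separated row on stdout). The `FIELDS` list and the dotted-key naming are made up for illustration, and it assumes the nested object sits under a key (called `nested` here, a hypothetical name), since an object without a key is not valid JSON.

```python
#!/usr/bin/env python3
"""Map-only streaming sketch: one JSON record per input line, one TSV row out."""
import json
import sys

# Hypothetical output columns; dotted names reach into nested objects.
FIELDS = ["field1", "field2", "nested.field4", "field3"]


def flatten(obj, prefix=""):
    """Recursively flatten nested dicts into one dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            # Recurse so that any depth of nesting ends up at the top level.
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


def to_row(line, fields=FIELDS):
    """Parse one JSON line; return a tab-separated row, or None if unusable."""
    try:
        record = json.loads(line)
    except ValueError:
        # Malformed record: skip it rather than kill the whole job.
        return None
    if not isinstance(record, dict):
        return None
    flat = flatten(record)
    # Missing fields become empty strings so every row has the same width.
    return "\t".join(str(flat.get(f, "")) for f in fields)


if __name__ == "__main__":
    for raw in sys.stdin:
        row = to_row(raw)
        if row is not None:
            print(row)
```

On the Pig side, the raw lines would be loaded as a single chararray field and passed through the script with STREAM ... THROUGH, with an AS clause naming the output columns. Note that values containing tabs would need escaping, which this sketch does not attempt.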
