I've been using Twitter's Elephant Bird and have been very happy with it so far. Here's an example of parsing nested JSON with it:

json_eb = LOAD '$IN_DIRS' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') as (json:map[]);

--parse json with twitter's library
parsed0 = FOREACH json_eb GENERATE
    STRSPLIT(json#'id',':').$2 AS tweetId:chararray,
    STRSPLIT(json#'actor'#'id',':').$2 AS userId:chararray,
    json#'postedTime' AS postedTime:chararray,
    json#'twitter_entities'#'urls' AS userPostedLinks:bag{T:(urlTypes:map[])};

On Wed, May 22, 2013 at 10:01 AM, Thomas Edison <justdoit.thomas.edi...@gmail.com> wrote:

> Hi all,
>
> I have a two fields in my pig input file. Let's say product_id and
> description. Description is a JSON objects that actually describes the
> product.
>
> Is there anything in Pig other than writing a custom UDF to parse the JSON
> object so that I can have some like product_id, product_property,
> product_property_value? Product_property and product_value are parsed from
> the description JSON object. Also one product could have multiple
> product_property.
>
> Thanks.
>
> T.E.