Thanks Dmitriy, I just needed a sanity check! I've essentially done
the same thing you describe and created a UDF to do the conversion, but
of course it would be nice not to have to. I assume that other
people (like you and the other Twitter folks) are working with
JSON in Pig by reading in JSON in the first place, and never building
it in Pig as you go?
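
For anyone finding this thread later, here is a minimal sketch of the
conversion logic such a UDF might wrap. It is plain Java with no Pig
dependency, and the names ToMapSketch and toMap are hypothetical. I am
assuming a real version would extend org.apache.pig.EvalFunc and pull
the key list and the values out of the input tuple:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the logic only (hypothetical names). A real Pig UDF would
// presumably extend org.apache.pig.EvalFunc and read these arguments
// from the input Tuple instead of taking them directly.
final class ToMapSketch {

    // keyNames: whitespace-separated key names, e.g. "id foo bar"
    // values:   the remaining tuple fields, in the same order as the keys
    static Map<String, Object> toMap(String keyNames, List<?> values) {
        String[] keys = keyNames.trim().split("\\s+");
        if (keys.length != values.size()) {
            throw new IllegalArgumentException(
                "expected " + keys.length + " values but got " + values.size());
        }
        // LinkedHashMap keeps the keys in the order they were listed.
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], values.get(i));
        }
        return map;
    }

    public static void main(String[] args) {
        // Mirrors the example row from my original mail.
        Map<String, Object> json = toMap(
            "id foo bar",
            Arrays.asList("id0", "valueFromField1", "valueFromField2"));
        System.out.println(json);
    }
}
```

Passing the key names as the first argument keeps the UDF generic, so
the same function can be reused for any relation.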

I think building Maps would be a nice language feature so I'll log it
as an issue.

Cheers,

Josh



On 2010-11-26, at 11:39 PM, Dmitriy Ryaboy <[email protected]> wrote:

> I don't think we've considered building out Maps in Pig this way. You can of
> course run your data through a UDF that would take a tuple whose first
> argument is a list of key names, and invoke it like so:
>
> jsonStore = FOREACH thing GENERATE
>  toMap('id foo bar', *) AS json:map[];
>
> -D
>
> On Thu, Nov 25, 2010 at 11:53 AM, Josh Devins <[email protected]> wrote:
>
>> Hi all,
>>
>> I have a simple schema that I want to store as JSON. So I've written a
>> simple JsonStorage class but it requires that the tuple's first field is a
>> map. The problem is in converting a regular tuple into a map:
>>
>> DESCRIBE thing;
>>> thing: {id: chararray,field1: chararray,field2: chararray}
>>
>> What the map/JSON should look like:
>> { 'id': 'id0', 'foo': 'valueFromField1', 'bar': 'valueFromField2' }
>>
>> So this should work but seems to be invalid syntax:
>> jsonStore = FOREACH thing GENERATE
>>   [ 'id'#id, 'foo'#field1, 'bar'#field2 ] AS json:map[];
>>
>> ERROR 1000: Error during parsing. Encountered " "[" "[ "" at line 150,
>> column 23.
>> Was expecting one of:
>>   "flatten" ...
>>   "(" ...
>>   "-" ...
>>   "(" ...
>>   "(" ...
>>   "(" ...
>>   "(" ...
>>   "(" ...
>>
>> The only way I have this syntax working is if I use only constants in the
>> map:
>> jsonStore = FOREACH thing GENERATE
>>   [ 'id'#'const', 'foo'#'const', 'bar'#'const' ] AS json:map[];
>>
>> Is it possible to do what I'm thinking?
>>
>> Thanks,
>>
>> Josh
>>
