Attached is a tiny testcase illustrating my problem.

What I would like to know is how to filter by Pig datatype.
e.g. something like:
filtered = FILTER some_data BY some_variable IS_MAP_TYPE;

Can anyone advise if this can be accomplished with Pig?

We have a field that is sometimes a 'map' and sometimes a chararray.

Doing something like the following statement fails, presumably because it's
trying to do a key-value lookup on something that's not a 'map'.

-- json#'data' is sometimes a map, sometimes not.
trivias = FOREACH data GENERATE json#'data'#'trivia' AS trivia:chararray;

This has come about from us working with JSON data with Pig via Elephant
Bird's JsonLoader.
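
One possible direction, offered only as a sketch: as far as I know Pig Latin
has no built-in operator for testing a field's runtime type, so the check could
live in a small UDF. The Jython version below is hypothetical (the file name
typecheck.py, the alias typecheck and the function is_map are made up for
illustration) and assumes that Pig hands a map value to a Jython UDF as a
Python dict. It returns an int rather than a boolean so that it does not depend
on the Pig version having a boolean type.

```python
# typecheck.py (hypothetical file name)
#
# Assumed usage from a Pig script, untested against Elephant Bird's JsonLoader:
#   REGISTER 'typecheck.py' USING jython AS typecheck;
#   with_maps = FILTER data BY typecheck.is_map(json#'data') == 1;

# Pig supplies the outputSchema decorator when it registers the script.
# The fallback below only exists so the file can also be run outside Pig.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def decorate(func):
            return func
        return decorate


@outputSchema("is_map:int")
def is_map(value):
    # Assumption: a Pig map arrives here as a Python dict.
    # Anything else (a chararray, a null, ...) is reported as 0.
    if isinstance(value, dict):
        return 1
    return 0
```

Rows for which is_map returns 0 (such as the 'Woozle Wozzle' record) would be
dropped by the filter before any key lookup is attempted on them.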

Thanks,

Lex.
([data#Woozle Wozzle,name#dave])
([data#{trivia=One-One was a racehorse},name#steve])
datas: {data: bytearray}
(Woozle Wozzle)
([trivia#One-One was a racehorse])
trivias_subset: {bytearray}

Attachment: testcase.json
Description: application/json
