Attached is a tiny test case illustrating my problem. What I would like to know is how to filter by Pig datatype, e.g. something like:

filtered = FILTER some_data BY some_variable IS_MAP_TYPE;
Can anyone advise whether this can be accomplished with Pig? We have a field that is sometimes a map and sometimes a chararray. A statement like the following fails, presumably because it tries to do a key-value lookup on something that is not a map:

-- json#'data' is sometimes a map, sometimes not.
trivias = FOREACH data GENERATE json#'data'#'trivia' AS trivia:chararray;

This came about from working with JSON data in Pig via Elephant Bird's JsonLoader.

Thanks,
Lex.
([data#Woozle Wozzle,name#dave])
([data#{trivia=One-One was a racehorse},name#steve])
datas: {data: bytearray}
(Woozle Wozzle)
([trivia#One-One was a racehorse])
trivias_subset: {bytearray}
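
For illustration, here is a minimal sketch of one way the type test might be approximated with a Python (Jython) UDF, since I am not aware of a built-in type predicate for FILTER. It assumes that Pig hands a map value to a Jython UDF as a Python dict and a chararray as a string. The file name is_map.py, the function name is_map, and the alias typecheck are all hypothetical, and the REGISTER/FILTER lines in the comments are untested against this data.

```python
# is_map.py (hypothetical file name)
#
# Assumed usage from Pig (untested sketch):
#   REGISTER 'is_map.py' USING jython AS typecheck;
#   maps_only = FILTER datas BY typecheck.is_map(data) == 1;

try:
    # Pig makes this decorator available when the script is
    # registered USING jython.
    outputSchema
except NameError:
    # Fallback so the file can also be run outside Pig: a no-op decorator.
    def outputSchema(schema):
        def decorator(func):
            return func
        return decorator


@outputSchema("is_map:int")
def is_map(value):
    """Return 1 if the value arrived as a map (assumed to be a dict), else 0."""
    if isinstance(value, dict):
        return 1
    return 0
```

Returning an int and comparing it with 1 in the FILTER is a deliberately conservative choice, so the sketch does not depend on boolean-typed UDF results being accepted directly.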
testcase.json
Description: application/json
