I am having a problem getting Pig 0.7.0 to recognize a field that I add in a FOREACH and then pass to a UDF.
Here's the basic Pig script:
LOGS = LOAD '$INPUT' USING PigStorage('\t');
IMP_SID = FOREACH IMPRESSIONS_ONLY GENERATE *,
    (($4 == 'NULL') ? null : (chararray)$4) AS my_id:chararray;
ORD_SID = FOREACH IMP_SID GENERATE *,
    MY_LOOKUP(my_id, $2, $17) AS (
        out1:int, out2:chararray, out3:chararray);
The input is a tab-delimited file with a variable number of columns
(but always more than 16). When I run the script, I get the following error:
grunt> ORD_SID = FOREACH IMP_SID GENERATE *,
>> MY_LOOKUP(my_id, $2, $17) AS (
>> out1:int, out2:chararray, out3:chararray);
2010-06-11 21:45:04,652 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1000: Error during parsing. Invalid alias: my_id in null
Details at logfile: /tmp/pig_1276317170205.log
Why is 'my_id' an invalid alias?