Offhand, I think it's faulty DUMP behavior after the join, combined with a datatype misinterpretation. You could use STORE instead, and that might work. However, I would try a FOREACH ... GENERATE statement after C and then filter:
D = foreach C generate $0 as fvar1, $1 as fvar2, (chararray)$2 as fvar3;
E = filter D by fvar3 is null;
Dump E; -- verify the result for null
E = filter D by fvar3 is not null;
Dump E; -- verify the result for not null

Cheers,
/R

On 6/7/10 12:57 PM, "Alexander Schätzle" <[email protected]> wrote:

Hi all,

my script looks like this:

A = LOAD 'left_rel.txt' AS (var1, var2);
B = LOAD 'right_rel.txt' AS (var1, var3);
C = JOIN A BY var1 LEFT OUTER, B BY var1;
D = FILTER C BY $2 is null;
DUMP D;

But when I dump D, I get the error "Unable to store alias D". I suppose something is going wrong with the filter for null values ("is not null" doesn't work either). What I want to do is filter for the tuples in A that do not find a join partner in B.

Input files are attached.

Does anybody know what's going on and how to fix this? By the way, I'm using Cloudera Distribution for Hadoop 3 Beta with Pig 0.5.0.

Thx in advance,
Alex
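
For reference, the two suggestions in the reply (project with explicit types before filtering, and use STORE rather than DUMP) can be combined into one script. This is only a sketch and has not been run against Pig 0.5.0. The chararray types and the output directory 'unmatched_out' are assumptions made for illustration; neither appears in the original script.

```pig
-- Assumption: all fields are text, so they are declared as chararray at load time.
A = LOAD 'left_rel.txt' AS (var1:chararray, var2:chararray);
B = LOAD 'right_rel.txt' AS (var1:chararray, var3:chararray);

-- Left outer join: tuples of A without a partner in B get nulls in B's fields.
C = JOIN A BY var1 LEFT OUTER, B BY var1;

-- Project with explicit names so the filter does not rely on positional fields.
D = FOREACH C GENERATE A::var1 AS var1, A::var2 AS var2, B::var1 AS right_key;

-- Keep only the tuples of A that found no join partner in B.
E = FILTER D BY right_key IS NULL;

-- 'unmatched_out' is a hypothetical output directory.
STORE E INTO 'unmatched_out';
```

If STORE succeeds where DUMP fails, that would point at the DUMP path rather than at the null filter itself.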
