I am using Pig 0.7 with stock Apache Hadoop 0.20.2. The script works in both
local and mapreduce mode.

$ pig -d WARN test.pig
...
(c,x,,)

$ cat left_rel.txt
a       x
a       y
b       x
b       y
c       x

$ cat right_rel.txt
a       5
a       10
b       5
b       10

$ cat test.pig
A = LOAD 'left_rel.txt' AS (var1, var2);
B = LOAD 'right_rel.txt' AS (var1, var3);
C = JOIN A BY var1 LEFT OUTER, B BY var1;
D = FILTER C BY $2 is null;
DUMP D;
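
For anyone on a release where the positional FILTER fails, here is a rough sketch that combines the two workarounds suggested further down in this thread: refer to fields by name and split on the null test. The typed LOAD schemas, the B::var1 reference and the alias D_left are my own assumptions for illustration, not something from the original script, and I have not run this on 0.5 or 0.6.

```pig
-- Sketch only: the types in the LOAD schemas are assumed, not taken from the original script
A = LOAD 'left_rel.txt' AS (var1:chararray, var2:chararray);
B = LOAD 'right_rel.txt' AS (var1:chararray, var3:int);
C = JOIN A BY var1 LEFT OUTER, B BY var1;
-- Rows of A without a join partner carry nulls in every column that comes from B
SPLIT C INTO D IF B::var1 IS NULL, E IF B::var1 IS NOT NULL;
-- Keep only the columns from A (D_left is a hypothetical alias)
D_left = FOREACH D GENERATE A::var1, A::var2;
DUMP D_left;
```

With the sample files above, only the row for key c should remain in D_left, since c is the only key missing from right_rel.txt.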

- Sandip


On Mon, Jun 7, 2010 at 11:18 PM, Dmitriy Ryaboy <[email protected]> wrote:
> I can reproduce this in 0.6, and it appears to have nothing to do with your
> data or with the DUMP operator -- a simple "explain" on D causes the same
> problem. Looks like there is something wrong with how the query plan gets
> compiled:
>
> Caused by: java.lang.NullPointerException
>        at org.apache.pig.impl.plan.OperatorPlan.add(OperatorPlan.java:152)
>        at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.generateStorePlan(QueryParser.java:128)
>        at org.apache.pig.PigServer.store(PigServer.java:552)
>        ... 7 more
>
>
> Haven't tried on 0.7
>
> -D
>
>
> On Mon, Jun 7, 2010 at 5:10 AM, Alexander Schätzle <
> [email protected]> wrote:
>
>> I replaced the FILTER statement with a SPLIT:
>>
>> SPLIT C into D if var3 is null, E if var3 is not null;
>>
>> Now, this works!
>> Obviously there is a problem with null-values in the FILTER statement.
>> Does anybody know what the problem is?
>>
>> Cheers,
>> Alex
>>
>>
>>
>> ________________________________
>> From: Rekha Joshi <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Sent: Monday, June 7, 2010, 10:22:19
>> Subject: Re: Unable to store alias
>>
>> Offhand, I think it's faulty DUMP behavior after the join, combined with a
>> datatype misinterpretation. You could use STORE instead, and that might work.
>> However, I would try a FOREACH ... GENERATE statement after C and then filter:
>>
>> D = foreach C generate $0 as fvar1, $1 as fvar2, (chararray)$2 as fvar3;
>> E = filter D by fvar3 is null;
>> Dump E; -- verify results for null
>> E = filter D by fvar3 is not null;
>> Dump E; -- verify results for not null
>>
>> Cheers,
>> /R
>>
>> On 6/7/10 12:57 PM, "Alexander Schätzle" <[email protected]>
>> wrote:
>>
>> Hi all,
>>
>> my script looks like this:
>>
>> A = LOAD 'left_rel.txt' AS (var1, var2);
>> B = LOAD 'right_rel.txt' AS (var1, var3);
>> C = JOIN A BY var1 LEFT OUTER, B BY var1;
>> D = FILTER C BY $2 is null;
>> DUMP D;
>>
>> But when I dump D I get the error "Unable to store alias D".
>> I suppose something is going wrong with the filter for null values
>> ("is not null" doesn't work either).
>> What I want to do is filter for the tuples in A that do not find a join
>> partner in B.
>> Input files are attached.
>>
>> Does anybody know what's going on and how to fix this?
>> By the way, I'm using Cloudera Distribution for Hadoop 3 Beta with Pig
>> 0.5.0.
>>
>> Thx in advance,
>> Alex
>>
>>
>



-- 
http://www.pedalogue.com
