Here is a more basic script that reproduces what I am talking about. You
will see that dumping OUT works fine, but dumping OUT2 gives me a
java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.pig.data.Tuple
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:392)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:138)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:276)
-----------
my_data = LOAD 'test.txt' using PigStorage(',')
as (name:chararray, age:int, eye_color:chararray, height:int);
by_age_and_color = GROUP my_data BY (age, eye_color);
-- dump by_age_and_color;
OUT = FOREACH by_age_and_color generate group.age;
dump OUT;
OUT2 = FILTER by_age_and_color by group.age is not null;
dump OUT2;
-----------
I get a similar problem even if I do something like:
OUT2 = FILTER by_age_and_color by group.age > 9;
dump OUT2;
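A possible workaround, offered only as an untested sketch: since the
group.age projection works inside FOREACH (OUT above), the key fields
could be projected into top-level columns first and the filter applied
to those. The alias names keyed and OUT3 are made up for illustration,
and this assumes the failure is specific to dereferencing the group
tuple inside FILTER.
-----------
-- pull the group key fields up to top-level columns
-- (the same kind of projection that already works for OUT)
keyed = FOREACH by_age_and_color
        GENERATE group.age AS age, group.eye_color AS eye_color, my_data;
-- filter on the plain column instead of on group.age
OUT3 = FILTER keyed BY age is not null;
dump OUT3;
-----------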
--------- sample test.txt ---------
ravi,33,blue,43
brendan,33,green,53
ravichandra,15,blue,43
leonor,15,brown,46
caeser,18,blue,23
JCVD,,blue,23
anthony,33,blue,46
xavier,23,blue,13
patrick,18,blue,33
sang,33,brown,44
On Fri, May 20, 2011 at 3:28 PM, Daniel Dai <[email protected]> wrote:
> It seems the stack does not match your statement. Do you have another filter
> that uses "not" and "is null" in your script?
>
> Daniel
>
>
> On 05/20/2011 12:22 PM, Daniel Eklund wrote:
>
>> If I can access the implicit 'group' column from within FOREACH like this:
>>
>> GROUPED = GROUP InputRelVar by (firstDim,secondDim);
>> B = FOREACH GROUPED GENERATE group.firstDim;
>>
>> ... then should I not be able to do something like this?
>>
>> B1 = FILTER GROUPED by group.firstDim == 'something';
>>
>> I get messages like this:
>> java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:392)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:138)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:276)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POIsNull.getNext(POIsNull.java:72)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:71)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
>>
>> Interestingly, I can use the 'group' alias as a whole, like:
>> B2 = FILTER GROUPED by group is not null;
>>
>>
>> Any explanation of what I am doing incorrectly here?
>>
>> thanks,
>> daniel
>>
>
>