On second thought, probably A itself is NULL - in which case you will need a null check on A, and not on A.v (which, I think, is handled iirc).


Regards,
Mridul

On Tuesday 09 February 2010 04:02 PM, Mridul Muralidharan wrote:

Without knowing rest of the script, you could do something like :

C = FOREACH B {
    X = FILTER A BY v IS NOT NULL;
    GENERATE group, (int)AVG(X) as statsavg;
};

I am assuming it is cos there are nulls in your bag field.

Regards,
Mridul


On Tuesday 09 February 2010 03:52 PM, Alex Parvulescu wrote:
Hello,

I ran into a NPE today, which seems to be my fault, but I'm wondering if
there anythig that could be done to make the error more clear.

What I did it is:
   'C = FOREACH B GENERATE group, (int)AVG(A.v) as statsavg;'
The problem here is the AVG ran into some null values and returned null. And
consequently the cast failed with a NPE.

This is the stacktrace
2010-02-09 11:14:36,444 [Thread-85] WARN
org.apache.hadoop.mapred.LocalJobRunner - job_local_0006
java.lang.NullPointerException
      at org.apache.pig.builtin.IntAvg.getValue(IntAvg.java:282)
      at org.apache.pig.builtin.IntAvg.getValue(IntAvg.java:39)
      at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:208)
      at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:281)
      at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:182)
      at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:352)
      at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:277)
      at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:423)
      at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:391)
      at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:371)
      at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:239)
      at
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
      at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
      at
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:215)

Now, because I'm not well aware how this works, I did not realize that the
cast throws the NPE and not the computation of the average function on null
values provided by the data set.
I initially thought this was a bug in Pig.

I know the NPE is all on me, but is there anything you can do to improve the
error message

thanks,
alex


Reply via email to