No worries Ankur!

I closed the issue as a duplicate and linked it to the other one.

thanks,
alex

On Thu, Feb 11, 2010 at 11:14 AM, Ankur C. Goel <[email protected]> wrote:

> Hey Alex,
>           No problem. Please go ahead and close it as duplicate.
> Sorry if I came across as being rude (did not mean to)
>
> Regards
> -...@nkur
>
>
> On 2/11/10 3:32 PM, "Alex Parvulescu" <[email protected]> wrote:
>
> Ok, my bad
>
> Should I close this one now?
>
> alex
>
> On Thu, Feb 11, 2010 at 9:41 AM, Ankur C. Goel <[email protected]>
> wrote:
>
> > There is already a JIRA (with patch) opened for this -
> > http://issues.apache.org/jira/browse/PIG-1233
> >
> > -...@nkur
> >
> > On 2/11/10 2:01 PM, "Alex Parvulescu" <[email protected]> wrote:
> >
> > Hello,
> >
> > thanks Dmitriy!
> >
> > Wow how could I have missed that one? seems easy enough:
> > AVG( val == null ? 0 : val)
> > I'll give it a go asap :)
> >
> > Here is the Jira issue, I hope I got everything in there
> > https://issues.apache.org/jira/browse/PIG-1236
> >
> > thanks,
> > Alex
> >
> > On Tue, Feb 9, 2010 at 6:04 PM, Dmitriy Ryaboy <[email protected]> wrote:
> >
> > > This is a legit bug, I think, in the new accumulator interface
> > > implementation. Nice find, Alex. Can you open a jira?
> > >
> > > btw, I saw on your blog you had some issues with how pig was ignoring
> > > nulls when calculating average values before (this is documented and
> > > expected behavior btw), and wound up writing your own. You don't
> > > really need to:
> > >
> > > averages = foreach A generate AVG( val == null ? 0 : val);
> > >
> > >
> > > On Tue, Feb 9, 2010 at 2:57 AM, Mridul Muralidharan
> > > <[email protected]> wrote:
> > > >
> > > > > Someone from the Pig team can answer better whether there are any
> > > > > implementation issues here with average.
> > > > > But assuming there are none, if you can treat nulls as zeros, you
> > > > > could add additional checks to the statements to allow it to proceed.
> > > >
> > > > Something to check for :
> > > > a) If A == null, generate 0.
> > > > b) If A.v == null, generate 0. (This is a strong possibility too).
> > > >
> > > >
> > > > Regards,
> > > > Mridul
> > > >
> > > > On Tuesday 09 February 2010 04:08 PM, Alex Parvulescu wrote:
> > > >>
> > > >> hello Mridul,
> > > >>
> > > >> and thanks for the quick answer!
> > > >>
> > > > >> A itself is not null, just some group by values. I can't drop the
> > > > >> nulls because I also need a count in the group by, even if it's
> > > > >> only null values.
> > > > >>
> > > > >> I just wondered if there's anything to be done about the NPE to make
> > > > >> it more clear, that's all.
> > > > >>
> > > > >> I guess you can see this as an eventual feature / improvement of some
> > > > >> sort, no problems :)
> > > >>
> > > >> alex
> > > >>
> > > >> On Tue, Feb 9, 2010 at 11:35 AM, Mridul Muralidharan
> > > > >> <[email protected]> wrote:
> > > >>
> > > >>
> > > >>    On second thought, probably A itself is NULL - in which case you
> > > >>    will need a null check on A, and not on A.v (which, I think, is
> > > >>    handled iirc).
> > > >>
> > > >>
> > > >>    Regards,
> > > >>    Mridul
> > > >>
> > > >>
> > > >>    On Tuesday 09 February 2010 04:02 PM, Mridul Muralidharan wrote:
> > > >>
> > > >>
> > > >>        Without knowing rest of the script, you could do something
> like
> > :
> > > >>
> > > >>        C = FOREACH B {
> > > >>            X = FILTER A BY v IS NOT NULL;
> > > >>            GENERATE group, (int)AVG(X) as statsavg;
> > > >>        };
> > > >>
> > > >>        I am assuming it is cos there are nulls in your bag field.
> > > >>
> > > >>        Regards,
> > > >>        Mridul
> > > >>
> > > >>
> > > >>        On Tuesday 09 February 2010 03:52 PM, Alex Parvulescu wrote:
> > > >>
> > > >>            Hello,
> > > >>
> > > >>            I ran into a NPE today, which seems to be my fault, but
> I'm
> > > >>            wondering if
> > > >>            there anythig that could be done to make the error more
> > > clear.
> > > >>
> > > >>            What I did it is:
> > > >>            'C = FOREACH B GENERATE group, (int)AVG(A.v) as
> statsavg;'
> > > >>            The problem here is the AVG ran into some null values and
> > > >>            returned null. And
> > > >>            consequently the cast failed with a NPE.
> > > >>
> > > >>            This is the stacktrace
> > > >>            2010-02-09 11:14:36,444 [Thread-85] WARN
> > > >>            org.apache.hadoop.mapred.LocalJobRunner - job_local_0006
> > > >>            java.lang.NullPointerException
> > > >>                  at
> > > >> org.apache.pig.builtin.IntAvg.getValue(IntAvg.java:282)
> > > >>                  at
> > > org.apache.pig.builtin.IntAvg.getValue(IntAvg.java:39)
> > > >>                  at
> > > >>
> > > >>
> > >
> >
>  
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:208)
> > > >>                  at
> > > >>
> > > >>
> > >
> >
>  
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:281)
> > > >>                  at
> > > >>
> > > >>
> > >
> >
>  
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:182)
> > > >>                  at
> > > >>
> > > >>
> > >
> >
>  
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:352)
> > > >>                  at
> > > >>
> > > >>
> > >
> >
>  
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:277)
> > > >>                  at
> > > >>
> > > >>
> > >
> >
>  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:423)
> > > >>                  at
> > > >>
> > > >>
> > >
> >
>  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:391)
> > > >>                  at
> > > >>
> > > >>
> > >
> >
>  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:371)
> > > >>                  at
> > > >>
> > > >>
> > >
> >
>  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:239)
> > > >>                  at
> > > >>
> > > >>
> >  org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
> > > >>                  at
> > > >>
> >  org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
> > > >>                  at
> > > >>
> > > >>
> > >
>  org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:215)
> > > >>
> > > >>            Now, because I'm not well aware how this works, I did not
> > > >>            realize that the
> > > >>            cast throws the NPE and not the computation of the
> average
> > > >>            function on null
> > > >>            values provided by the data set.
> > > >>            I initially thought this was a bug in Pig.
> > > >>
> > > >>            I know the NPE is all on me, but is there anything you
> can
> > > >>            do to improve the
> > > >>            error message
> > > >>
> > > >>            thanks,
> > > >>            alex
> > > >>
> > > >>
> > > >>
> > > >>
> > > >
> > > >
> > >
> >
> >
>
>
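
For reference, the null-as-zero workaround suggested in this thread could be written out roughly as follows. This is a minimal sketch, not tested against the Pig version discussed here. The input path 'input.tsv', the schema (k, v), and the names A0 and n are assumptions made for illustration. Pig Latin tests for null with IS NULL rather than == null, and AVG takes a bag, so the substitution is done per row before grouping.

```pig
-- Hypothetical input: one (k, v) pair per line, where v may be null.
A = LOAD 'input.tsv' AS (k:chararray, v:int);

-- Replace null values with 0 before grouping, so AVG never sees a null.
A0 = FOREACH A GENERATE k, (v IS NULL ? 0 : v) AS v;

B = GROUP A0 BY k;

-- Zero-filled rows stay in the bag, so they are still part of the count.
C = FOREACH B GENERATE group, COUNT(A0) AS n, (int)AVG(A0.v) AS statsavg;

DUMP C;
```

Note that this changes the result: zero-filled rows pull the average down, whereas the built-in AVG ignores nulls. How COUNT treats null fields may depend on the Pig version, so it is worth checking against the documentation for the release in use.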
