Patch submitted. Pretty trivial, hopefully it's an adequate fix...

2011/6/17 Jonathan Coveney <jcove...@gmail.com>
> I made a Jira.
>
> https://issues.apache.org/jira/browse/PIG-2132
>
> Should be pretty easy to fix. I'll probably do so over the weekend if
> nobody else gets to it first.
>
> 2011/6/17 Alan Gates <ga...@yahoo-inc.com>
>
>> MAX should definitely handle null, and it should ignore it. The goal for
>> our SQL-like built-in aggregate functions (MIN, MAX, COUNT, SUM, AVG) is
>> to be SQL-like. SQL ignores nulls in these functions. It's inconsistent,
>> but it's usually what people want. So, we should be consistently
>> inconsistent like SQL. :)
>>
>> Alan.
>>
>> On Jun 16, 2011, at 1:07 PM, Daniel Dai wrote:
>>
>>> I take this back after seeing Dmitriy's reply. It seems it is not that
>>> straightforward.
>>>
>>> Daniel
>>>
>>> On 06/16/2011 01:00 PM, Daniel Dai wrote:
>>>
>>>> Yes, I think it is better if MAX can handle NULL. Can you open a Jira?
>>>>
>>>> Daniel
>>>>
>>>> On 06/16/2011 12:16 PM, Jonathan Coveney wrote:
>>>>
>>>>> Do we want the MAX function to be able to handle nulls? Seems fairly
>>>>> natural for it to be able to.
>>>>>
>>>>> 2011/6/16 Daniel Dai <jiany...@yahoo-inc.com>
>>>>>
>>>>>> Jonathan is right. math.MAX does not handle null input. Checking for
>>>>>> null before feeding values into MAX is necessary.
>>>>>>
>>>>>> Daniel
>>>>>>
>>>>>> On 06/16/2011 06:45 AM, Jonathan Coveney wrote:
>>>>>>
>>>>>>> Can you check if your rank2 or rank3 values are ever null? If they
>>>>>>> are, there are some ad hoc fixes which you can do until this is
>>>>>>> fixed (and it is easy to fix, just a question of deciding what the
>>>>>>> desired handling of null values should be). I would just do
>>>>>>> something like...
>>>>>>>
>>>>>>> A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
>>>>>>> -- drop rows where both ranks are null
>>>>>>> B = FILTER A BY NOT (rank2 is null AND rank3 is null);
>>>>>>> -- replace a single null rank with the other rank
>>>>>>> C = FOREACH B GENERATE appID,
>>>>>>>     (rank2 is null ? rank3 : rank2) as rank2,
>>>>>>>     (rank3 is null ? rank2 : rank3) as rank3;
>>>>>>>
>>>>>>> Obviously you could tweak that for whatever you want to happen if a
>>>>>>> value is null.
>>>>>>>
>>>>>>> 2011/6/16 Jonathan Coveney <jcove...@gmail.com>
>>>>>>>
>>>>>>>> Hm, just to make sure, I ran this against trunk (to see if it's
>>>>>>>> just a 0.7.0 thing or not).
>>>>>>>>
>>>>>>>> A = LOAD 'test.txt'; --this is just a blank one line file
>>>>>>>> B = FOREACH A GENERATE org.apache.pig.piggybank.evaluation.math.MAX(1,null);
>>>>>>>>
>>>>>>>> I also tested feeding it files from test.txt etc. It fails when
>>>>>>>> there is a null value. The cast does not.
>>>>>>>>
>>>>>>>> 2011/6/16 Lakshminarayana Motamarri <narayana.gupta...@gmail.com>
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I am receiving the following exception:
>>>>>>>>>
>>>>>>>>> org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]
>>>>>>>>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
>>>>>>>>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:263)
>>>>>>>>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:269)
>>>>>>>>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
>>>>>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:249)
>>>>>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240)
>>>>>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
>>>>>>>>>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>>>>>>>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>>>>>>>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>>>>>>>>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>>>>>>> Caused by: java.io.IOException: Caught exception processing input row [null]
>>>>>>>>>     at org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:70)
>>>>>>>>>     at org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:57)
>>>>>>>>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:201)
>>>>>>>>>     ... 10 more
>>>>>>>>> Caused by: java.lang.NullPointerException
>>>>>>>>>     ... 13 more
>>>>>>>>>
>>>>>>>>> My code:
>>>>>>>>>
>>>>>>>>> FFW2 = Load 'final_free_w2.txt';
>>>>>>>>> FFW3 = Load 'final_free_w3.txt';
>>>>>>>>> FFW2_RankG_RankCate = FOREACH FFW2 GENERATE $0, $4, $3;
>>>>>>>>> FFW3_RankG_RankCate = FOREACH FFW3 GENERATE $0, $4, $3;
>>>>>>>>> FF23 = JOIN FFW2_RankG_RankCate BY $0, FFW3_RankG_RankCate BY $0;
>>>>>>>>> FF23_Filtered = Foreach FF23 Generate $0,$2,$5;
>>>>>>>>> STORE FF23_Filtered INTO 'FF23_Filtered.txt';
>>>>>>>>>
>>>>>>>>> REGISTER /home/training/Desktop/1pig/pig-0.7.0/contrib/piggybank/piggybank.jar
>>>>>>>>> A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
>>>>>>>>> B = FOREACH A GENERATE appID,
>>>>>>>>>     org.apache.pig.piggybank.evaluation.math.MAX((double)rank2, (double)rank3);
>>>>>>>>> store B into 'FF23_FJM.txt';
>>>>>>>>>
>>>>>>>>> Can anyone please let me know the exact reason for the above
>>>>>>>>> exception? I also made sure that the file FF23_Filtered.txt is
>>>>>>>>> not NULL.
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>> Thanks & Regards,
>>>>>>>>> Narayan.
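
For reference, the null handling Alan suggests (ignore a null operand rather than fail on it) could look roughly like the plain-Java sketch below. This is only an illustration under stated assumptions: the class and method names are hypothetical, it does not use the Pig UDF API, and it is not the patch attached to PIG-2132.

```java
// Hypothetical helper for illustration only; NOT the actual PIG-2132 patch
// and not part of piggybank. It shows a two-argument max that ignores a
// null operand instead of throwing a NullPointerException.
final class NullSafeMax {
    private NullSafeMax() {}

    /** Returns the larger non-null argument, or null if both are null. */
    static Double max(Double a, Double b) {
        if (a == null) {
            return b; // b may itself be null when both inputs are null
        }
        if (b == null) {
            return a;
        }
        return Math.max(a, b); // both present: ordinary numeric max
    }
}
```

With this behaviour, max(1.0, null) yields 1.0, and a row where both ranks are null yields null instead of an exception.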