MySQL has a function called "greatest" which returns the maximum of several values (as opposed to max, which is an aggregate function over a column). Here's what it returns:

select greatest(1, 2)
2

select greatest(1, null)
null

On the other hand, the max aggregate function returns 2 when a table column has 3 rows, with values (null, 1, 2). So much for consistency.

So what's the answer here? I have no idea. Erroring on underspecified behaviors and letting users handle null cases in whatever way makes sense to them at least doesn't cause bizarre hard-to-find data bugs 12 hours into a 27-step computation.

D

On Thu, Jun 16, 2011 at 12:16 PM, Jonathan Coveney <jcove...@gmail.com> wrote:

> Do we want the Max function to be able to handle nulls? Seems fairly natural
> for it to be able to.
>
> 2011/6/16 Daniel Dai <jiany...@yahoo-inc.com>
>
>> Jonathan is right. math.MAX does not handle null input. Checking for null
>> before feeding into MAX is necessary.
>>
>> Daniel
>>
>>
>> On 06/16/2011 06:45 AM, Jonathan Coveney wrote:
>>
>>> Can you check if your rank2 or rank3 values are ever null? If they are,
>>> there are some ad hoc fixes which you can do until this is fixed (and it
>>> is easy to fix, just a question of deciding what the desired handling of
>>> null values should be). I would just do something like...
>>>
>>> A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
>>> B = FILTER A BY NOT (rank2 is null AND rank3 is null);
>>> C = FOREACH B GENERATE appID, (rank2 is null ? rank3 : rank2) as rank2,
>>>     (rank3 is null ? rank2 : rank3) as rank3;
>>>
>>> Obviously you could tweak that for whatever you want to happen if a value
>>> is null.
>>>
>>> 2011/6/16 Jonathan Coveney<jcove...@gmail.com>
>>>
>>>> Hm, just to make sure, I ran this against trunk (to see if it's just a
>>>> 0.7.0 thing or not).
>>>>
>>>> A = LOAD 'test.txt'; --this is just a blank one line file
>>>> B = FOREACH A GENERATE
>>>>     org.apache.pig.piggybank.evaluation.math.MAX(1,null);
>>>>
>>>> I also tested feeding it files from test.txt etc. It fails when there is
>>>> a null value. The cast does not.
>>>>
>>>> 2011/6/16 Lakshminarayana Motamarri <narayana.gupta...@gmail.com>
>>>>
>>>>> Hi all,
>>>>>
>>>>> *I am receiving the following exception:*
>>>>> org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]
>>>>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:263)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:269)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:249)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
>>>>>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>>>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>>>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>>>>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>>> Caused by: java.io.IOException: Caught exception processing input row [null]
>>>>>     at org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:70)
>>>>>     at org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:57)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:201)
>>>>>     ... 10 more
>>>>> Caused by: java.lang.NullPointerException
>>>>>     ... 13 more
>>>>>
>>>>> *My Code:*
>>>>> FFW2 = Load 'final_free_w2.txt';
>>>>> FFW3 = Load 'final_free_w3.txt';
>>>>> FFW2_RankG_RankCate = FOREACH FFW2 GENERATE $0, $4, $3;
>>>>> FFW3_RankG_RankCate = FOREACH FFW3 GENERATE $0, $4, $3;
>>>>> FF23 = JOIN FFW2_RankG_RankCate BY $0, FFW3_RankG_RankCate BY $0;
>>>>> FF23_Filtered = Foreach FF23 Generate $0,$2,$5;
>>>>> STORE FF23_Filtered INTO 'FF23_Filtered.txt';
>>>>>
>>>>> REGISTER /home/training/Desktop/1pig/pig-0.7.0/contrib/piggybank/piggybank.jar
>>>>> A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
>>>>> B = FOREACH A GENERATE appID,
>>>>>     org.apache.pig.piggybank.evaluation.math.MAX((double)rank2, (double)rank3);
>>>>> store B into 'FF23_FJM.txt';
>>>>>
>>>>> Can anyone please let me know the exact reason for the above
>>>>> exception? I also made sure that the file *FF23_Filtered.txt* is not
>>>>> NULL.
>>>>>
>>>>> ---
>>>>> Thanks & Regards,
>>>>> Narayan.
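
For anyone weighing the options raised in this thread, here is a rough sketch of the three null conventions side by side, written in plain Python rather than as a Pig UDF. All three function names are made up for illustration and are not part of Pig, piggybank, or MySQL; None stands in for null. The first mirrors what greatest does in the MySQL example, the second mirrors the max aggregate, and the third is the "error and let the user decide" approach.

```python
from typing import Optional


def greatest_propagating_null(*values: Optional[float]) -> Optional[float]:
    """greatest()-style: if any argument is null (None), the result is null."""
    if not values or any(v is None for v in values):
        return None
    return max(values)


def max_skipping_null(*values: Optional[float]) -> Optional[float]:
    """Aggregate-max-style: nulls are ignored; null only if no values remain."""
    present = [v for v in values if v is not None]
    return max(present) if present else None


def max_rejecting_null(*values: Optional[float]) -> float:
    """Fail-fast style: raise on the first null so the caller must handle it."""
    for position, v in enumerate(values):
        if v is None:
            raise ValueError(f"argument {position} is null")
    return max(values)


if __name__ == "__main__":
    print(greatest_propagating_null(1, 2))     # 2
    print(greatest_propagating_null(1, None))  # None
    print(max_skipping_null(None, 1, 2))       # 2
    try:
        max_rejecting_null(1, None)
    except ValueError as err:
        print("rejected:", err)                # rejected: argument 1 is null
```

The first two never fail but silently pick an answer for the null case. The third forces the decision back onto whoever wrote the script, which is the trade-off being argued at the top of this message.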