Patch submitted. Pretty trivial, hopefully it's an adequate fix...
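
For anyone following along, here is a minimal sketch of the null-ignoring behaviour Alan describes below. It is not the PIG-2132 patch itself and does not use the Pig UDF API; the class name NullSafeMax and both method signatures are hypothetical, chosen only to illustrate the idea in plain Java.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not the piggybank DoubleMax class.
public class NullSafeMax {

    // Two-argument max that ignores a null operand, as SQL's MAX does.
    // Returns null only when both inputs are null.
    public static Double max(Double a, Double b) {
        if (a == null) {
            return b;
        }
        if (b == null) {
            return a;
        }
        // Both operands are non-null here, so unboxing cannot throw.
        return Math.max(a, b);
    }

    // Aggregate form: skips nulls and returns null if no value was non-null.
    public static Double max(List<Double> values) {
        Double result = null;
        for (Double v : values) {
            result = max(result, v);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Double> ranks = Arrays.asList(3.0, null, 7.5);
        System.out.println(max(1.0, null));   // prints 1.0
        System.out.println(max(null, null));  // prints null
        System.out.println(max(ranks));       // prints 7.5
    }
}
```

Calling Math.max directly on a null Double would fail during unboxing with a NullPointerException, which matches the last "Caused by" in the trace further down.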

2011/6/17 Jonathan Coveney <jcove...@gmail.com>

> I made a Jira.
>
> https://issues.apache.org/jira/browse/PIG-2132
>
> Should be pretty easy to fix. I'll probably do so over the weekend if
> nobody else gets to it first.
>
>
> 2011/6/17 Alan Gates <ga...@yahoo-inc.com>
>
>> MAX should definitely handle null, and it should ignore it.  The goal for
>> our SQL like built in aggregate functions (MIN, MAX, COUNT, SUM, AVG) is to
>> be SQL like.  SQL ignores nulls in these functions.  It's inconsistent, but
>> it's usually what people want.  So, we should be consistently inconsistent like
>> SQL. :)
>>
>> Alan.
>>
>>
>> On Jun 16, 2011, at 1:07 PM, Daniel Dai wrote:
>>
>>> I take this back after seeing Dmitriy's reply. It seems it is not that
>>> straightforward.
>>>
>>> Daniel
>>>
>>> On 06/16/2011 01:00 PM, Daniel Dai wrote:
>>>
>>>> Yes, I think it is better if MAX can handle NULL. Can you open a Jira?
>>>>
>>>> Daniel
>>>>
>>>> On 06/16/2011 12:16 PM, Jonathan Coveney wrote:
>>>>
>>>>> Do we want the Max function to be able to handle nulls? Seems fairly
>>>>> natural for it to be able to.
>>>>>
>>>>> 2011/6/16 Daniel Dai <jiany...@yahoo-inc.com>
>>>>>
>>>>>> Jonathan is right. math.MAX does not handle null input. Checking for null
>>>>>> before feeding values into MAX is necessary.
>>>>>>
>>>>>> Daniel
>>>>>>
>>>>>>
>>>>>> On 06/16/2011 06:45 AM, Jonathan Coveney wrote:
>>>>>>
>>>>>>> Can you check if your rank2 or rank3 values are ever null? If they are,
>>>>>>> there are some ad hoc fixes which you can do until this is fixed (and it
>>>>>>> is easy to fix, just a question of deciding what the desired handling of
>>>>>>> null values should be). I would just do something like...
>>>>>>>
>>>>>>> A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
>>>>>>> -- drop rows where both ranks are null, then fill a single null from the other rank
>>>>>>> B = FILTER A BY NOT (rank2 is null AND rank3 is null);
>>>>>>> C = FOREACH B GENERATE appID, (rank2 is null ? rank3 : rank2) AS rank2, (rank3 is null ? rank2 : rank3) AS rank3;
>>>>>>>
>>>>>>> Obviously you could tweak that for whatever you want to happen if a
>>>>>>> value is null.
>>>>>>>
>>>>>>> 2011/6/16 Jonathan Coveney <jcove...@gmail.com>
>>>>>>>
>>>>>>>> Hm, just to make sure, I ran this against trunk (to see if it's just a
>>>>>>>> 0.7.0 thing or not).
>>>>>>>>
>>>>>>>> A = LOAD 'test.txt'; --this is just a blank one line file
>>>>>>>> B = FOREACH A GENERATE org.apache.pig.piggybank.evaluation.math.MAX(1,null);
>>>>>>>>
>>>>>>>> I also tested feeding it files from test.txt etc. It fails when there is
>>>>>>>> a null value. The cast itself does not fail.
>>>>>>>>
>>>>>>>> 2011/6/16 Lakshminarayana Motamarri <narayana.gupta...@gmail.com>
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> *I am receiving the following exception:*
>>>>>>>>> org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]
>>>>>>>>>    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
>>>>>>>>>    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:263)
>>>>>>>>>    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:269)
>>>>>>>>>    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
>>>>>>>>>    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:249)
>>>>>>>>>    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240)
>>>>>>>>>    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
>>>>>>>>>    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>>>>>>>>    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>>>>>>>>>    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>>>>>>>>    at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>>>>>>> Caused by: java.io.IOException: Caught exception processing input row [null]
>>>>>>>>>    at org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:70)
>>>>>>>>>    at org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:57)
>>>>>>>>>    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:201)
>>>>>>>>>    ... 10 more
>>>>>>>>> Caused by: java.lang.NullPointerException
>>>>>>>>>    ... 13 more
>>>>>>>>>
>>>>>>>>> *My Code:*
>>>>>>>>> FFW2 = Load 'final_free_w2.txt';
>>>>>>>>> FFW3 = Load 'final_free_w3.txt';
>>>>>>>>> FFW2_RankG_RankCate = FOREACH FFW2 GENERATE $0, $4, $3;
>>>>>>>>> FFW3_RankG_RankCate = FOREACH FFW3 GENERATE $0, $4, $3;
>>>>>>>>> FF23 = JOIN FFW2_RankG_RankCate BY $0, FFW3_RankG_RankCate BY $0;
>>>>>>>>> FF23_Filtered = Foreach FF23 Generate $0,$2,$5;
>>>>>>>>> STORE FF23_Filtered INTO 'FF23_Filtered.txt';
>>>>>>>>>
>>>>>>>>> REGISTER /home/training/Desktop/1pig/pig-0.7.0/contrib/piggybank/piggybank.jar
>>>>>>>>> A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
>>>>>>>>> B = FOREACH A GENERATE appID, org.apache.pig.piggybank.evaluation.math.MAX((double)rank2, (double)rank3);
>>>>>>>>> store B into 'FF23_FJM.txt';
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --> Can anyone please let me know the exact reason that is causing
>>>>>>>>> the above exception?
>>>>>>>>> I also made sure that the file *FF23_Filtered.txt* is not NULL.
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>> Thanks & Regards,
>>>>>>>>> Narayan.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>
>>
>
