[ 
https://issues.apache.org/jira/browse/PIG-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel resolved PIG-5230.
-----------------------------------
    Resolution: Fixed

> Fix the RuntimeException throws in SecondaryKeySortUtil
> -------------------------------------------------------
>
>                 Key: PIG-5230
>                 URL: https://issues.apache.org/jira/browse/PIG-5230
>             Project: Pig
>          Issue Type: Bug
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>         Attachments: PIG-5230.2.patch, PIG-5230.patch
>
>
> there is possibility that [curKey is null| 
> https://github.com/apache/pig/blob/63968e3132ad1fee06dffcacb8ea5d399e0edef5/src/org/apache/pig/backend/hadoop/executionengine/spark/converter/SecondaryKeySortUtil.java#L116]
>  after PIG-5164.  we should remove the code to avoid RuntimeException.
> following script can trigger the exception.
> {code}
> a = load './studenttab10k.mk1' as (name, age:int, gpa:float);
> a1 = filter a by gpa is null or gpa >= 3.9;
> a2 = filter a by gpa < 2;
> b = union a1, a2;
> c = load './voternulltab10k' as (name, age, registration, contributions);
> d = join b by name left outer, c by name using 'replicated';
> e = stream d through `cat` as (name, age, gpa, name1, age1, registration, 
> contributions);
> f = foreach e generate name, age, gpa, registration, contributions;
> g = group f by name;
> g1 = group f by name; -- Two separate groupbys to ensure secondary key 
> partitioner
> h = foreach g { 
>     inner1 = order f by age, gpa, registration, contributions;
>     inner2 = limit inner1 1;
>     generate inner2, SUM(f.age); };
> i = foreach g1 {
>     inner1 = order f by age asc, gpa desc, registration asc, contributions 
> desc;
>     inner2 = limit inner1 1;
>     generate inner2, SUM(f.age); };
> store h into './MultiQuery_Union_3.1.out';
> store i into './MultiQuery_Union_3.2.out';
> {code}
> cat studenttab10k.mk1
> {code}
> ulysses thompson      64      1.90
> katie carson  25      3.65
>       65      0.73
> holly davidson        57      2.43
> fred miller   55      3.77
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to