[
https://issues.apache.org/jira/browse/PIG-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090938#comment-15090938
]
Daniel Dai commented on PIG-4774:
---------------------------------
Do you mind adding the script as a test case?
> Fix NPE in AlgebraicMathBase UDFs
> ---------------------------------
>
> Key: PIG-4774
> URL: https://issues.apache.org/jira/browse/PIG-4774
> Project: Pig
> Issue Type: Bug
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4774-1-withoutwhitespacechanges.patch,
> PIG-4774-1.patch
>
>
> To address the UDF backward compatibility issue introduced by the
> POStatus.STATUS_NULL refactoring, PIG-4184 fixed the UDFs to handle null by
> adding an input.get(0) == null check to all of them. UDFs extending
> AlgebraicMathBase, which handle SUM, COUNT, etc., were not fixed.
> Script to reproduce the NPE is below. It is an odd usage, doing the
> aggregation after a join instead of right after the group by, which one user
> was doing; rewriting the script to move the aggregation right after the group
> by fixed the NPE. This might be rare, but there might be other cases where
> users call those functions on a bag directly without a group by, which might
> cause nulls to be passed to them.
> A = LOAD '/tmp/data' as (f1:int, f2:int, f3:int);
> B = LOAD '/tmp/data1' as (f1:int, f2:int, f3:int);
> A1 = GROUP A by f1;
> A2 = FOREACH A1 GENERATE group as f1, $1;
> C = JOIN B by f1 LEFT, A2 by f1;
> D = FOREACH C GENERATE B::f1, (double)SUM(A2::A.f3)/SUM(A2::A.f2);
> STORE D into '/tmp/out';
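For illustration only, here is a minimal sketch of the kind of null guard the description refers to. It is not the attached patch and it does not use Pig's real Tuple or DataBag classes: the class name NullSafeSumSketch, the exec signature, and the use of java.util.List as a stand-in for both the input tuple and the bag are all assumptions made so the example is self-contained. The idea is that a left join can leave the bag field null for unmatched rows, so the function checks input.get(0) == null before touching the bag.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a SUM-style UDF. Plain Lists replace Pig's
// Tuple and DataBag so the sketch compiles without Pig on the classpath.
final class NullSafeSumSketch {
    private NullSafeSumSketch() {}

    // 'input' plays the role of the UDF input tuple; field 0 holds the bag.
    static Long exec(List<Object> input) {
        // Guard in the style described for PIG-4184: a null first field
        // produces a null result instead of a NullPointerException.
        if (input == null || input.isEmpty() || input.get(0) == null) {
            return null;
        }
        @SuppressWarnings("unchecked")
        List<Integer> bag = (List<Integer>) input.get(0);
        long total = 0;
        boolean sawValue = false;
        for (Integer value : bag) {
            // Null entries inside the bag are skipped rather than summed.
            if (value != null) {
                total += value;
                sawValue = true;
            }
        }
        // A bag with no non-null values yields null, not zero.
        return sawValue ? total : null;
    }
}
```

With this guard, a row such as an unmatched B record after the LEFT join (bag field null) would produce a null sum rather than failing, while a bag holding 1, 2 and 3 would be summed as usual.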