skambha commented on issue #27627: [WIP][SPARK-28067][SQL] Fix incorrect 
results for decimal aggregate sum by returning null on decimal overflow
URL: https://github.com/apache/spark/pull/27627#issuecomment-595411413
 
 
   
   In the repro (that has the join and then the agg),  we also see the null 
being written out via the UnsafeRowWriter that is mentioned in my earlier 
comment with codegen details 
here.https://github.com/apache/spark/pull/27627#issuecomment-591185451
   
   Codegen and the plan details from 
https://github.com/apache/spark/pull/27627#issuecomment-591185451 here:
   
https://github.com/skambha/notes/blob/master/spark28067_ansifalse_wholestagetrue.txt
   
   
   ```
   scala> df2.queryExecution.debug.codegen
   Found 2 WholeStageCodegen subtrees.
   == Subtree 1 / 2 (maxMethodCodeSize:173; maxConstantPoolSize:145(0.22% 
used); numInnerClasses:0) ==
   *(2) HashAggregate(keys=[], functions=[sum(decNum#14)], 
output=[sum(decNum)#23])
   +- Exchange SinglePartition, true, [id=#77]
      +- *(1) HashAggregate(keys=[], functions=[partial_sum(decNum#14)], 
output=[sum#29])
         +- *(1) Project [decNum#14]
            +- *(1) BroadcastHashJoin [intNum#8], [intNum#15], Inner, BuildLeft
               :- BroadcastExchange 
HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint))), 
[id=#65]
               :  +- LocalTableScan [intNum#8]
               +- *(1) LocalTableScan [decNum#14, intNum#15]
   ```
   
   
   The relevant code lines of interest are the following: 
   - Addition of two decimal values ( expression coming from sum)  that results 
in value that cannot be contained. 
   - Writing a big overflow decimal using UnsafeWriter —> which will write null 
silently.
   
   Codegen id1: 
   /* 108 */         agg_value_3 = agg_value_4.$plus(agg_expr_0_0);
   /* 113 */         agg_value_2 = agg_value_3;
   /* 126 */     agg_mutableStateArray_0[0] = agg_value_2;
   /* 147 */         bhj_mutableStateArray_0[3].write(0, 
agg_mutableStateArray_0[0], 38, 18);
   
https://github.com/skambha/notes/blob/master/spark28067_ansifalse_wholestagetrue.txt#L336
   
https://github.com/skambha/notes/blob/master/spark28067_ansifalse_wholestagetrue.txt#L354
   
https://github.com/skambha/notes/blob/master/spark28067_ansifalse_wholestagetrue.txt#L375
   
   Codegen id2:
   /* 080 */         agg_value_5 = agg_value_6.$plus(agg_expr_0_0);
   /* 091 */         agg_value_4 = agg_mutableStateArray_0[0];
   /* 098 */     agg_mutableStateArray_0[0] = agg_value_4;
   /* 128 */         agg_mutableStateArray_1[0].write(0, agg_value_1, 38, 18);
   
https://github.com/skambha/notes/blob/master/spark28067_ansifalse_wholestagetrue.txt#L164
   
https://github.com/skambha/notes/blob/master/spark28067_ansifalse_wholestagetrue.txt#L212
   
   - For both partial and final stage of aggregate, we can see the same issue 
that can happen.  
   
   @cloud-fan,  given the above pieces of code, why would the problem not exist 
for the partial aggregate (updateExpression) case? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to