HeartSaVioR opened a new pull request #27747: [SPARK-30993][SQL] Use its sql 
type for UDT when checking the type of length (fixed/var) or mutable
URL: https://github.com/apache/spark/pull/27747
 
 
   ### What changes were proposed in this pull request?
   
   This patch fixes the bug of UnsafeRow which misses to handle the UDT 
specifically, in `isFixedLength` and `isMutable`. These methods don't check its 
SQL type for UDT, treating UDT as variable-length, and non-mutable.
   
   It doesn't bring any issue if UDT is used to represent complicated 
(non-primitive) type, but when UDT is used to represent some type which is 
matched with primitive SQL type, it exposes the chance of correctness issues, 
as these informations represent how the value should be handled.
   
   We got report from user mailing list which suspected as mapGroupsWithState 
looks like handling UDT incorrectly, but after some investigation it was from 
GenerateUnsafeRowJoiner in shuffle phase.
   
   
https://github.com/apache/spark/blob/0e2ca11d80c3921387d7b077cb64c3a0c06b08d7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeRowJoiner.scala#L32-L43
   
   Here updating position should not happen on fixed-length column, but due to 
this bug, the value of all UDT would be modified, which actually corrupts the 
value.
   
   ### Why are the changes needed?
   
   Misclassifying of the type of length for UDT can corrupt the value when the 
row is presented to the input of GenerateUnsafeRowJoiner, which brings 
correctness issue.
   
   ### Does this PR introduce any user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   New UT added.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to