[GitHub] [spark] BrennanStein opened a new pull request, #37659: [SPARK-40212] [SQL] SparkSQL castPartValue does not properly handle byte & short

GitBox Thu, 25 Aug 2022 07:24:35 -0700


BrennanStein opened a new pull request, #37659:
URL: https://github.com/apache/spark/pull/37659


   ### What changes were proposed in this pull request?
   The `castPartValueToDesiredType` function now returns a `Byte` for 
`ByteType` and a `Short` for `ShortType`, rather than an `Int` in both cases.
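
   As a rough illustration of the idea (not the actual Spark code), the sketch 
below parses a raw partition value into the JVM type that matches the column 
type. `PartType`, `ByteT`, `ShortT`, `IntT`, and `castPartValue` are 
hypothetical names made up for this example; they stand in for Spark's data 
types and for `castPartValueToDesiredType`.

   ```scala
   // Hypothetical, simplified stand-ins for Spark's ByteType/ShortType/IntegerType.
   sealed trait PartType
   case object ByteT extends PartType
   case object ShortT extends PartType
   case object IntT extends PartType

   // Hypothetical helper, NOT Spark's implementation: parse the raw string
   // taken from a partition directory name into the type the column expects.
   // The explicit `Any` result type keeps each branch boxed as its own type
   // instead of letting the compiler widen every branch to Int.
   def castPartValue(raw: String, desired: PartType): Any = desired match {
     case ByteT  => raw.toByte   // boxed as java.lang.Byte
     case ShortT => raw.toShort  // boxed as java.lang.Short
     case IntT   => raw.toInt    // boxed as java.lang.Integer
   }
   ```

   With this shape, a value destined for a byte column is already a 
`java.lang.Byte` by the time any downstream code reads it.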
   
   ### Why are the changes needed?
   Previously, reading back a file partitioned on a byte-type column failed 
at runtime with `java.lang.ClassCastException: java.lang.Integer cannot be 
cast to java.lang.Byte`, presumably because this function returned an integer 
rather than a byte (or short). I can't see this being anything but a bug, as 
returning the correct data type prevents the crash.
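
   The failure mode can be reproduced in plain Scala, without Spark. This is 
a minimal sketch that assumes the partition value ends up in an untyped 
(`Any`) slot and is later read back as a `Byte`; the variable names are 
illustrative only.

   ```scala
   import scala.util.{Success, Try}

   // A partition value parsed as Int and stored in an untyped slot.
   val parsedAsInt: Any = "7".toInt

   // Reading the slot back as a Byte throws ClassCastException, because a
   // boxed java.lang.Integer cannot be unboxed as a java.lang.Byte.
   val failure: Try[Byte] = Try {
     val b: Byte = parsedAsInt.asInstanceOf[Byte]
     b
   }

   // Parsing straight to Byte makes the same read succeed.
   val parsedAsByte: Any = "7".toByte
   val ok: Try[Byte] = Try {
     val b: Byte = parsedAsByte.asInstanceOf[Byte]
     b
   }
   ```

   Here `failure` holds the `ClassCastException`, while `ok` holds the byte 
value 7.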
   
   ### Does this PR introduce _any_ user-facing change?
   Yes: reading in a byte/short-partitioned file no longer fails with the 
`ClassCastException` described above.
   
   ### How was this patch tested?
   Added a unit test. Without the `castPartValueToDesiredType` updates, the 
test fails with the stated exception.
   
   ===
   I'll note that I'm not familiar enough with the Spark repo to know whether 
this will have ripple effects elsewhere, but tests pass on my fork, and since 
the very similar https://github.com/apache/spark/pull/36344/files only needed 
to touch these two files, I expect this change is self-contained as well.

