chenhao-db opened a new pull request, #46549:
URL: https://github.com/apache/spark/pull/46549

   ### What changes were proposed in this pull request?
   
   The PR https://github.com/apache/spark/pull/46338 found that `schema_of_variant` 
sometimes could not handle variant decimals correctly and included a fix. However, 
that fix is incomplete and `schema_of_variant` can still fail on some inputs. The 
reason is that `VariantUtil.getDecimal` calls `stripTrailingZeros`. For an input 
decimal `10.00`, the resulting unscaled value is 1 and the scale is -1, but Spark 
does not allow a negative decimal scale. The correct approach is to construct a 
`Decimal` from the `BigDecimal` and read its precision and scale, as `VariantGet` 
does.
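   The negative-scale behavior can be reproduced in plain Java (a standalone sketch, not the actual Spark code; the class name is made up, and clamping the scale to a non-negative value is shown only to illustrate why `10.00` maps to `DECIMAL(2,0)`):

```java
import java.math.BigDecimal;

public class NegativeScaleDemo {
    public static void main(String[] args) {
        // stripTrailingZeros turns 10.00 into 1E+1: unscaled value 1, scale -1.
        BigDecimal stripped = new BigDecimal("10.00").stripTrailingZeros();
        System.out.println(stripped.unscaledValue()); // 1
        System.out.println(stripped.scale());         // -1

        // Spark rejects negative decimal scales, so the SQL type must come from
        // a representation with a non-negative scale. Clamping the scale to 0
        // yields 10, i.e. precision 2 and scale 0 -> DECIMAL(2,0).
        BigDecimal normalized = stripped.setScale(Math.max(stripped.scale(), 0));
        System.out.println(normalized.precision() + "," + normalized.scale()); // 2,0
    }
}
```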
   
   This PR also includes a minor change to `VariantGet`, where the same expression 
was previously computed twice.
   
   ### Why are the changes needed?
   
   They are bug fixes and are required to process decimals correctly.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   More unit tests. Some of them would fail without the fix in this PR 
(e.g., `check("10.00", "DECIMAL(2,0)")`); the others would not fail, but they 
improve test coverage.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
