Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/22450
Let me answer all 3 of your points:
> how to prove Divide is the only one having problems of negative scale?
You can check the other operations. Let's go through all of them:
- Add and Subtract: the result type never uses `p1` or `p2` alone, but
always `p1-s1` and `p2-s2`, so a negative scale behaves the same as scale 0
with a correspondingly adjusted precision;
- Multiply: this is the easiest to handle. A negative scale can only make
the scale of the result negative too, and that is fine: in multiplication the
scale just shifts the result to the right/left, which is exactly what a
negative scale does. There are no issues with it.
- Remainder/Pmod: same reasoning as Add and Subtract.
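To make the first two bullets concrete, here is a hedged sketch of the usual add/subtract and multiply result-type rules. The object and method names and the `(precision, scale)` tuple return type are illustrative, not Spark's actual internals:

```scala
// Illustrative sketch of the standard decimal result-type rules; names and
// signatures are assumptions, not Spark's exact API.
object DecimalRules {
  // Add/Subtract: p1 and p2 never appear alone, only p1 - s1 and p2 - s2,
  // so a negative scale acts like scale 0 with adjusted precision.
  def addResultType(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) = {
    val scale = math.max(s1, s2)
    val precision = math.max(p1 - s1, p2 - s2) + scale + 1
    (precision, scale)
  }

  // Multiply: a negative input scale simply makes the result scale
  // negative too, shifting the result; no other effect.
  def multiplyResultType(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) =
    (p1 + p2 + 1, s1 + s2)
}
```

For instance, `Decimal(5, -2)` represents the same values as `Decimal(7, 0)`, and the add rule above indeed produces the same result type when either is combined with a `Decimal(4, 1)` operand.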
> how to prove the fix here is corrected and covers all the corner cases?
We can add more and more test cases, but let's check the logic of the
divide rule. The precision has 2 parts: the digits before the decimal point -
ie. intDigits - (`p1-s1+s2`) and the digits after - ie. the scale - (`max(6, s1
+ p2 + 1)`) - which are summed; and then we have the scale itself. The intDigits
are fine for the 1st operand, as the `p1-s1` term natively handles a
negative scale, but if the second operand has a negative scale, the number of
intDigits can become negative. This is not something we can allow, as we would
end up with a precision lower than the scale, which we don't support. Hence, we
need to set the intDigits to 0 if they become negative. As far as the scale is
concerned, we already have a guard against a negative scale, as we take the max
of it and 6. But we have another problem, ie. we are using `p2` alone. So in
case `s2` is negative, in order to get the same `p2` we would have had if
negative scales were avoided, we need to adjust it to `p2 - s2`. No other
cases are present.
> Do you think it makes sense for
spark.sql.decimalOperations.allowPrecisionLoss to also toggle how literal
promotion happens (the old way vs. the new way)?
I think what we can do is always forbid negative scales when handling
literal promotion, regardless of the value of that flag. I think this can
safely be done, as negative scales were not supported in this case before 2.3.
Anyway, this would only reduce the number of cases where the problem can
occur, so it is not a definitive solution. If you want, I can do this in the
scope of this PR, or I can create a new one, or we can do it in the PR you
just closed.