Github user JihongMA commented on the pull request:
https://github.com/apache/spark/pull/7056#issuecomment-117923839
Thanks for fixing this division problem. After rebasing with the fix, I noticed one more issue w.r.t. the accuracy of Decimal computation:
scala> val aa = Decimal(2) / Decimal(3);
aa: org.apache.spark.sql.types.Decimal = 1
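For reference (this is plain java.math.BigDecimal, not Spark's Decimal), the same 2 / 3 computed with an explicit wide MathContext keeps its fractional digits instead of collapsing to 1:

import java.math.{BigDecimal => JBigDecimal, MathContext}

// 2 / 3 with 38 significant digits of precision, just as a reference value.
val ref = new JBigDecimal(2).divide(new JBigDecimal(3), new MathContext(38))
// ref = 0.66666666666666666666666666666666666667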
When a Decimal is defined as Decimal.Unlimited and the result inherits its scale from its parent, we hit a big accuracy issue, as shown in the example output above, once we go a couple of rounds of division over decimal data vs. double data. Below is a sample output from my run while testing my code change; as you can see, the decimal result is far off from its double counterpart. Since you have been fixing issues around Decimal, I would like to see if we can work out a more proper fix in this context. Is there a guideline about precision/scale settings for Decimal.Unlimited when it comes to the division operation?
10:27:46.042 WARN org.apache.spark.sql.catalyst.expressions.CombinePartialStdFunction: COMBINE STDDEV DOUBLE-------4.0 , 0.8VALUE
10:27:46.137 WARN org.apache.spark.sql.catalyst.expressions.CombinePartialStdFunction: COMBINE STDDEV DECIMAL-------4.29000 , 0.858VALUE
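As a rough, non-Spark sketch of the effect (plain java.math.BigDecimal, with arbitrary values and a 38-digit MathContext picked only for illustration), repeated division drifts completely when each quotient is rounded back to its parent's scale, but stays close to the double result when a wide context is used:

import java.math.{BigDecimal => JBigDecimal, MathContext, RoundingMode}

val values = Seq(2, 3, 7, 11).map(i => new JBigDecimal(i))

// Each quotient rounded back to the left operand's scale (0 here),
// mimicking the inherited-scale behavior shown above.
val inherited = values.reduce((a, b) => a.divide(b, a.scale, RoundingMode.HALF_UP))

// Each quotient computed with 38 significant digits.
val wide = values.reduce((a, b) => a.divide(b, new MathContext(38)))

println(inherited)  // 0 -- all fractional information is lost
println(wide)       // ~0.008658..., matching 2.0 / 3 / 7 / 11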