[ https://issues.apache.org/jira/browse/SPARK-29123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933128#comment-16933128 ]
Marco Gaido commented on SPARK-29123:
-------------------------------------

You can set {{spark.sql.decimalOperations.allowPrecisionLoss}} to {{false}} if you do not want to risk truncations in your operations. Otherwise, properly tuning the precision and scale of your input schema helps too.

> DecimalType multiplication precision loss
> ------------------------------------------
>
>                 Key: SPARK-29123
>                 URL: https://issues.apache.org/jira/browse/SPARK-29123
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.4.3
>            Reporter: Benny Lu
>            Priority: Major
>
> When doing multiplication with PySpark, it seems PySpark is losing precision.
> For example, when multiplying two decimals with precision 38,10, it returns
> 38,6 instead of 38,10. It also truncates the result to three decimals, which is
> incorrect.
> {code:java}
> from decimal import Decimal
> from pyspark.sql.types import DecimalType, StructType, StructField
> schema = StructType([StructField("amount", DecimalType(38,10)),
>                      StructField("fx", DecimalType(38,10))])
> df = spark.createDataFrame([(Decimal(233.00), Decimal(1.1403218880))],
>                            schema=schema)
> df.printSchema()
> df = df.withColumn("amount_usd", df.amount * df.fx)
> df.printSchema()
> df.show()
> {code}
> Result
> {code:java}
> >>> df.printSchema()
> root
>  |-- amount: decimal(38,10) (nullable = true)
>  |-- fx: decimal(38,10) (nullable = true)
>  |-- amount_usd: decimal(38,6) (nullable = true)
> >>> df = df.withColumn("amount_usd", df.amount * df.fx)
> >>> df.printSchema()
> root
>  |-- amount: decimal(38,10) (nullable = true)
>  |-- fx: decimal(38,10) (nullable = true)
>  |-- amount_usd: decimal(38,6) (nullable = true)
> >>> df.show()
> +--------------+------------+----------+
> |        amount|          fx|amount_usd|
> +--------------+------------+----------+
> |233.0000000000|1.1403218880|265.695000|
> +--------------+------------+----------+
> {code}
> When rounding to two decimals, it returns 265.70, but the correct result
> should be 265.69499, and when rounded to two decimals, it should be 265.69.
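
For reference, the decimal(38,6) result type and the 265.695000 value can be reproduced without a cluster. The following is a minimal sketch in plain Python of the result-type rule for decimal multiplication as I understand it from Spark 2.3 and later; it is not Spark's code. The names {{multiply_result_type}} and {{round_to_scale}} are hypothetical helpers, the constants 38 and 6 are assumed to match Spark's maximum precision and minimum adjusted scale, and negative scales are not handled.

```python
from decimal import Decimal, ROUND_HALF_UP

MAX_PRECISION = 38          # assumed: largest precision of a Spark DecimalType
MINIMUM_ADJUSTED_SCALE = 6  # assumed: scale Spark tries to keep when it must drop digits


def multiply_result_type(p1, s1, p2, s2, allow_precision_loss=True):
    """Return (precision, scale) for decimal(p1,s1) * decimal(p2,s2).

    Hypothetical helper mirroring the rule as understood here;
    it assumes non-negative scales.
    """
    precision = p1 + p2 + 1
    scale = s1 + s2
    if precision <= MAX_PRECISION:
        # The exact result type fits, so nothing is dropped.
        return precision, scale
    if not allow_precision_loss:
        # Cap precision and scale at the maximum; values that do not
        # fit are expected to become NULL rather than be rounded.
        return MAX_PRECISION, min(scale, MAX_PRECISION)
    # Keep all integer digits and give up fractional digits,
    # but never go below the minimum adjusted scale.
    int_digits = precision - scale
    min_scale = min(scale, MINIMUM_ADJUSTED_SCALE)
    adjusted_scale = max(MAX_PRECISION - int_digits, min_scale)
    return MAX_PRECISION, adjusted_scale


def round_to_scale(value, scale):
    """Round a Decimal half-up to the given number of fractional digits."""
    return value.quantize(Decimal(1).scaleb(-scale), rounding=ROUND_HALF_UP)


# Values from the report, built from strings to avoid float artifacts.
exact = Decimal("233.0000000000") * Decimal("1.1403218880")

default_type = multiply_result_type(38, 10, 38, 10)
strict_type = multiply_result_type(38, 10, 38, 10, allow_precision_loss=False)
tuned_type = multiply_result_type(18, 10, 18, 10)

print(default_type, round_to_scale(exact, default_type[1]))
print(strict_type, round_to_scale(exact, strict_type[1]))
print(tuned_type, round_to_scale(exact, tuned_type[1]))
```

Under these assumptions the product is rounded (not truncated) to six fractional digits before any later rounding to two, which is why a second rounding lands on 265.70 rather than 265.69. In a PySpark session the setting mentioned in the comment can be applied with spark.conf.set("spark.sql.decimalOperations.allowPrecisionLoss", "false"); declaring narrower inputs such as DecimalType(18,10) is the schema-tuning alternative.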