[ 
https://issues.apache.org/jira/browse/SPARK-24362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488464#comment-16488464
 ] 

Yuming Wang commented on SPARK-24362:
-------------------------------------

The result differs between *{{SortMergeJoin}}* and *{{BroadcastHashJoin}}*:
{code}
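test("SPARK-24362") {
  val df = spark.range(6).toDF("c1")
  // Broadcast joins disabled (threshold -1): the join runs as a SortMergeJoin.
  withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
    df.join(df, "c1").selectExpr("sum(cast(9.99 as double)) as smj").show()
  }

  // Threshold above the table size: the join runs as a BroadcastHashJoin.
  withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "100000") {
    df.join(df, "c1").selectExpr("sum(cast(9.99 as double)) as bhj").show()
  }
}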
{code}
Results:
{noformat}
+------------------+
|               smj|
+------------------+
|59.940000000000005|
+------------------+

+-----+
|  bhj|
+-----+
|59.94|
+-----+
{noformat}
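
The two sums differ by one unit in the last place, which is consistent with double addition not being associative: adding the same six values in a different order or grouping can round differently. The plain-Scala sketch below reproduces both values without Spark. The names {{SumOrderDemo}}, {{sequentialSum}} and {{chunkedSum}} are made up for illustration, and the chunking is only an assumption standing in for per-partition partial sums; it is not a trace of what either join operator actually does.

```scala
// Illustrative only: shows that the grouping of double additions changes the result.
object SumOrderDemo {
  // Six copies of 9.99, matching the six joined rows in the test above.
  val values: Seq[Double] = Seq.fill(6)(9.99)

  // Add the values one by one, left to right, into a single accumulator.
  def sequentialSum(xs: Seq[Double]): Double =
    xs.foldLeft(0.0)(_ + _)

  // Sum fixed-size chunks first, then add the partial sums together.
  // (Hypothetical stand-in for partial aggregation followed by a merge.)
  def chunkedSum(xs: Seq[Double], chunkSize: Int): Double =
    xs.grouped(chunkSize).map(sequentialSum).foldLeft(0.0)(_ + _)
}

// SumOrderDemo.sequentialSum(SumOrderDemo.values)   // 59.940000000000005
// SumOrderDemo.chunkedSum(SumOrderDemo.values, 2)   // 59.94
```

Both calls add exactly the same six doubles; only the grouping of the additions differs, so the discrepancy comes from rounding of intermediate results rather than from the input.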

> SUM function precision issue
> ----------------------------
>
>                 Key: SPARK-24362
>                 URL: https://issues.apache.org/jira/browse/SPARK-24362
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> How to reproduce:
> {noformat}
> bin/spark-shell --conf spark.sql.autoBroadcastJoinThreshold=-1
> scala> val df = spark.range(6).toDF("c1")
> df: org.apache.spark.sql.DataFrame = [c1: bigint]
> scala> df.join(df, "c1").selectExpr("sum(cast(9.99 as double))").show()
> +-------------------------+
> |sum(CAST(9.99 AS DOUBLE))|
> +-------------------------+
> |       59.940000000000005|
> +-------------------------+
> {noformat}
>  
> More links:
> [https://stackoverflow.com/questions/42158844/about-a-loss-of-precision-when-calculating-an-aggregate-sum-with-data-frames]
> [https://stackoverflow.com/questions/44134497/spark-sql-sum-function-issues-on-double-value]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

