cloud-fan commented on a change in pull request #23388: [SPARK-26448][SQL] 
retain the difference between 0.0 and -0.0
URL: https://github.com/apache/spark/pull/23388#discussion_r244092877
 
 

 ##########
 File path: docs/sql-migration-guide-upgrade.md
 ##########
 @@ -25,8 +25,6 @@ displayTitle: Spark SQL Upgrading Guide
 
   - In Spark version 2.4 and earlier, `Dataset.groupByKey` results to a 
grouped dataset with key attribute wrongly named as "value", if the key is 
non-struct type, e.g. int, string, array, etc. This is counterintuitive and 
makes the schema of aggregation queries weird. For example, the schema of 
`ds.groupByKey(...).count()` is `(value, count)`. Since Spark 3.0, we name the 
grouping attribute to "key". The old behaviour is preserved under a newly added 
configuration `spark.sql.legacy.dataset.nameNonStructGroupingKeyAsValue` with a 
default value of `false`.
 
-  - In Spark version 2.4 and earlier, float/double -0.0 is semantically equal 
to 0.0, but users can still distinguish them via `Dataset.show`, 
`Dataset.collect` etc. Since Spark 3.0, float/double -0.0 is replaced by 0.0 
internally, and users can't distinguish them any more.
 
 Review comment:
   checkout the [test 
case](https://github.com/apache/spark/pull/23388/files#diff-4c0b1f729d651b04f14e72260555f623R397),
 "distinguish -0.0" is not about agg or join.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to