[GitHub] cloud-fan commented on a change in pull request #23388: [SPARK-26448][SQL] retain the difference between 0.0 and -0.0

GitBox Wed, 26 Dec 2018 22:55:03 -0800

cloud-fan commented on a change in pull request #23388: [SPARK-26448][SQL] 
retain the difference between 0.0 and -0.0
URL: https://github.com/apache/spark/pull/23388#discussion_r244092877

##########
File path: docs/sql-migration-guide-upgrade.md
##########
@@ -25,8 +25,6 @@ displayTitle: Spark SQL Upgrading Guide

- In Spark version 2.4 and earlier, `Dataset.groupByKey` results to a
grouped dataset with key attribute wrongly named as "value", if the key is
non-struct type, e.g. int, string, array, etc. This is counterintuitive and
makes the schema of aggregation queries weird. For example, the schema of
`ds.groupByKey(...).count()` is `(value, count)`. Since Spark 3.0, we name the
grouping attribute to "key". The old behaviour is preserved under a newly added
configuration `spark.sql.legacy.dataset.nameNonStructGroupingKeyAsValue` with a
default value of `false`.

- - In Spark version 2.4 and earlier, float/double -0.0 is semantically equal
to 0.0, but users can still distinguish them via `Dataset.show`,
`Dataset.collect` etc. Since Spark 3.0, float/double -0.0 is replaced by 0.0
internally, and users can't distinguish them any more.

Review comment:
Check out the [test case](https://github.com/apache/spark/pull/23388/files#diff-4c0b1f729d651b04f14e72260555f623R397):
"distinguish -0.0" is not about agg or join.
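
For readers less familiar with signed zeros, here is a minimal sketch of what "semantically equal but still distinguishable" means at the value level. It uses plain Python floats (IEEE 754 doubles) rather than Spark. The variable names are illustrative only, and it does not reproduce the linked test case or any Spark API.

```python
import math
import struct

neg_zero = -0.0
pos_zero = 0.0

# IEEE 754 comparison treats the two zeros as equal.
equal = neg_zero == pos_zero

# The sign bit still differs, so the two values can be told apart.
sign_neg = math.copysign(1.0, neg_zero)
sign_pos = math.copysign(1.0, pos_zero)

# The raw 64-bit encodings differ as well (big-endian hex).
bits_neg = struct.pack(">d", neg_zero).hex()
bits_pos = struct.pack(">d", pos_zero).hex()

# The textual form keeps the sign, which is how a displayed value
# can reveal -0.0 even though equality cannot.
text_neg = repr(neg_zero)
text_pos = repr(pos_zero)

print(equal, sign_neg, sign_pos, bits_neg, bits_pos, text_neg, text_pos)
```

The same distinction is what the removed migration note refers to: equality-based operations cannot separate the two zeros, while anything that prints or returns the raw value can.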

