GitHub user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23054#discussion_r234475156
  
    --- Diff: docs/sql-migration-guide-upgrade.md ---
    @@ -17,6 +17,9 @@ displayTitle: Spark SQL Upgrading Guide
     
       - The `ADD JAR` command previously returned a result set with the single value 0. It now returns an empty result set.
     
    +  - In Spark version 2.4 and earlier, `Dataset.groupByKey` results to a grouped dataset with key attribute wrongly named as "value", if the key is atomic type, e.g. int, string, etc. This is counterintuitive and makes the schema of aggregation queries weird. For example, the schema of `ds.groupByKey(...).count()` is `(value, count)`. Since Spark 3.0, we name the grouping attribute to "key". The old behaviour is preserved under a newly added configuration `spark.sql.legacy.atomicKeyAttributeGroupByKey` with a default value of `false`.
    --- End diff --
    
    I realized that only struct-type keys have the `key` alias, so here we should say: `if the key is non-struct type, e.g. int, string, array, etc.`

