[GitHub] spark pull request #23054: [SPARK-26085][SQL] Key attribute of primitive typ...

viirya Sun, 18 Nov 2018 17:39:46 -0800

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23054#discussion_r234475488
  
    --- Diff: docs/sql-migration-guide-upgrade.md ---
    @@ -17,6 +17,9 @@ displayTitle: Spark SQL Upgrading Guide
     
       - The `ADD JAR` command previously returned a result set with the single 
value 0. It now returns an empty result set.
     
    +  - In Spark version 2.4 and earlier, `Dataset.groupByKey` results in a 
grouped dataset whose key attribute is wrongly named "value" if the key is of 
an atomic type, e.g. int, string, etc. This is counterintuitive and makes the 
schema of aggregation queries confusing. For example, the schema of 
`ds.groupByKey(...).count()` is `(value, count)`. Since Spark 3.0, the 
grouping attribute is named "key". The old behaviour is preserved under a newly 
added configuration `spark.sql.legacy.atomicKeyAttributeGroupByKey` with a 
default value of `false`.
    --- End diff --
    
    Ok. More accurate.
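
To illustrate the behaviour the migration note describes, here is a minimal sketch that groups a `Dataset[String]` by an atomic (`Int`) key and prints the resulting column names. It assumes a local `SparkSession` with the Spark SQL dependency on the classpath. The object name and sample data are hypothetical and not part of the pull request. Per the note, the first column would be expected to be named "value" on Spark 2.4 and earlier and "key" from Spark 3.0. The exact name of the count column may differ from the shorthand `(value, count)` used in the note, so the sketch prints the names rather than asserting them.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object; not taken from the pull request.
object GroupByKeySchemaSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is sufficient for this illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("groupByKey-key-attribute")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data.
    val ds = Seq("a", "bb", "cc").toDS()

    // Group by an atomic key: the length of each string (an Int).
    val counts = ds.groupByKey(_.length).count()

    // Print the column names of the aggregated result so the name of the
    // grouping attribute can be inspected on the Spark version in use.
    println(counts.columns.mkString(", "))

    spark.stop()
  }
}
```

The legacy flag quoted in the diff, `spark.sql.legacy.atomicKeyAttributeGroupByKey`, is the name proposed at this revision of the pull request. Check the released migration guide before relying on it.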


