[GitHub] [spark] HyukjinKwon commented on a change in pull request #26496: [SPARK-29748][PYTHON][SQL] Remove Row field sorting in PySpark for version 3.6+

GitBox Thu, 13 Feb 2020 17:52:12 -0800

HyukjinKwon commented on a change in pull request #26496: 
[SPARK-29748][PYTHON][SQL] Remove Row field sorting in PySpark for version 3.6+
URL: https://github.com/apache/spark/pull/26496#discussion_r379215351


 ##########
 File path: docs/pyspark-migration-guide.md
 ##########
 @@ -87,6 +87,8 @@ Please refer [Migration Guide: SQL, Datasets and DataFrame](sql-migration-guide.
   - Since Spark 3.0, `Column.getItem` is fixed such that it does not call `Column.apply`. Consequently, if `Column` is used as an argument to `getItem`, the indexing operator should be used.
     For example, `map_col.getItem(col('id'))` should be replaced with `map_col[col('id')]`.
 
+  - As of Spark 3.0 `Row` field names are no longer sorted alphabetically when constructing with named arguments for Python versions 3.6 and above, and the order of fields will match that as entered. To enable sorted fields by default, as in Spark 2.4, set the environment variable `PYSPARK_ROW_FIELD_SORTING_ENABLED` to "true". For Python versions less than 3.6, the field names will be sorted alphabetically as the only option.
 
 Review comment:
   +1. Let me fix it.
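
For readers of the migration note quoted above, the ordering change can be sketched in plain Python without a Spark installation. This is only an illustration of the described behavior, not PySpark's implementation: `row_field_names` and its `sorting_enabled` parameter are hypothetical names made up for this sketch. Only the environment variable `PYSPARK_ROW_FIELD_SORTING_ENABLED` comes from the note itself.

```python
import os


def row_field_names(sorting_enabled=None, **fields):
    """Hypothetical stand-in for how Row(**fields) orders its field names.

    This is NOT PySpark code; it only mimics the behavior the migration
    note describes.
    """
    if sorting_enabled is None:
        # Assumption for this sketch: fall back to the environment variable
        # named in the migration note, treating "true" as enabled.
        flag = os.environ.get("PYSPARK_ROW_FIELD_SORTING_ENABLED", "false")
        sorting_enabled = flag.lower() == "true"

    # Python 3.6+ preserves the call order of keyword arguments (PEP 468).
    names = list(fields)

    # Spark 2.4 style: alphabetical. Spark 3.0 style: order as entered.
    return sorted(names) if sorting_enabled else names


# Order as entered (the Spark 3.0 default described in the note).
spark3_order = row_field_names(sorting_enabled=False, name="Alice", age=11)

# Alphabetical order (the Spark 2.4 behavior, or sorting re-enabled).
spark24_order = row_field_names(sorting_enabled=True, name="Alice", age=11)

print(spark3_order)   # ['name', 'age']
print(spark24_order)  # ['age', 'name']
```

Code that reads `Row` values by position rather than by name is where the difference would show up, since the same named arguments produce a different field order under the two settings.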

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

