GitHub user BryanCutler opened a pull request: https://github.com/apache/spark/pull/20280
[SPARK-22232][PYTHON][SQL] Fixed Row pickling to include __from_dict__ flag

## What changes were proposed in this pull request?

When a `Row` object is created using kwargs, the order of the keywords cannot be relied upon (except on Python 3.6 and later, where keyword arguments preserve their order). The fields are sorted in the constructor, and a flag `__from_dict__` is set to indicate that the object was created from kwargs, so that other areas in Spark can access row data by field name instead of by position. This change includes the `__from_dict__` flag only when pickling a `Row` that was made from kwargs, so that this behavior is preserved after the `Row` has been pickled and unpickled.

## How was this patch tested?

Fixed existing tests that relied on fields and schema being in the same alphabetical order. Added a new test that creates a `Row` from positional arguments, where order matters.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/BryanCutler/spark pyspark-Row-serialize-SPARK-22232

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20280.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20280
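
For readers who want to see the idea in the PR description in isolation, here is a minimal sketch that runs without Spark. It is not the PySpark implementation: `SketchRow`, `_restore_row`, and `values_in_schema_order` are hypothetical names invented for this illustration, and the only behavior taken from the description is that kwargs fields are sorted, a `__from_dict__` flag is set, and the flag is carried through pickling only for rows made from kwargs.

```python
import pickle


def _restore_row(values, fields, from_dict):
    """Rebuild a SketchRow from the state produced by __reduce__ (hypothetical helper)."""
    row = tuple.__new__(SketchRow, values)
    if fields is not None:
        row.__fields__ = fields
    if from_dict:
        # Only rows originally built from kwargs get the flag back.
        row.__from_dict__ = True
    return row


class SketchRow(tuple):
    """Toy stand-in for a Row; an assumption for illustration, not Spark's class."""

    def __new__(cls, *args, **kwargs):
        if args and kwargs:
            raise ValueError("use positional or keyword arguments, not both")
        if kwargs:
            # Keyword order is not relied upon, so sort the field names.
            names = sorted(kwargs)
            row = tuple.__new__(cls, [kwargs[n] for n in names])
            row.__fields__ = names
            row.__from_dict__ = True
            return row
        # Positional rows keep the caller's order and carry no flag.
        return tuple.__new__(cls, args)

    def __reduce__(self):
        # Include the flag in the pickled state so it survives a round trip.
        fields = getattr(self, "__fields__", None)
        from_dict = getattr(self, "__from_dict__", False)
        return (_restore_row, (tuple(self), fields, from_dict))


def values_in_schema_order(row, schema_names):
    """Read by field name when the row came from kwargs, else by position."""
    if getattr(row, "__from_dict__", False):
        by_name = dict(zip(row.__fields__, row))
        return [by_name[name] for name in schema_names]
    return list(row)


if __name__ == "__main__":
    row = SketchRow(name="Alice", age=30)
    restored = pickle.loads(pickle.dumps(row))
    print(tuple(restored), restored.__fields__)
    print(values_in_schema_order(restored, ["name", "age"]))
```

In this sketch a row built from kwargs is stored in sorted field order, so a consumer whose schema lists the fields in a different order has to look values up by name. Carrying the flag in the pickled state is what lets that consumer make the same decision after the row has been serialized.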