GitHub user BryanCutler opened a pull request:
https://github.com/apache/spark/pull/19738
[SPARK-20791][PYTHON][FOLLOWUP] Check for unicode column names in createDataFrame with Arrow
## What changes were proposed in this pull request?
If the schema is passed as a list of unicode strings for column names, the
names should be re-encoded to 'utf-8' so that they are handled consistently.
This is similar to #13097, but for creating a DataFrame using Arrow.
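
As a rough illustration of the re-encoding described above, the following is a minimal sketch and not the code from this patch. The helper name `normalize_column_names` is hypothetical. It assumes the schema may be either a list/tuple of column names or some other object (for example a DDL string) that should pass through untouched. On Python 3 every `str` is already unicode, so the helper only copies the names; the 'utf-8' encoding step applies on Python 2 only.

```python
import sys


def normalize_column_names(schema):
    """Return column names as native `str` objects (hypothetical helper).

    Anything that is not a list or tuple is returned unchanged.
    """
    if not isinstance(schema, (list, tuple)):
        return schema
    if sys.version_info[0] >= 3:
        # Python 3: str is already unicode, nothing to re-encode.
        return list(schema)
    # Python 2: `unicode` objects are not `str`, so encode them to utf-8
    # byte strings; plain `str` names are kept as they are.
    return [x.encode('utf-8') if not isinstance(x, str) else x for x in schema]


names = normalize_column_names([u"a", u"b"])
print(names)
```

With names normalized this way, the same list can be handed to either the Arrow or the non-Arrow code path without one of them seeing a different string type.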
## How was this patch tested?
Added a new test that uses unicode column names for the schema.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/BryanCutler/spark arrow-createDataFrame-followup-unicode-SPARK-20791
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19738.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19738
----
commit 1be220036cc405eaef5acb77802a15bceb81c314
Author: Bryan Cutler <[email protected]>
Date: 2017-11-13T19:11:19Z
moved re-encoding for unicode schema names to cover createDataFrame with
Arrow also
----