GitHub user gberger opened a pull request:
https://github.com/apache/spark/pull/19792
[SPARK-22566][PYTHON] Better error message for `_merge_type` in Pandas to
Spark DF conversion
## What changes were proposed in this pull request?
This change gives a clearer error message when
`spark_session.createDataFrame(pandas_df)` is called without a schema and
schema inference fails because of incompatible types.
The Pandas column names are propagated down to `_merge_type`, so the error
message names the column in which the merge failed.
https://issues.apache.org/jira/browse/SPARK-22566
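
To illustrate the idea, here is a minimal pure-Python sketch of propagating a column name into a type-merge error. It is not Spark's actual `_merge_type`; the functions `merge_type` and `infer_schema`, their signatures, and the message wording are all hypothetical and exist only for this example.

```python
# Simplified illustration only; NOT Spark's actual `_merge_type`.
# `merge_type` and `infer_schema` are hypothetical names for this sketch.

def merge_type(a, b, name=None):
    """Merge two inferred types; None stands for a null/unknown type."""
    if a is None:
        return b
    if b is None:
        return a
    if a is not b:
        # Prefix the message with the column name when one was passed down.
        prefix = "" if name is None else "field %s: " % name
        raise TypeError(
            "%sCan not merge type %s and %s" % (prefix, a.__name__, b.__name__)
        )
    return a


def infer_schema(rows, names):
    """Infer one type per column, passing the column name to merge_type."""
    schema = {}
    for row in rows:
        for name, value in zip(names, row):
            value_type = None if value is None else type(value)
            schema[name] = merge_type(schema.get(name), value_type, name=name)
    return schema


if __name__ == "__main__":
    try:
        # The "label" column mixes str and int, so the merge fails there.
        infer_schema([(1, "a"), (2, 3)], ["id", "label"])
    except TypeError as e:
        print(e)  # field label: Can not merge type str and int
```

Without the `name` argument the caller would only learn that two types conflict. With it, the message points at the offending column.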
## How was this patch tested?
Manually in the `./bin/pyspark` console, and with `./dev/run-tests`.
<img width="873" alt="screen shot 2017-11-21 at 13 29 49"
src="https://user-images.githubusercontent.com/3977115/33080121-382274e0-cecf-11e7-808f-057a65bb7b00.png">
I state that the contribution is my original work and that I license the
work to the Apache Spark project under the project's open source license.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gberger/spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19792.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19792
----
commit 518fdd4f3d0e968cef2e3ba1b0220daee5ee7778
Author: Guilherme Berger <[email protected]>
Date: 2017-11-21T15:06:25Z
[SPARK-22566][PYTHON] Better error message for `_merge_type` in Pandas to
Spark DF conversion
----