GitHub user BryanCutler opened a pull request:
https://github.com/apache/spark/pull/20213
[SPARK-23018][PYTHON] Fix createDataFrame from Pandas timestamp series assignment
## What changes were proposed in this pull request?
This fixes createDataFrame from Pandas so that only modified timestamp
series are assigned back, and only to a copied version of the Pandas
DataFrame. Previously, if the Pandas DataFrame was only a reference (e.g. a
slice of another), each series was still assigned back to the reference, even
if it was not a modified timestamp column. This caused the following warning:
"SettingWithCopyWarning: A value is trying to be set on a copy of a slice from
a DataFrame."
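
The copy-then-assign pattern described above can be sketched in plain Pandas.
This is a minimal illustration, not the code in the patch: the helper name
`normalize_timestamps` and the `convert` callback are hypothetical, and it
assumes that a "modified" column is one holding timezone-aware timestamps.

```python
import pandas as pd


def normalize_timestamps(pdf, convert):
    """Hypothetical helper: convert timezone-aware timestamp columns.

    The input frame is copied at most once, and only when a column really
    needs to change. Unmodified columns are never assigned back, so a frame
    that is a slice of another frame is left untouched.
    """
    copied = False
    for name in pdf.columns:
        series = pdf[name]
        # Assumption for this sketch: only tz-aware columns get modified.
        if isinstance(series.dtype, pd.DatetimeTZDtype):
            if not copied:
                # Copy lazily, right before the first assignment.
                pdf = pdf.copy()
                copied = True
            pdf[name] = convert(series)
    return pdf
```

With this shape, a frame that has no timestamp columns is returned as the same
object, and a sliced frame with timestamp columns is only ever written to
through its copy, which is the situation the warning was complaining about.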
## How was this patch tested?
existing tests
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/BryanCutler/spark pyspark-createDataFrame-copy-slice-warn-SPARK-23018
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20213.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20213
----
commit bdeead620783df3d5b39897cba7001105b2816a7
Author: Bryan Cutler <cutlerb@...>
Date: 2018-01-09T23:51:25Z
Changed createDataFrame to only assign series if modified timestamp field
----