GitHub user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20213#discussion_r160588222
--- Diff: python/pyspark/sql/session.py ---
@@ -459,21 +459,23 @@ def _convert_from_pandas(self, pdf, schema, timezone):
# TODO: handle nested timestamps, such as ArrayType(TimestampType())?
if isinstance(field.dataType, TimestampType):
s = _check_series_convert_timestamps_tz_local(pdf[field.name], timezone)
- if not copied and s is not pdf[field.name]:
- # Copy once if the series is modified to prevent the original Pandas
- # DataFrame from being updated
- pdf = pdf.copy()
- copied = True
- pdf[field.name] = s
+ if s is not pdf[field.name]:
+ if not copied:
--- End diff --
BTW, what's the difference between:
```
if s is not pdf[field.name]:
    if not copied:
```
vs
```
if not copied and s is not pdf[field.name]:
```
?
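
To make the question concrete, here is a minimal sketch of where the two forms could differ. Taken alone, the two guards protect the same copy-once block, so any difference would come from statements placed under the outer `if` but outside the inner one. The diff above is cut off after `if not copied:`, so the placement of the assignment in the nested version is an assumption, not something the excerpt confirms. The boolean `changed` stands in for `s is not pdf[field.name]`, and the counters and function names are hypothetical.

```python
def combined(changed_flags):
    """Old form: one combined guard; the assignment sits outside it."""
    copied = False
    copies = 0
    assignments = 0
    for changed in changed_flags:
        if not copied and changed:
            copies += 1          # stands in for pdf = pdf.copy()
            copied = True
        assignments += 1         # stands in for pdf[field.name] = s
    return copies, assignments


def nested(changed_flags):
    """Nested form, ASSUMING the assignment moves under the outer guard."""
    copied = False
    copies = 0
    assignments = 0
    for changed in changed_flags:
        if changed:
            if not copied:
                copies += 1      # still copies at most once
                copied = True
            assignments += 1     # only runs when the series changed
    return copies, assignments


flags = [False, True, False, True]
print(combined(flags))  # (1, 4)
print(nested(flags))    # (1, 2)
```

Under that assumption both forms copy exactly once, and the nested form only skips the assignment for columns whose series came back unchanged.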
---