Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/19646
@ueshin and @HyukjinKwon this allows Spark to read non-Arrow pandas
timestamps as TimestampType instead of long values, but there are a couple
of things to note. I did the conversion with numpy because we cannot modify
the input pandas.DataFrame and making a copy is too expensive. When
`to_records()` is called, the pdf is converted to numpy records, and that is
where the check/conversion is done. For date columns, if the column has a
dtype of `datetime64[D]` or holds datetime objects, then Spark correctly
interprets it as DateType. Please take a look when you can, thanks!
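To illustrate the idea, here is a minimal sketch (not the PR's actual code) of checking and converting timestamp fields on the `to_records()` output rather than on the DataFrame itself; the toy frame, column names, and the `converted` dict are all made up for the example:

```python
import datetime
import numpy as np
import pandas as pd

# A hypothetical input frame with a timestamp column and a date column.
pdf = pd.DataFrame({
    "ts": pd.to_datetime(["2017-11-01 12:00:00", "2017-11-02 13:30:00"]),
    "d": [datetime.date(2017, 11, 1), datetime.date(2017, 11, 2)],
})

# to_records() produces a numpy record array; the caller's DataFrame is
# left untouched, so no defensive copy of the whole frame is needed.
records = pdf.to_records(index=False)

# Sketch of the check/conversion step: any datetime64[ns] field is turned
# into microseconds since the epoch (the internal representation of
# Spark's TimestampType); other fields pass through unchanged.
converted = {}
for name in records.dtype.names:
    field = records[name]
    if field.dtype == np.dtype("datetime64[ns]"):
        converted[name] = field.astype("datetime64[us]").astype("int64")
    else:
        converted[name] = field
```

Because the conversion reads fields out of the record array and writes the results elsewhere, the caller's DataFrame keeps its original `datetime64[ns]` dtype throughout.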
cc @cloud-fan