Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/19646
@ueshin and @HyukjinKwon this allows Spark to read non-Arrow pandas
timestamps as TimestampType instead of long values, but there are a couple
of things to note. I did the conversion with numpy because we cannot modify
the input pandas.DataFrame and making a copy is too expensive. When
`to_records()` is called, the pdf is converted to numpy records, and that is
where the check/conversion is done. For date columns, if the column has a
dtype of `datetime64[D]` or holds datetime objects, then Spark correctly
interprets it as DateType. Please take a look when you can, thanks!
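To illustrate the idea, here is a minimal sketch (not the PR's actual code) of checking and converting timestamp fields on the `to_records()` output rather than on the DataFrame itself; the toy frame, column names, and the `converted` dict are all made up for the example:

```python
import datetime
import numpy as np
import pandas as pd

# A hypothetical input frame with a timestamp column and a date column.
pdf = pd.DataFrame({
    "ts": pd.to_datetime(["2017-11-01 12:00:00", "2017-11-02 13:30:00"]),
    "d": [datetime.date(2017, 11, 1), datetime.date(2017, 11, 2)],
})

# to_records() produces a numpy record array; the caller's DataFrame is
# left untouched, so no defensive copy of the whole frame is needed.
records = pdf.to_records(index=False)

# Sketch of the check/conversion step: any datetime64[ns] field is turned
# into microseconds since the epoch (the internal representation of
# Spark's TimestampType); other fields pass through unchanged.
converted = {}
for name in records.dtype.names:
    field = records[name]
    if field.dtype == np.dtype("datetime64[ns]"):
        converted[name] = field.astype("datetime64[us]").astype("int64")
    else:
        converted[name] = field
```

Because the conversion reads fields out of the record array and writes the results elsewhere, the caller's DataFrame keeps its original `datetime64[ns]` dtype throughout.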
cc @cloud-fan