Maciej Bryński created SPARK-22010:
--------------------------------------

             Summary: Slow fromInternal conversion for TimestampType
                 Key: SPARK-22010
                 URL: https://issues.apache.org/jira/browse/SPARK-22010
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.2.0
            Reporter: Maciej Bryński


To convert a TimestampType value to a Python object, we currently use 
`datetime.datetime.fromtimestamp(ts // 1000000).replace(microsecond=ts % 
1000000)`.

{code}
In [34]: %%timeit
    ...: datetime.datetime.fromtimestamp(1505383647).replace(microsecond=12344)
    ...:
4.2 µs ± 558 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
{code}

It's slow because:
# the local time zone (TZ) is looked up on every conversion
# the replace method creates a second datetime object on every conversion

Proposed solution: use a custom datetime conversion and move the TZ 
calculation to module level, so it runs once instead of on every conversion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
