GitHub user rberenguel opened a pull request:
https://github.com/apache/spark/pull/18137
[SPARK-20787][PYTHON] PySpark can't handle datetimes before 1900
`time.mktime` can't handle dates before 1900; according to the
documentation this is by design. `calendar.timegm` is equivalent in the shared
cases, but can handle those years.
## What changes were proposed in this pull request?
Change `time.mktime` to the more capable `calendar.timegm` to address cases
like:
```python
import datetime as dt
sqlContext.createDataFrame(sc.parallelize([[dt.datetime(1899,12,31)]])).count()
```
which fails due to an internal conversion error when the time object carries
no timezone information. When timezone information is present, `calendar`
was already being used.
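A minimal standalone illustration of the difference (plain Python, no Spark required; the `try`/`except` is needed because `time.mktime`'s behavior for pre-1900 dates is platform-dependent):

```python
import calendar
import datetime as dt
import time

d = dt.datetime(1899, 12, 31)

# calendar.timegm interprets the struct_time as UTC and has no
# platform-imposed lower bound, so pre-1900 dates convert cleanly
# to (negative) epoch seconds.
seconds = calendar.timegm(d.timetuple())
print(seconds)  # -2209075200, i.e. 1899-12-31T00:00:00Z

# time.mktime delegates to the platform C library, which may reject
# dates before 1900 (or before the epoch) with an OverflowError or
# ValueError.
try:
    time.mktime(d.timetuple())
except (OverflowError, ValueError):
    print("mktime rejected the pre-1900 date on this platform")
```

Note that `calendar.timegm` only reads the first six fields of the tuple (year through second), so the weekday/yearday fields from `timetuple()` are ignored.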
## How was this patch tested?
The existing test cases should cover this change, since it should not
change any existing functionality.
This PR is original work from me and I license this work to the Spark
project.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rberenguel/spark SPARK-20787-invalid-years
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/18137.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #18137
----
commit 6c0312f94e3fce2bf4d6a30055bd747be535bb0f
Author: Ruben Berenguel Montoro <[email protected]>
Date: 2017-05-29T15:46:21Z
SPARK-20787 time.mktime can't handle dates from 1899-100, by
construction. calendar.timegm is equivalent in shared cases, but can handle
those
----
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]