GitHub user rberenguel opened a pull request:

    https://github.com/apache/spark/pull/18137

    [SPARK-20787][PYTHON] PySpark can't handle datetimes before 1900 

    `time.mktime` can't handle dates before 1900; according to the
documentation, this is by design. `calendar.timegm` is equivalent in the
shared cases and can handle those years.
    
    ## What changes were proposed in this pull request?
    
    Replace `time.mktime` with the more capable `calendar.timegm` to address
cases like:
    ```python
    import datetime as dt
    
sqlContext.createDataFrame(sc.parallelize([[dt.datetime(1899,12,31)]])).count()
    ```
    which fail during internal conversion when the time object carries no
timezone information. When timezone information is present, `calendar` was
already being used.
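    The difference between the two conversions can be sketched as follows.
This is an illustrative example, not code from the patch: `calendar.timegm`
interprets a `struct_time` as UTC and is not limited to the platform's
supported `mktime` range, so pre-1900 dates convert cleanly.
    ```python
    import calendar
    import datetime as dt

    # calendar.timegm maps a UTC struct_time to seconds since the Unix
    # epoch. Unlike time.mktime (which uses local time and may raise
    # OverflowError for pre-1900 dates on some platforms), it accepts
    # dates well before 1900.
    d = dt.datetime(1899, 12, 31)
    seconds = calendar.timegm(d.timetuple())
    print(seconds)  # -2209075200, i.e. one day before 1900-01-01 UTC
    ```
    Note that `timegm` assumes UTC while `mktime` assumes local time, which
is why the two are only equivalent in the shared (timezone-free) cases the
description mentions.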
    
    ## How was this patch tested?
    
    The existing test cases should cover this change, since it does not
alter any existing functionality.
    
    This PR is original work from me and I license this work to the Spark
project.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rberenguel/spark SPARK-20787-invalid-years

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18137.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18137
    
----
commit 6c0312f94e3fce2bf4d6a30055bd747be535bb0f
Author: Ruben Berenguel Montoro <[email protected]>
Date:   2017-05-29T15:46:21Z

    SPARK-20787 time.mktime can't handle dates before 1900, by
construction. calendar.timegm is equivalent in shared cases, but can handle
those

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
