[GitHub] spark pull request #17933: [SPARK-20588][SQL] Cache TimeZone instances per t...

ueshin Wed, 10 May 2017 00:05:52 -0700

GitHub user ueshin opened a pull request:

    https://github.com/apache/spark/pull/17933


    [SPARK-20588][SQL] Cache TimeZone instances per thread.

    ## What changes were proposed in this pull request?
    
    Because the method `TimeZone.getTimeZone(String ID)` is synchronized on the
    TimeZone class, concurrent calls to this method become a bottleneck.
    This happens especially when casting a string value that contains timezone
    info to a timestamp value, which uses `DateTimeUtils.stringToTimestamp()` and
    obtains a TimeZone instance at the call site.
    
    This PR caches the generated TimeZone instances per thread to avoid the
    synchronization.
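
For illustration only, the per-thread caching idea could be sketched in Java roughly as follows. This is not the code from this patch: the class name `ThreadLocalTimeZoneCache` and its `getTimeZone` wrapper are hypothetical, and the sketch assumes callers never mutate the returned instances.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TimeZone;

// Hypothetical helper, not part of Spark: caches TimeZone instances per thread.
final class ThreadLocalTimeZoneCache {
    // Each thread gets its own map, so cache lookups need no locking.
    private static final ThreadLocal<Map<String, TimeZone>> CACHE =
        ThreadLocal.withInitial(HashMap::new);

    private ThreadLocalTimeZoneCache() {}

    // Calls the synchronized TimeZone.getTimeZone(String) only on the first
    // lookup of an ID in the current thread; later lookups hit the map.
    // Assumption: callers treat the returned TimeZone as read-only, because
    // the same instance is handed out again on every later call.
    static TimeZone getTimeZone(String id) {
        return CACHE.get().computeIfAbsent(id, key -> TimeZone.getTimeZone(key));
    }
}
```

Repeated calls such as `ThreadLocalTimeZoneCache.getTimeZone("UTC")` on one thread return the same object, whereas `TimeZone.getTimeZone` hands back a fresh copy on each call. The trade-off is one small map per thread in exchange for avoiding the class-level lock on the hot path.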
    
    ## How was this patch tested?
    
    Existing tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ueshin/apache-spark issues/SPARK-20588

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17933.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17933
    
----
commit de79e50779c0f2e17ea26301ac7d1216b37331c9
Author: Takuya UESHIN <[email protected]>
Date:   2017-05-10T05:55:53Z

    Cache TimeZone instances per thread.

----



