Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11071#discussion_r51851099
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -55,10 +56,19 @@ object DateTimeUtils {
       // this is year -17999, calculation: 50 * daysIn400Year
       final val YearZero = -17999
       final val toYearZero = to2001 + 7304850
    -  final val TimeZoneGMT = TimeZone.getTimeZone("GMT")
     
       @transient lazy val defaultTimeZone = TimeZone.getDefault
     
    +  // Reuse the TimeZone object as it is expensive to create in each method call.
    +  final val timeZones = new ConcurrentHashMap[String, TimeZone]
    --- End diff --
    
    This map could grow quite large, because it is keyed by the raw ID string
    and that string varies. `ZoneInfoFile` already provides a cache for
    different `ID`s, so let's find out whether the boost you mentioned comes
    from reusing `TimeZone` instances or from reusing `Calendar` instances.
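
    To illustrate the concern about map growth, here is a minimal sketch. It
    assumes the cache is keyed by the raw ID string, as in the diff. The names
    `TimeZoneCacheSketch` and `cachedTimeZone` are hypothetical and are not
    part of the pull request.

```scala
import java.util.TimeZone
import java.util.concurrent.ConcurrentHashMap
import java.util.function.{Function => JFunction}

object TimeZoneCacheSketch {
  // Stand-in for the `timeZones` map added in the diff (assumption: keyed by raw ID).
  val timeZones = new ConcurrentHashMap[String, TimeZone]

  // Hypothetical lookup helper: caches whatever TimeZone.getTimeZone returns
  // under the exact string the caller passed in.
  def cachedTimeZone(id: String): TimeZone =
    timeZones.computeIfAbsent(id, new JFunction[String, TimeZone] {
      override def apply(key: String): TimeZone = TimeZone.getTimeZone(key)
    })

  def main(args: Array[String]): Unit = {
    // Three spellings of the same custom offset, plus an ID the JDK cannot parse
    // (TimeZone.getTimeZone falls back to GMT for unknown IDs).
    val ids = Seq("GMT+1", "GMT+01:00", "GMT+1:00", "not-a-zone")
    ids.foreach(cachedTimeZone)

    val normalized = ids.map(id => cachedTimeZone(id).getID).distinct
    println(s"map entries: ${timeZones.size}")
    println(s"distinct normalized IDs: ${normalized.mkString(", ")}")
  }
}
```

    Each distinct input string gets its own entry, even when several strings
    normalize to the same zone or fall back to GMT, so user-supplied IDs can
    grow the map without bound.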

