[ 
https://issues.apache.org/jira/browse/LOG4J2-3672?focusedWorklogId=884445&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-884445
 ]

ASF GitHub Bot logged work on LOG4J2-3672:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Oct/23 09:26
            Start Date: 11/Oct/23 09:26
    Worklog Time Spent: 10m 
      Work Description: tristantarrant commented on PR #1848:
URL: https://github.com/apache/logging-log4j2/pull/1848#issuecomment-1757250950

   Hi @vy, I agree we need a better solution.
   
   - avoid constructing that ridiculous regex with all the timezone names 
anyway. A simple `[A-Za-z0-9\+\-/\s]+` should be enough
   - if the user uses a `GMT-xx` or RFC822-style timezone, use the quick path
   - if the user uses a _primary_ timezone name (e.g. `Europe/Rome`), use 
`TimeZone.getTimeZone()`
   - otherwise fall back to obtaining the full list of zone names and aliases. 
Unfortunately there is no other way to look up timezones by alias... By using 
the approach in my PR we can save ~700K of heap compared with 
`DateFormatSymbols.getInstance(locale).getZoneStrings();` (as you can see from 
the table I've put in the Jira)
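
   A rough sketch of the lookup order in the list above, to make the tiers 
concrete. This is not the code from the PR: the class name 
`TimeZoneLookupSketch`, the exact shapes accepted for RFC822 and `GMT`-style 
offsets, and the slow path built on `TimeZone.getAvailableIDs()` plus 
`TimeZone.getDisplayName()` are all assumptions made for illustration.

```java
import java.util.Locale;
import java.util.TimeZone;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Log4j or of PR #1848.
final class TimeZoneLookupSketch {

    // Permissive token pattern from the list above; replaces the alternation of all zone names.
    static final Pattern ZONE_TOKEN = Pattern.compile("[A-Za-z0-9\\+\\-/\\s]+");
    // Assumed shape of an RFC822 numeric offset, e.g. +0100.
    static final Pattern RFC822 = Pattern.compile("[+-]\\d{4}");
    // Assumed shape of a GMT-style offset, e.g. GMT-5 or GMT+0100.
    static final Pattern GMT_STYLE = Pattern.compile("GMT[+-](\\d{1,2}|\\d{4})");

    static TimeZone resolve(String text, Locale locale) {
        String token = text.trim();
        if (!ZONE_TOKEN.matcher(token).matches()) {
            throw new IllegalArgumentException("Not a timezone token: " + text);
        }

        // 1. Quick path: numeric offsets map onto custom "GMT+hhmm" IDs, no name tables needed.
        //    Out-of-range offsets come back as plain GMT, as TimeZone.getTimeZone() does.
        if (RFC822.matcher(token).matches()) {
            return TimeZone.getTimeZone("GMT" + token);
        }
        if (GMT_STYLE.matcher(token).matches()) {
            return TimeZone.getTimeZone(token);
        }

        // 2. Primary zone ID: getTimeZone() silently returns GMT for unknown IDs,
        //    so the result is accepted only when its ID matches the input.
        TimeZone byId = TimeZone.getTimeZone(token);
        if (byId.getID().equals(token)) {
            return byId;
        }

        // 3. Slow path: scan every zone's display names. This is the expensive step
        //    that loads data for all zones, so it runs only as a last resort.
        for (String id : TimeZone.getAvailableIDs()) {
            TimeZone tz = TimeZone.getTimeZone(id);
            for (boolean daylight : new boolean[] {false, true}) {
                for (int style : new int[] {TimeZone.SHORT, TimeZone.LONG}) {
                    if (token.equalsIgnoreCase(tz.getDisplayName(daylight, style, locale))) {
                        return tz;
                    }
                }
            }
        }
        throw new IllegalArgumentException("Unknown timezone: " + text);
    }
}
```

   With this ordering, inputs such as `+0100`, `GMT-5` or `Europe/Rome` return 
before the loop over all zone IDs is ever entered; only display names and 
aliases that are not zone IDs themselves pay for the full scan.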
   
   Unfortunately this does not solve the bigger issue: if someone uses 
`SimpleDateFormat` in their code, they are going to trigger the full JDK 
caching of timezones, rendering our fix useless anyway.
   
   A possibility, aside from fixing this in the JDK in some way, is to install 
an alternate `ZoneRulesProvider` using the 
`java.time.zone.DefaultZoneRulesProvider` system property. Such a provider 
would avoid caching the full TZ dataset. 




Issue Time Tracking
-------------------

    Worklog Id:     (was: 884445)
    Time Spent: 0.5h  (was: 20m)

> Avoid invoking DateFormatSymbols.getZoneStrings() in FastDateParser
> -------------------------------------------------------------------
>
>                 Key: LOG4J2-3672
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-3672
>             Project: Log4j 2
>          Issue Type: Bug
>            Reporter: Tristan Tarrant
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {{FastDateParser}} uses {{DateFormatSymbols.getZoneStrings()}} to construct a 
> table of all possible timezone names to be used in parsing date patterns in 
> pattern layouts.
> Unfortunately the above call (and the equivalent call used by the JDK's 
> {{SimpleDateFormat}}) causes initialization and caching of all timezones, 
> resulting in a ~3MB heap overhead on x86_64. The following table summarizes 
> the cost of triggering the caching of all timezones, including the number of 
> instances of some related types and the amount of extra heap required.
>  
> || ||LocalDateTime||LocalDate||ZoneInfo||ZoneOffset||Heap delta||
> |Baseline (no TZ calls)|180|0|0| | |
> |Single timezone|180|0|0|0|298|
> |DateFormatSymbols.getZoneStrings()|57076|32212|602|1455|3760106|
> |TimeZone.getAvailableIDs() + TimeZone.getName()|36678|21674|632|1155|3024946|
> |TimeZone.getAvailableIDs()|180|0|632|0|452578|
> By avoiding construction of such tables, and relying only on 
> {{FastDateParser}}'s support for RFC-822 and GMT-style timezone names, we 
> can avoid allocating the extra heap.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
