I created an implementation that uses FastDateFormat and an AtomicReference.  I 
modified the benchmarks and the results are below.  atomicFormat is the new 
version. cachedFormat uses logic similar to what Remko proposed. fastFormat 
simply uses the FastDateFormat on every timestamp. synchronizedFormat uses 
logic similar to what is in the DatePatternConverter now. threadLocalFormat 
retrieves a threadLocal SimpleDateFormat and formats every timestamp with it.

1 Thread

Benchmark                                                     Mode  Samples    
Score   Error  Units
o.a.l.l.p.j.SimpleDateFormatBenchmark.atomicFormat          sample    67961   
93.774 ± 3.928  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.baseline              sample    89151   
39.136 ± 2.352  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.cachedFormat          sample    66065   
74.033 ± 3.613  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.fastFormat            sample    79434  
316.701 ± 6.443  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.synchronizedFormat    sample    81409  
304.055 ± 5.835  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.threadLocalFormat     sample    84137  
298.179 ± 6.005  ns/op

4 Threads

Benchmark                                                     Mode  Samples     
Score     Error  Units
o.a.l.l.p.j.SimpleDateFormatBenchmark.atomicFormat          sample   270722    
82.465 ±   1.767  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.baseline              sample   290686    
46.817 ±   1.348  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.cachedFormat          sample   229500    
86.370 ±   2.102  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.fastFormat            sample   228713   
435.373 ±  20.030  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.synchronizedFormat    sample   293448  
2113.380 ± 138.919  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.threadLocalFormat     sample   283040   
507.937 ±  15.903  ns/op

8 Threads

Benchmark                                                     Mode  Samples     
Score     Error  Units
o.a.l.l.p.j.SimpleDateFormatBenchmark.atomicFormat          sample   389028   
120.208 ±  39.351  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.baseline              sample   523979    
62.972 ±   2.952  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.cachedFormat          sample   606880   
125.257 ±   7.320  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.fastFormat            sample   470751   
768.610 ±  24.760  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.synchronizedFormat    sample   614805  
4598.192 ± 107.378  ns/op
o.a.l.l.p.j.SimpleDateFormatBenchmark.threadLocalFormat     sample   434334   
884.489 ± 130.365  ns/op

I have attached a new patch to the Jira issue.

Ralph





> On Jul 5, 2015, at 1:08 AM, Remko Popma <[email protected]> wrote:
> 
> Benchmark done 
> <https://issues.apache.org/jira/browse/LOG4J2-812?focusedCommentId=14614174&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14614174>,
>  and ThreadLocal is a clear winner over synchronized block.
> 
> I've started implementing a ThreadLocal registry but I stopped when I 
> realized this won't solve the memory leak problem: ThreadLocal.remove() only 
> removes the value for the current thread. Values created for other threads in 
> the thread pool will not be removed...
> 
> Instead, I propose we create a SoftThreadLocal class which wraps the value 
> (the SimpleDateFormat in this case) in a SoftReference before storing it in 
> the ThreadLocal. This allows the JVM to reclaim the memory when necessary, 
> which should solve the memory leak. It also means we no longer need to 
> explicitly clear() things when the LoggerContext is stopped.
> 
> This is also a reusable solution because we can use this class everywhere we 
> use ThreadLocal to cache objects for performance.
> 
> Thoughts?
> 
> On Sun, Jul 5, 2015 at 3:33 PM, Ralph Goers <[email protected] 
> <mailto:[email protected]>> wrote:
> That’s right, you were going to do some performance measurements. Please do 
> and let us know what works best.
> 
> Ralph
> 
>> On Jul 4, 2015, at 10:30 PM, Remko Popma <[email protected] 
>> <mailto:[email protected]>> wrote:
>> 
>> Looks like we already have an outstanding Jira for this: LOG4J2-812 
>> <https://issues.apache.org/jira/browse/LOG4J2-812>.
>> If there are no objections I can take care of this one.
>> 
>> On Sun, Jul 5, 2015 at 9:55 AM, Remko Popma <[email protected] 
>> <mailto:[email protected]>> wrote:
>> One per logger context would make it easier to clear all ThreadContexts when 
>> a particular logger context is stopped. 
>> 
>> Sent from my iPhone
>> 
>> On 2015/07/05, at 0:51, Gary Gregory <[email protected] 
>> <mailto:[email protected]>> wrote:
>> 
>>> Would there be one registry or one per logger context?
>>> 
>>> Gary 
>>> 
>>> 
>>> -------- Original message --------
>>> From: Remko Popma <[email protected] <mailto:[email protected]>> 
>>> Date: 07/04/2015 05:41 (GMT-08:00) 
>>> To: Log4J Developers List <[email protected] 
>>> <mailto:[email protected]>> 
>>> Subject: Re: Log4j2 RollingFileAppender deadlock issue 
>>> 
>>> ThreadLocal is implemented as an internal Map in each Thread instance, so 
>>> there is constant lookup time regardless of the number of Threads and the 
>>> number of lookups. Contrast this with a lock, where performance will 
>>> decrease exponentially with more concurrent threads. 
>>> 
>>> (See also https://plumbr.eu/blog/java/how-is-threadlocal-implemented 
>>> <https://plumbr.eu/blog/java/how-is-threadlocal-implemented> )
>>> 
>>> Sent from my iPhone
>>> 
>>> On 2015/07/04, at 20:40, Jess Holle <[email protected] <mailto:[email protected]>> 
>>> wrote:
>>> 
>>>> On 7/4/2015 2:51 AM, Gary Gregory wrote:
>>>>> On Thu, Jul 2, 2015 at 6:18 AM, Remko Popma <[email protected] 
>>>>> <mailto:[email protected]>> wrote:
>>>>> Yes, that could still work: We could use a ThreadLocal containing a 
>>>>> custom class which holds the lastTimestamp, cachedDateString as well as a 
>>>>> SimpleDateFormat instance. 
>>>>> 
>>>>> As Jess pointed out, we would also need a way to clear the ThreadLocal 
>>>>> when the LoggerContext is stopped (to prevent memory leaks in web apps). 
>>>>> This may be the third usage of ThreadLocals in log4j2 now, so it may be 
>>>>> worth creating a reusable mechanism for this. 
>>>>> One idea would be to have a ThreadLocal registry in the LoggerContext, 
>>>>> where the LoggerContext is responsible for cleaning up all registered 
>>>>> ThreadLocals in its stop() method. 
>>>>> 
>>>>> Thoughts?
>>>>> 
>>>>> I'm wondering what the performance cost are of doing a ThreadLocal.get() 
>>>>> vs. synchronized(this) on each call to format().
>>>>> 
>>>> Personally I'd be less concerned with optimizing maximum logger throughput 
>>>> on any given thread than:
>>>> Ensuring that not logging takes minimal time
>>>> Minimizing potential thread contention
>>>> Logging at maximum efficiency is a priority, but comes after these others.
>>>> --
>>>> Jess Holle
>>>> 
>> 
> 
> 

Reply via email to