Is there a way to know how long each benchmark ran? I want to make sure that the benchmark is fairly representing non-cached results.
Ralph > On Jul 5, 2015, at 5:57 PM, Ralph Goers <[email protected]> wrote: > > I created an implementation that uses FastDateFormat and an AtomicReference. > I modified the benchmarks and the results are below. atomicFormat is the new > version. cachedFormat uses logic similar to what Remko proposed. fastFormat > simply uses the FastDateFormat on every timestamp. synchronizedFormat uses > logic similar to what is in the DatePatternConverter now. threadLocalFormat > retrieves a threadLocal SimpleDateFormat and formats every timestamp with it. > > 1 Thread > > Benchmark Mode Samples > Score Error Units > o.a.l.l.p.j.SimpleDateFormatBenchmark.atomicFormat sample 67961 > 93.774 ± 3.928 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.baseline sample 89151 > 39.136 ± 2.352 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.cachedFormat sample 66065 > 74.033 ± 3.613 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.fastFormat sample 79434 > 316.701 ± 6.443 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.synchronizedFormat sample 81409 > 304.055 ± 5.835 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.threadLocalFormat sample 84137 > 298.179 ± 6.005 ns/op > > 4 Threads > > Benchmark Mode Samples > Score Error Units > o.a.l.l.p.j.SimpleDateFormatBenchmark.atomicFormat sample 270722 > 82.465 ± 1.767 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.baseline sample 290686 > 46.817 ± 1.348 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.cachedFormat sample 229500 > 86.370 ± 2.102 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.fastFormat sample 228713 > 435.373 ± 20.030 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.synchronizedFormat sample 293448 > 2113.380 ± 138.919 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.threadLocalFormat sample 283040 > 507.937 ± 15.903 ns/op > > 8 Threads > > Benchmark Mode Samples > Score Error Units > o.a.l.l.p.j.SimpleDateFormatBenchmark.atomicFormat sample 389028 > 120.208 ± 39.351 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.baseline sample 523979 > 62.972 ± 2.952 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.cachedFormat sample 606880 > 125.257 ± 7.320 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.fastFormat sample 470751 > 768.610 ± 24.760 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.synchronizedFormat sample 614805 > 4598.192 ± 107.378 ns/op > o.a.l.l.p.j.SimpleDateFormatBenchmark.threadLocalFormat sample 434334 > 884.489 ± 130.365 ns/op > > I have attached a new patch to the Jira issue. > > Ralph > > > > > >> On Jul 5, 2015, at 1:08 AM, Remko Popma <[email protected] >> <mailto:[email protected]>> wrote: >> >> Benchmark done >> <https://issues.apache.org/jira/browse/LOG4J2-812?focusedCommentId=14614174&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14614174>, >> and ThreadLocal is a clear winner over synchronized block. >> >> I've started implementing a ThreadLocal registry but I stopped when I >> realized this won't solve the memory leak problem: ThreadLocal.remove() only >> removes the value for the current thread. Values created for other threads >> in the thread pool will not be removed... >> >> Instead, I propose we create a SoftThreadLocal class which wraps the value >> (the SimpleDateFormat in this case) in a SoftReference before storing it in >> the ThreadLocal. This allows the JVM to reclaim the memory when necessary, >> which should solve the memory leak. It also means we no longer need to >> explicitly clear() things when the LoggerContext is stopped. >> >> This is also a reusable solution because we can use this class everywhere we >> use ThreadLocal to cache objects for performance. >> >> Thoughts? >> >> On Sun, Jul 5, 2015 at 3:33 PM, Ralph Goers <[email protected] >> <mailto:[email protected]>> wrote: >> That’s right, you were going to do some performance measurements. Please do >> and let us know what works best. >> >> Ralph >> >>> On Jul 4, 2015, at 10:30 PM, Remko Popma <[email protected] >>> <mailto:[email protected]>> wrote: >>> >>> Looks like we already have an outstanding Jira for this: LOG4J2-812 >>> <https://issues.apache.org/jira/browse/LOG4J2-812>. >>> If there are no objections I can take care of this one. >>> >>> On Sun, Jul 5, 2015 at 9:55 AM, Remko Popma <[email protected] >>> <mailto:[email protected]>> wrote: >>> One per logger context would make it easier to clear all ThreadContexts >>> when a particular logger context is stopped. >>> >>> Sent from my iPhone >>> >>> On 2015/07/05, at 0:51, Gary Gregory <[email protected] >>> <mailto:[email protected]>> wrote: >>> >>>> Would there be one registry or one per logger context? >>>> >>>> Gary >>>> >>>> >>>> -------- Original message -------- >>>> From: Remko Popma <[email protected] <mailto:[email protected]>> >>>> Date: 07/04/2015 05:41 (GMT-08:00) >>>> To: Log4J Developers List <[email protected] >>>> <mailto:[email protected]>> >>>> Subject: Re: Log4j2 RollingFileAppender deadlock issue >>>> >>>> ThreadLocal is implemented as an internal Map in each Thread instance, so >>>> there is constant lookup time regardless of the number of Threads and the >>>> number of lookups. Contrast this with a lock, where performance will >>>> decrease exponentially with more concurrent threads. >>>> >>>> (See also https://plumbr.eu/blog/java/how-is-threadlocal-implemented >>>> <https://plumbr.eu/blog/java/how-is-threadlocal-implemented> ) >>>> >>>> Sent from my iPhone >>>> >>>> On 2015/07/04, at 20:40, Jess Holle <[email protected] <mailto:[email protected]>> >>>> wrote: >>>> >>>>> On 7/4/2015 2:51 AM, Gary Gregory wrote: >>>>>> On Thu, Jul 2, 2015 at 6:18 AM, Remko Popma <[email protected] >>>>>> <mailto:[email protected]>> wrote: >>>>>> Yes, that could still work: We could use a ThreadLocal containing a >>>>>> custom class which holds the lastTimestamp, cachedDateString as well as >>>>>> a SimpleDateFormat instance. >>>>>> >>>>>> As Jess pointed out, we would also need a way to clear the ThreadLocal >>>>>> when the LoggerContext is stopped (to prevent memory leaks in web apps). >>>>>> This may be the third usage of ThreadLocals in log4j2 now, so it may be >>>>>> worth creating a reusable mechanism for this. >>>>>> One idea would be to have a ThreadLocal registry in the LoggerContext, >>>>>> where the LoggerContext is responsible for cleaning up all registered >>>>>> ThreadLocals in its stop() method. >>>>>> >>>>>> Thoughts? >>>>>> >>>>>> I'm wondering what the performance cost are of doing a ThreadLocal.get() >>>>>> vs. synchronized(this) on each call to format(). >>>>>> >>>>> Personally I'd be less concerned with optimizing maximum logger >>>>> throughput on any given thread than: >>>>> Ensuring that not logging takes minimal time >>>>> Minimizing potential thread contention >>>>> Logging at maximum efficiency is a priority, but comes after these others. >>>>> -- >>>>> Jess Holle >>>>> >>> >> >> >
