I think the benchmark output includes runtime duration. Sent from my iPhone
> On 2015/07/06, at 11:07, Ralph Goers <[email protected]> wrote: > > Is there a way to know how long each benchmark ran? I want to make sure that > the benchmark is fairly representing non-cached results. > > Ralph > >> On Jul 5, 2015, at 5:57 PM, Ralph Goers <[email protected]> wrote: >> >> I created an implementation that uses FastDateFormat and an AtomicReference. >> I modified the benchmarks and the results are below. atomicFormat is the >> new version. cachedFormat uses logic similar to what Remko proposed. >> fastFormat simply uses the FastDateFormat on every timestamp. >> synchronizedFormat uses logic similar to what is in the DatePatternConverter >> now. threadLocalFormat retrieves a threadLocal SimpleDateFormat and formats >> every timestamp with it. >> >> 1 Thread >> >> Benchmark Mode Samples >> Score Error Units >> o.a.l.l.p.j.SimpleDateFormatBenchmark.atomicFormat sample 67961 >> 93.774 ± 3.928 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.baseline sample 89151 >> 39.136 ± 2.352 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.cachedFormat sample 66065 >> 74.033 ± 3.613 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.fastFormat sample 79434 >> 316.701 ± 6.443 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.synchronizedFormat sample 81409 >> 304.055 ± 5.835 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.threadLocalFormat sample 84137 >> 298.179 ± 6.005 ns/op >> >> 4 Threads >> >> Benchmark Mode Samples >> Score Error Units >> o.a.l.l.p.j.SimpleDateFormatBenchmark.atomicFormat sample 270722 >> 82.465 ± 1.767 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.baseline sample 290686 >> 46.817 ± 1.348 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.cachedFormat sample 229500 >> 86.370 ± 2.102 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.fastFormat sample 228713 >> 435.373 ± 20.030 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.synchronizedFormat sample 293448 >> 2113.380 ± 138.919 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.threadLocalFormat sample 283040 >> 507.937 ± 15.903 ns/op >> >> 8 Threads >> >> Benchmark Mode Samples >> Score Error Units >> o.a.l.l.p.j.SimpleDateFormatBenchmark.atomicFormat sample 389028 >> 120.208 ± 39.351 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.baseline sample 523979 >> 62.972 ± 2.952 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.cachedFormat sample 606880 >> 125.257 ± 7.320 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.fastFormat sample 470751 >> 768.610 ± 24.760 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.synchronizedFormat sample 614805 >> 4598.192 ± 107.378 ns/op >> o.a.l.l.p.j.SimpleDateFormatBenchmark.threadLocalFormat sample 434334 >> 884.489 ± 130.365 ns/op >> >> I have attached a new patch to the Jira issue. >> >> Ralph >> >> >> >> >> >>> On Jul 5, 2015, at 1:08 AM, Remko Popma <[email protected]> wrote: >>> >>> Benchmark done, and ThreadLocal is a clear winner over synchronized block. >>> >>> I've started implementing a ThreadLocal registry but I stopped when I >>> realized this won't solve the memory leak problem: ThreadLocal.remove() >>> only removes the value for the current thread. Values created for other >>> threads in the thread pool will not be removed... >>> >>> Instead, I propose we create a SoftThreadLocal class which wraps the value >>> (the SimpleDateFormat in this case) in a SoftReference before storing it in >>> the ThreadLocal. This allows the JVM to reclaim the memory when necessary, >>> which should solve the memory leak. It also means we no longer need to >>> explicitly clear() things when the LoggerContext is stopped. >>> >>> This is also a reusable solution because we can use this class everywhere >>> we use ThreadLocal to cache objects for performance. >>> >>> Thoughts? >>> >>>> On Sun, Jul 5, 2015 at 3:33 PM, Ralph Goers <[email protected]> >>>> wrote: >>>> That’s right, you were going to do some performance measurements. Please >>>> do and let us know what works best. >>>> >>>> Ralph >>>> >>>>> On Jul 4, 2015, at 10:30 PM, Remko Popma <[email protected]> wrote: >>>>> >>>>> Looks like we already have an outstanding Jira for this: LOG4J2-812. >>>>> If there are no objections I can take care of this one. >>>>> >>>>>> On Sun, Jul 5, 2015 at 9:55 AM, Remko Popma <[email protected]> >>>>>> wrote: >>>>>> One per logger context would make it easier to clear all ThreadContexts >>>>>> when a particular logger context is stopped. >>>>>> >>>>>> Sent from my iPhone >>>>>> >>>>>>> On 2015/07/05, at 0:51, Gary Gregory <[email protected]> wrote: >>>>>>> >>>>>>> Would there be one registry or one per logger context? >>>>>>> >>>>>>> Gary >>>>>>> >>>>>>> >>>>>>> -------- Original message -------- >>>>>>> From: Remko Popma <[email protected]> >>>>>>> Date: 07/04/2015 05:41 (GMT-08:00) >>>>>>> To: Log4J Developers List <[email protected]> >>>>>>> Subject: Re: Log4j2 RollingFileAppender deadlock issue >>>>>>> >>>>>>> ThreadLocal is implemented as an internal Map in each Thread instance, >>>>>>> so there is constant lookup time regardless of the number of Threads >>>>>>> and the number of lookups. Contrast this with a lock, where performance >>>>>>> will decrease exponentially with more concurrent threads. >>>>>>> >>>>>>> (See also https://plumbr.eu/blog/java/how-is-threadlocal-implemented ) >>>>>>> >>>>>>> Sent from my iPhone >>>>>>> >>>>>>>> On 2015/07/04, at 20:40, Jess Holle <[email protected]> wrote: >>>>>>>> >>>>>>>>> On 7/4/2015 2:51 AM, Gary Gregory wrote: >>>>>>>>>> On Thu, Jul 2, 2015 at 6:18 AM, Remko Popma <[email protected]> >>>>>>>>>> wrote: >>>>>>>>>> Yes, that could still work: We could use a ThreadLocal containing a >>>>>>>>>> custom class which holds the lastTimestamp, >>>>>>>>>> cachedDateString as well as a SimpleDateFormat instance. >>>>>>>>>> >>>>>>>>>> As Jess pointed out, we would also need a way to clear the >>>>>>>>>> ThreadLocal when the LoggerContext is stopped (to prevent memory >>>>>>>>>> leaks in web apps). This may be the third usage of ThreadLocals in >>>>>>>>>> log4j2 now, so it may be worth creating a reusable mechanism for >>>>>>>>>> this. >>>>>>>>>> One idea would be to have a ThreadLocal registry in the >>>>>>>>>> LoggerContext, where the LoggerContext is responsible for cleaning >>>>>>>>>> up all registered ThreadLocals in its stop() method. >>>>>>>>>> >>>>>>>>>> Thoughts? >>>>>>>>> >>>>>>>>> I'm wondering what the performance cost are of doing a >>>>>>>>> ThreadLocal.get() vs. synchronized(this) on each call to format(). >>>>>>>> Personally I'd be less concerned with optimizing maximum logger >>>>>>>> throughput on any given thread than: >>>>>>>> Ensuring that not logging takes minimal time >>>>>>>> Minimizing potential thread contention >>>>>>>> Logging at maximum efficiency is a priority, but comes after these >>>>>>>> others. >>>>>>>> -- >>>>>>>> Jess Holle >
