Re: Location performance

Remko Popma Thu, 04 Jul 2019 21:22:38 -0700

The JMH samples 
(https://hg.openjdk.java.net/code-tools/jmh/file/tip/jmh-samples/src/main/java/org/openjdk/jmh/samples/)
 are the best place to learn about the pitfalls of java micro benchmarks and 
how JMH can help you avoid them.


Remko.

(Shameless plug) Every java main() method deserves http://picocli.info

> On Jul 5, 2019, at 11:25, Ralph Goers <[email protected]> wrote:
> 
> I don’t understand what you mean with you last message.
> 
> I don’t really know the details of how JMH works but my understanding is it 
> does something to prevent things like dead code elimination. I haven’t been 
> able to find anything very specific about what JMH actually does though.
> 
> Ralph
> 
>> On Jul 4, 2019, at 11:15 AM, Matt Sicker <[email protected]> wrote:
>> 
>> And the checkpoint bias thing is alleviated amongst other noise from the
>> JVM that would otherwise invalidate the time measurements.
>> 
>>> On Thu, Jul 4, 2019 at 13:14, Matt Sicker <[email protected]> wrote:
>>> 
>>> I thought JMH ran your code enough for JIT to take effect (though that’s
>>> also up to your warmup iteration setting), and the
>>> 
>>> On Thu, Jul 4, 2019 at 00:21, Ralph Goers <[email protected]>
>>> wrote:
>>> 
>>>> That said, here are the results of the FileAppenderWithLocationBenchmark
>>>> from master.
>>>> 
>>>> Benchmark                                       Mode  Cnt       Score
>>>> Error  Units
>>>> FileAppenderWithLocationBenchmark.log4j1File   thrpt   20   27597.034 ±
>>>> 350.465  ops/s
>>>> FileAppenderWithLocationBenchmark.log4j2File   thrpt   20   82966.450 ±
>>>> 2023.826  ops/s
>>>> FileAppenderWithLocationBenchmark.log4j2RAF    thrpt   20   78820.010 ±
>>>> 1452.810  ops/s
>>>> FileAppenderWithLocationBenchmark.logbackFile  thrpt   20  138656.655 ±
>>>> 6329.118  ops/s
>>>> 
>>>> and here are the results with all the changes I have made.
>>>> 
>>>> Benchmark                                       Mode  Cnt       Score
>>>> Error  Units
>>>> FileAppenderWithLocationBenchmark.log4j1File   thrpt   20   47003.919 ±
>>>> 1594.591  ops/s
>>>> FileAppenderWithLocationBenchmark.log4j2File   thrpt   20  185129.606 ±
>>>> 2309.118  ops/s
>>>> FileAppenderWithLocationBenchmark.log4j2RAF    thrpt   20  183346.456 ±
>>>> 3103.615  ops/s
>>>> FileAppenderWithLocationBenchmark.logbackFile  thrpt   20  130121.612 ±
>>>> 2535.387  ops/s
>>>> 
>>>> Ralph
>>>> 
>>>> 
>>>>> On Jul 3, 2019, at 8:35 PM, Ralph Goers <[email protected]>
>>>> wrote:
>>>>> 
>>>>> Remko,
>>>>> 
>>>>> I ran the existing JHM benchmarks for both no location and with
>>>> location and there is still a huge difference. The problem is I don’t
>>>> necessarily believe the results. Nor do I think the benchmark I have been
>>>> using is totally accurate either. The problem is that, as I understand it,
>>>> JMH limits or eliminates much of the optimization the Java runtime will
>>>> perform. This is bad for Log4j because in some ways our code is written
>>>> inefficiently and hotspot will optimize it. OTOH, since the test is pretty
>>>> much logging the same record with an incremented value from the same
>>>> location it is hard to say whether the code isn’t being over optimized. I
>>>> suspect the real performance will be somewhere in the middle. What would be
>>>> better is something like a restful service with plenty of logging driven by
>>>> Jmeter once with no location and once with it.
>>>>> 
>>>>> Ralph
>>>>> 
>>>>>> On Jul 2, 2019, at 3:47 PM, Remko Popma <[email protected]> wrote:
>>>>>> 
>>>>>> If memory serves correctly, the `includeLocation` was introduced
>>>> because it was necessary for async loggers (and it seemed not feasible to
>>>> get this information from the layout pattern).
>>>>>> 
>>>>>> Async loggers need to know upfront whether they need to pass location
>>>> information to the consumer thread.
>>>>>> 
>>>>>> If obtaining location information becomes so cheap that it’s
>>>> essentially free, it is fine to ignore the `includeLocation` directive and
>>>> include location information anyway.
>>>>>> 
>>>>>> About the performance results: what’s the difference between “Log4j
>>>> 3.0 with class and method” and “Log4j 3.0 withLocation()”?
>>>>>> 
>>>>>> Also I am generally cautious of custom benchmarks; any reason not to
>>>> use JMH?
>>>>>> 
>>>>>> Overall impressive results, congratulations!
>>>>>> 
>>>>>> Remko.
>>>>>> 
>>>>>> (Shameless plug) Every java main() method deserves http://picocli.info
>>>>>> 
>>>>>>> On Jul 3, 2019, at 7:26, Ralph Goers <[email protected]>
>>>> wrote:
>>>>>>> 
>>>>>>> But what do you think about includeLocation. Should it override the
>>>> result of requiresLocation or should the fact that we determined location
>>>> information is required override the includeLocation setting?
>>>>>>> 
>>>>>>> Ralph
>>>>>>> 
>>>>>>>> On Jul 2, 2019, at 2:34 PM, Matt Sicker <[email protected]> wrote:
>>>>>>>> 
>>>>>>>> These benchmarks are impressive! Moving the point of call site
>>>> calculation is a great idea.
>>>>>>>> 
>>>>>>>> On Tue, 2 Jul 2019 at 16:22, Ralph Goers <[email protected]
>>>> <mailto:[email protected]>> wrote:
>>>>>>>> I have made changes to the code the master branch to measure and
>>>> improve performance. I have committed some of the changes but not the last
>>>> set. I have not made the latest changes to release-2.x because they will
>>>> have to be implemented a bit differently to avoid compatibility issues. The
>>>> table below summarizes what I am seeing, at least on my MacBook Pro.  This
>>>> test just basically writes 1.5 million log statements in a loop using an
>>>> equivalent format for each logging framework. Note that JUL is faster with
>>>> location information than without primarily because a custom Handler had to
>>>> be created to format the log record and it performs better than the one
>>>> used to log without location information.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> There are several things I should note about this:
>>>>>>>> • The first step, which is in the PR, has StackLocator.calcLocation
>>>> modified to use a faster algorithm. Streams are not particularly fast. This
>>>> resulted in some gains not shown here but can be seen in
>>>> https://issues.apache.org/jira/browse/LOG4J2-2644 <
>>>> https://issues.apache.org/jira/browse/LOG4J2-2644>
>>>>>>>> • Performance is proportional to the number of stack traces that
>>>> need to be traversed. JUL only had 11 whereas Log4j had 35 before it called
>>>> calcLocation. Our stack is very deep because AbstractLogger and
>>>> LoggerConfig have been overly refactored along with having to go though
>>>> several intermediate objects.
>>>>>>>> • I introduced a requiredLocation() method throughout the stack.
>>>> Any component that requires location information sets this value to true.
>>>> Based on this, the call to calcLocation can be made significantly higher up
>>>> the stack just before the log event is created but only when a configured
>>>> component will use it - with a few exceptions. This code is not yet checked
>>>> in.
>>>>>>>> • If for some reason the location information isn’t calculated when
>>>> the event is constructed it can still be lazily populated later using the
>>>> current, slower mechanism.
>>>>>>>> 
>>>>>>>> As you can see these changes result in a pretty dramatic
>>>> improvement, almost to the point that the cost of obtaining location
>>>> information doesn’t really matter - especially in Java 11. The results in
>>>> Java 11 amount to 5us per logging call with no location vs 7.5us per
>>>> logging call with location info. These results could probably be improved
>>>> even more if AbstractLogger was “flattened” a bit.
>>>>>>>> 
>>>>>>>> I should point out that the logic isn’t perfect. I cannot determine
>>>> whether location information is required or not for the RoutingAppender
>>>> since it dynamically creates Appenders at run time. Instead, I added a
>>>> “requiresLocation” attribute that can be set to indicate that.
>>>>>>>> 
>>>>>>>> However, in testing I have encountered a unit test failure in
>>>> RandomAccessFileAppenderTest. It is failing because the location
>>>> information is being included even though includeLocation is set to false.
>>>> I am wondering if that setting needs to be respected since the pattern
>>>> specifies including it. It seems odd to me to not include it in that case.
>>>>>>>> 
>>>>>>>> Thoughts?
>>>>>>>> 
>>>>>>>> Ralph
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Matt Sicker <[email protected] <mailto:[email protected]>>
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> --
>>> Matt Sicker <[email protected]>
>>> 
>> -- 
>> Matt Sicker <[email protected]>
> 
>

Re: Location performance

Reply via email to