On 10/03/2016 08:43 AM, Radim Vansa wrote: > Aren't we investing a bit too much time to this? The trace level was > cached for ages, has anyone ever complained?
I have. https://issues.jboss.org/browse/JBCACHE-1625 Looks like I just never copied it over to the ISPN JIRA. :) > Turning on trace level is completely impractical in production, We do it all the time when there's no better option to debug an issue. (generally for as short a period as possible if it's production) -Dennis > but that info (or, I'd say 80% of > that) is very useful when debugging race conditions etc., so it makes > sense to keep it there and reduce the cost to minimum. > > My 2c > > Radim > > On 10/03/2016 03:25 PM, Dan Berindei wrote: >> On Mon, Oct 3, 2016 at 3:58 PM, Sanne Grinovero <sa...@infinispan.org> wrote: >>> I've not looked at any such details but I always assumed that since the >>> logger frameworks (any of them) need to be able to change the logging level, >>> there must be some way of switching state - at least a volatile read? >>> >>> And while that could be very cheap, I doubt it would be as easily optimised >>> as a constant. >>> >> True, at some point you need a volatile read -- you can only avoid >> that by caching isTraceEnabled(), like we do in Infinispan. >> >> My point was related to this assembly instruction, which reads the >> ordinal of Level.TRACE: >> >> 0x00007f93951e9fdf: mov rdi,0x76d4e6d48 ; {oop(a >> 'org/jboss/logging/Logger$Level')} >> 0x00007f93951e9fe9: mov ebp << missing an operand here >>; - >> java.lang.Enum::ordinal@1 (line 103) >> ; - >> org.jboss.logging.JBossLogManagerLogger::translate@8 (line 58) >> ; - >> org.jboss.logging.JBossLogManagerLogger::isEnabled@5 (line 35) >> ; - org.jboss.logging.Logger::tracef@4 (line >> 287) >> ; - org.infinispan.MyBenchmark::tracef@7 (line >> 26) >> ; - >> org.infinispan.generated.MyBenchmark_tracef_jmhTest::tracef_thrpt_jmhStub@15 >> (line 113) >> >> In an ideal world, the JIT would replace translate(Level.TRACE) with a >> constant. But here, even though `0x76d4e6d48` is clearly a constant, >> the ordinal is still loaded and checked every time. >> >>> Since we might not be interested only in the logger method to be inlined, we >>> also want most of the client methods to be considered for inlining.. some of >>> these can be extremely sensitive so such a decision requires a full >>> Infinispan performance test rather than a microbenchmark. (which is useful >>> too but more for the sake of optimising JBoss Logging, not to proof if it's >>> OK to switch all of Infinispan 's internal conventions) >>> >> Based on the testing both myself and Radim did WRT inlining, it >> doesn't make a huge difference. But I agree 100% that before removing >> the isTraceEnabled() caching we'd have to benchmark the whole of >> Infinispan, and not just JBoss Logging in isolation. >> >> Cheers >> Dan >> >>> On Mon, 3 Oct 2016, 13:45 Sebastian Laskawiec, <slask...@redhat.com> wrote: >>>> Awesome Dan! May I ask for a Pull Request: >>>> https://github.com/slaskawi/jboss-logging-perf-test? >>>> >>>> >>>> >>>> On Mon, Oct 3, 2016 at 2:40 PM, Dan Berindei <dan.berin...@gmail.com> >>>> wrote: >>>>> Hi Sebastian >>>>> >>>>> I modified your benchmark so that the logger and the trace field are >>>>> static final and looked at the generated assembly with JITWatch [1]. >>>>> Based on the attached assembly listings, caching isTraceEnabled() in a >>>>> constant field is "infinitely faster", because there are no assembly >>>>> instructions that can be traced back to the test method. >>>>> >>>>> JBossLogManagerLogger::translate() is inlined in this listing, but it >>>>> still goes through the switch machinery, I'm guessing because the >>>>> ordinal of an Enum is not considered a constant. >>>>> >>>>> Cheers >>>>> Dan >>>>> >>>>> >>>>> [1]: https://github.com/AdoptOpenJDK/jitwatch >>>>> >>>>> On Mon, Oct 3, 2016 at 11:28 AM, Sebastian Laskawiec >>>>> <slask...@redhat.com> wrote: >>>>>> Hey! >>>>>> >>>>>> Please have a look at the latest perf test results [1][2]: >>>>>> >>>>>> Benchmark Mode Cnt Score >>>>>> Error Units >>>>>> MyBenchmark.noVariable thrpt 20 681131269.875 ± >>>>>> 3961932.923 ops/s >>>>>> MyBenchmark.withIsTraceEnabledCheck thrpt 20 676307984.921 ± >>>>>> 14305970.393 ops/s >>>>>> MyBenchmark.withVariable thrpt 20 2411000894.582 ± >>>>>> 17382438.635 ops/s >>>>>> >>>>>> I think there is no surprise here.. using a field, which stores the >>>>>> result >>>>>> of `logger.isTraceEnabled()` evaluation is 3 times faster than other >>>>>> options. >>>>>> >>>>>> If anyone is interested in printing out JIT stuff, I also ran it with >>>>>> "-XX:+PrintCompilation", "-XX:+PrintCompilation2" and >>>>>> "-XX:+PrintInlining" >>>>>> here [3]. >>>>>> >>>>>> I'm not a performance expert but it seems that JIT could not inline the >>>>>> "translate" method because of its size (see line 1861). However it >>>>>> tried >>>>>> several times with different optimizations (and some of them were >>>>>> thrown >>>>>> away - "made not entrant" messages [4]). >>>>>> >>>>>> Let's wait for James' opinion on this, but I think we should address >>>>>> this >>>>>> issue on JBoss Logging/LogManager side (so I agree with David here) and >>>>>> make >>>>>> those parts inlinable (wow, does this word even exist? :D). Once this >>>>>> is >>>>>> done, we could experiment further in Infinispan codebase and see how >>>>>> this >>>>>> relates to some real world benchmarks... >>>>>> >>>>>> Thanks >>>>>> Sebastian >>>>>> >>>>>> [1] https://gist.github.com/slaskawi/6766b6e17c7a28ac8d8962293c48a53c >>>>>> [2] Test repository: >>>>>> https://github.com/slaskawi/jboss-logging-perf-test >>>>>> [3] https://gist.github.com/slaskawi/6f317bb05539611434bc91d66924bae0 >>>>>> [4] >>>>>> >>>>>> https://www.safaribooksonline.com/library/view/java-performance-the/9781449363512/ch04.html >>>>>> >>>>>> On Mon, Oct 3, 2016 at 7:32 AM, Sebastian Laskawiec >>>>>> <slask...@redhat.com> >>>>>> wrote: >>>>>>> Hey! >>>>>>> >>>>>>> Adding James to this thread. >>>>>>> >>>>>>> @Dennis - I think Dan has a point here. The trick with checking a >>>>>>> field in >>>>>>> a class is 3 times faster. Most of the checks are done in core so they >>>>>>> are >>>>>>> executed multiple times per operation. Changing all those places is >>>>>>> probably >>>>>>> not an option. >>>>>>> >>>>>>> @David - Let me run a test with JBoss Log Manager and get back to you >>>>>>> with >>>>>>> some results. But if Dan is right, and the problem is with enum >>>>>>> mapping, I >>>>>>> will get similar results. >>>>>>> >>>>>>> Thanks >>>>>>> Sebastian >>>>>>> >>>>>>> On Sat, Oct 1, 2016 at 11:53 AM, Dan Berindei <dan.berin...@gmail.com> >>>>>>> wrote: >>>>>>>> On Fri, Sep 30, 2016 at 10:15 PM, David M. Lloyd >>>>>>>> <david.ll...@redhat.com> >>>>>>>> wrote: >>>>>>>>> The performance problem that this trick is meant to resolve is >>>>>>>>> really a >>>>>>>>> problem in the logging backend. It *should* be faster inside of >>>>>>>>> WildFly, where JBoss LogManager is used, because that project just >>>>>>>>> checks a single volatile field for the level check... and the path >>>>>>>>> to >>>>>>>>> that code *should* be inline-friendly. >>>>>>>>> >>>>>>>> Indeed, we started using this trick because of log4j 1.2, which needs >>>>>>>> to walk the logger hierarchy in order to check the level, and it made >>>>>>>> a significant difference there. >>>>>>>> >>>>>>>> Nowadays I think it's pretty close to optimal in all logging >>>>>>>> frameworks. The only nitpick is that they all use enums for the >>>>>>>> levels, and the JIT can't inline Level.TRACE.value as it would with a >>>>>>>> Level.TRACE_VALUE int constant. If JDK9 fixes that, then it's going >>>>>>>> to >>>>>>>> be more or less equivalent to using a volatile "trace" field in each >>>>>>>> class, so it should be suitable even for local read operations that >>>>>>>> take < 200ns. >>>>>>>> >>>>>>>> We'd probably still need to weed out some of the trace messages, as >>>>>>>> we >>>>>>>> probably have more than 10 of them during such a read operation. I >>>>>>>> confess that I added way too many trace logs myself, precisely >>>>>>>> because >>>>>>>> I knew we are using a static final field and the JIT compiler doesn't >>>>>>>> even generate code for that branch. >>>>>>>> >>>>>>>> Cheers >>>>>>>> Dan >>>>>>>> _______________________________________________ >>>>>>>> infinispan-dev mailing list >>>>>>>> infinispan-dev@lists.jboss.org >>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev@lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev@lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev@lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev@lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev@lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev