Gentlemen, what is the status of the progress on this front? Is it in a "good enough" state for the 2.15.0 release?
On Thu, Sep 23, 2021 at 6:29 PM Carter Kozak <cko...@ckozak.net> wrote: > Thanks, Ralph! > > You're correct that the direct encoderes are required for GC-free logging, > some of our consumers rely on that > for niche (low latency) workloads where it's important to understand > precisely where a pause may occur. > GC performance has become much better over the last several years, and the > throughput of the allocation- > heavy workflow has increased significantly, however there are a few things > we should keep in mind: > > 1. Java 8 performs better with direct encoders (although afaict the > performance isn't any better than java 9+, > string.getBytes just has several new optimizations > 2. If this[1] change is merged into openjdk, the direct encoders may once > again be the best-performing option, > I'm planning to put together a local build for comparison later. > 3. My testing has primarily targetted UTF-8 as it's the only charset we > target at work, and the default charset > on most modern servers. The jdk has a great deal of optimization which > special cases UTF-8 for this reason, > especially from string.getBytes. Other charsets are likely to perform > differently. > > I'm in favor of changing the default value of "direct.encoders" based on > the java version, for example > Java 8 should continue to use them, while versions after compact-strings > was introduced are better > off without direct encoders until the jdk can be improved using something > like the linked PR. > > 1. https://github.com/openjdk/jdk/pull/5621#issuecomment-925413767 > > -ck > > On Thu, Sep 23, 2021, at 11:51, Ralph Goers wrote: > > I ran Ceki’s benchmarks against 2.14.1, 2.15.0-SNAPSHOT and Carter’s > branch. I ran them with 16 threads on my > > MacBook Pro and ran each test twice since the tests show some > variability from time to time. > > > > I can draw 2 conclusions from this. > > 1. The improvements Carter has made to 2.15.0 have already made a > noticeable improvement. > > 2. His new changes show a definite improvement on top of the 2.15.0 > improvements when enable direct encoders is false. > > > > These results definitely make me wonder if we should just remove the > direct encoders logic at least in master. From what > > I can tell it makes things complicated and makes no noticeable impact on > performance in 2.15.0 even before Carter’s > > changes. Are they required for gc-free? Can we make direct.encoders = > false the default if it isn’t now? > > > > I apologize if the performance results don’t look good in your email. > They are formatted with a fixed-width font but it is > > likely they will be messed up once they reach your mail client. > > > > Ralph > > > > 2.14.1 - enable direct encoders = false > > > > Benchmark Mode Cnt Score > Error Units > > AsyncWithFileAppenderBenchmark.log4j1File thrpt 10 1221.995 ± > 150.876 ops/ms > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile thrpt 10 1495.072 ± > 41.715 ops/ms > > AsyncWithFileAppenderBenchmark.logbackFile thrpt 10 1529.807 ± > 55.876 ops/ms > > FileAppenderBenchmark.log4j1File thrpt 10 681.690 ± > 17.631 ops/ms > > FileAppenderBenchmark.log4j2File thrpt 10 1134.543 ± > 35.065 ops/ms > > FileAppenderBenchmark.logbackFile thrpt 10 1546.811 ± > 95.721 ops/ms > > > > Benchmark Mode Cnt Score > Error Units > > AsyncWithFileAppenderBenchmark.log4j1File thrpt 10 1103.526 ± > 78.201 ops/ms > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile thrpt 10 1402.669 ± > 144.374 ops/ms > > AsyncWithFileAppenderBenchmark.logbackFile thrpt 10 1475.666 ± > 87.885 ops/ms > > FileAppenderBenchmark.log4j1File thrpt 10 582.841 ± > 57.156 ops/ms > > FileAppenderBenchmark.log4j2File thrpt 10 860.587 ± > 185.170 ops/ms > > FileAppenderBenchmark.logbackFile thrpt 10 1560.206 ± > 323.229 ops/ms > > > > 2.14.1 - enable direct encoders = true > > > > Benchmark Mode Cnt Score > Error Units > > AsyncWithFileAppenderBenchmark.log4j1File thrpt 10 1250.808 ± > 116.848 ops/ms > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile thrpt 10 1564.926 ± > 146.195 ops/ms > > AsyncWithFileAppenderBenchmark.logbackFile thrpt 10 1703.731 ± > 183.304 ops/ms > > FileAppenderBenchmark.log4j1File thrpt 10 736.189 ± > 40.798 ops/ms > > FileAppenderBenchmark.log4j2File thrpt 10 1070.958 ± > 97.173 ops/ms > > FileAppenderBenchmark.logbackFile thrpt 10 1350.507 ± > 225.104 ops/ms > > > > Benchmark Mode Cnt Score > Error Units > > AsyncWithFileAppenderBenchmark.log4j1File thrpt 10 1218.639 ± > 141.343 ops/ms > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile thrpt 10 1525.081 ± > 112.058 ops/ms > > AsyncWithFileAppenderBenchmark.logbackFile thrpt 10 1724.925 ± > 169.680 ops/ms > > FileAppenderBenchmark.log4j1File thrpt 10 680.453 ± > 32.330 ops/ms > > FileAppenderBenchmark.log4j2File thrpt 10 1054.517 ± > 52.147 ops/ms > > FileAppenderBenchmark.logbackFile thrpt 10 1353.180 ± > 224.415 ops/ms > > > > 2.15.0-SNAPSHOT - enable direct encoders = false > > > > Benchmark Mode Cnt Score > Error Units > > AsyncWithFileAppenderBenchmark.log4j1File thrpt 10 1121.838 ± > 112.026 ops/ms > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile thrpt 10 1606.602 ± > 217.776 ops/ms > > AsyncWithFileAppenderBenchmark.logbackFile thrpt 10 1607.875 ± > 226.656 ops/ms > > FileAppenderBenchmark.log4j1File thrpt 10 694.809 ± > 33.380 ops/ms > > FileAppenderBenchmark.log4j2File thrpt 10 1500.557 ± > 402.779 ops/ms > > FileAppenderBenchmark.logbackFile thrpt 10 1387.306 ± > 213.081 ops/ms > > > > Benchmark Mode Cnt Score > Error Units > > AsyncWithFileAppenderBenchmark.log4j1File thrpt 10 1199.591 ± > 149.157 ops/ms > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile thrpt 10 1744.558 ± > 190.732 ops/ms > > AsyncWithFileAppenderBenchmark.logbackFile thrpt 10 1718.862 ± > 142.328 ops/ms > > FileAppenderBenchmark.log4j1File thrpt 10 604.785 ± > 78.884 ops/ms > > FileAppenderBenchmark.log4j2File thrpt 10 1604.844 ± > 712.807 ops/ms > > FileAppenderBenchmark.logbackFile thrpt 10 1578.756 ± > 109.343 ops/ms > > > > 2.15.0-SNAPSHOT - enable direct encoders = true > > > > Benchmark Mode Cnt Score > Error Units > > AsyncWithFileAppenderBenchmark.log4j1File thrpt 10 1244.029 ± > 61.842 ops/ms > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile thrpt 10 1564.244 ± > 263.322 ops/ms > > AsyncWithFileAppenderBenchmark.logbackFile thrpt 10 1648.783 ± > 217.936 ops/ms > > FileAppenderBenchmark.log4j1File thrpt 10 717.024 ± > 43.631 ops/ms > > FileAppenderBenchmark.log4j2File thrpt 10 1513.402 ± > 438.311 ops/ms > > FileAppenderBenchmark.logbackFile thrpt 10 1425.745 ± > 231.201 ops/ms > > > > Benchmark Mode Cnt Score > Error Units > > AsyncWithFileAppenderBenchmark.log4j1File thrpt 10 1207.924 ± > 136.763 ops/ms > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile thrpt 10 1514.119 ± > 98.725 ops/ms > > AsyncWithFileAppenderBenchmark.logbackFile thrpt 10 1620.834 ± > 238.498 ops/ms > > FileAppenderBenchmark.log4j1File thrpt 10 716.294 ± > 65.527 ops/ms > > FileAppenderBenchmark.log4j2File thrpt 10 1523.430 ± > 387.178 ops/ms > > FileAppenderBenchmark.logbackFile thrpt 10 1402.748 ± > 312.713 ops/ms > > > > Carter 2.15.0-SNAPSHOT - enable direct encoders = false > > > > Benchmark Mode Cnt Score > Error Units > > AsyncWithFileAppenderBenchmark.log4j1File thrpt 10 1351.481 ± > 68.356 ops/ms > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile thrpt 10 2048.458 ± > 118.329 ops/ms > > AsyncWithFileAppenderBenchmark.logbackFile thrpt 10 1762.185 ± > 155.733 ops/ms > > FileAppenderBenchmark.log4j1File thrpt 10 693.222 ± > 94.051 ops/ms > > FileAppenderBenchmark.log4j2File thrpt 10 1707.576 ± > 299.891 ops/ms > > FileAppenderBenchmark.logbackFile thrpt 10 1439.624 ± > 60.957 ops/ms > > > > Benchmark Mode Cnt Score > Error Units > > AsyncWithFileAppenderBenchmark.log4j1File thrpt 10 1230.511 ± > 111.351 ops/ms > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile thrpt 10 1950.005 ± > 166.843 ops/ms > > AsyncWithFileAppenderBenchmark.logbackFile thrpt 10 1783.227 ± > 53.359 ops/ms > > FileAppenderBenchmark.log4j1File thrpt 10 702.814 ± > 38.263 ops/ms > > FileAppenderBenchmark.log4j2File thrpt 10 1738.084 ± > 344.135 ops/ms > > FileAppenderBenchmark.logbackFile thrpt 10 1463.187 ± > 158.787 ops/ms > > > > Carter 2.15.0-SNAPSHOT - enable direct encoders = true > > > > Benchmark Mode Cnt Score > Error Units > > AsyncWithFileAppenderBenchmark.log4j1File thrpt 10 1212.045 ± > 165.745 ops/ms > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile thrpt 10 1671.374 ± > 190.449 ops/ms > > AsyncWithFileAppenderBenchmark.logbackFile thrpt 10 1637.298 ± > 169.134 ops/ms > > FileAppenderBenchmark.log4j1File thrpt 10 708.106 ± > 30.035 ops/ms > > FileAppenderBenchmark.log4j2File thrpt 10 1523.321 ± > 429.586 ops/ms > > FileAppenderBenchmark.logbackFile thrpt 10 1418.236 ± > 233.527 ops/ms > > > > Benchmark Mode Cnt Score > Error Units > > AsyncWithFileAppenderBenchmark.log4j1File thrpt 10 1189.572 ± > 117.298 ops/ms > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile thrpt 10 1499.633 ± > 186.132 ops/ms > > AsyncWithFileAppenderBenchmark.logbackFile thrpt 10 1631.038 ± > 214.951 ops/ms > > FileAppenderBenchmark.log4j1File thrpt 10 697.329 ± > 48.757 ops/ms > > FileAppenderBenchmark.log4j2File thrpt 10 1542.492 ± > 395.759 ops/ms > > FileAppenderBenchmark.logbackFile thrpt 10 1321.256 ± > 148.364 ops/ms > > > > > > > On Sep 22, 2021, at 12:30 AM, Volkan Yazıcı <vol...@yazi.ci> wrote: > > > > > > That's great news! Then I will be looking forward to your signal for > > > putting a ribbon onto 2.15.0. > > > > > > I have followed the conversation with Claes Redestad from Oracle on > Twitter > > > <https://twitter.com/carter_kozak/status/1433798391604162561>. My > > > conclusion was also that there apparently is no way to make > CharsetEncoder > > > beat .toString().getBytes() in Java 9+, until a CE specialization gets > > > introduced similar to what has been done in .toString().getBytes(). > > > Nevertheless, I was also (naively?) thinking about a similar branching > > > strategy like the one you proposed as a temporary solution: preJava9 ? > > > useCE() : useStringGetBytes(). Would you be able to get this done? > > > > > > Another idea I thought of yesterday evening was to introduce our own > > > hand-written StringBuilder- or char[]-to-byte[] encoders for common > cases, > > > i.e., ASCII and UTF-8. What do you think? > > > > > > > > > On Wed, Sep 22, 2021 at 3:48 AM Carter Kozak <cko...@ckozak.net> > wrote: > > > > > >> Thanks, Volkan! > > >> > > >> Rerunning the benchmarks on my branch (specifically the PatternLayout > > >> benchmarks in log4j-perf) produced much better improvements than I had > > >> anticipated. Beyond that, the date format cache invalidation problem > > >> resulted in a substantial speedup. I agree that it would be helpful > to get > > >> a release out the door once this is merged. > > >> > > >> Re getBytes vs CharsetEncoder, I don't want to use the unsafe hack I > put > > >> together in my benchmark project, that was just for experimentation > :-) > > >> Future java releases (or changes in minor patch releases) could cause > it to > > >> fail in frightening ways. We may be better off recommending the > getBytes > > >> approach for now on some java versions (possibly by changing our > default on > > >> java 9+). > > >> Claes has a potential change[1] which appears to buy us a great deal > of > > >> performance in future Javas (assuming it is merged) and we may be > able to > > >> engage for additional encoding APIs, for example something like this > could > > >> avoid allocations and additional buffers: > > >> > > >> /** Returns the number of characters encoded to destination, or -1 if > more > > >> space is needed (equivalent to CoderResult.OVERFLOW) */ > > >> int CharSetEncoder.encode(charsequence, inputStart, inputLimit, byte[] > > >> destination, int destOffset, int destLimit) > > >> (I haven't put a great deal of thought into this API and it's getting > > >> late, so pardon any terrible ideas!) > > >> > > >> 1. https://github.com/openjdk/jdk/pull/5621 > > >> > > >> -ck > > >> > > >> On Tue, Sep 21, 2021, at 15:31, Volkan Yazıcı wrote: > > >>> First and foremost, fantastic job Carter! > > >>> > > >>> For #573, I see that Gary and Ralph have already shared some > remarks. I > > >>> would appreciate it if we can get this merged and cut the ribbon for > > >> 2.15.0 > > >>> release. > > >>> > > >>> Regarding > `StringBuilder#toString().getBytes()`-vs-`CharsetEncoder`... > > >> That > > >>> is a tough one. In your hack branch there, I am not sure if using > > >> `Unsafe` > > >>> is a future-proof path forward. I was trying to wrap my mind around > > >> Daniel > > >>> Sun's fast-reflection < > https://github.com/danielsun1106/fast-reflection> > > >>> (for the records, I couldn't) and was triggered by his ASM usage > there. I > > >>> was curious if we could do something similar via ASM to hack > > >>> `CharsetEncoder`? (I am probably talking nonsense, I hope it anyway > > >>> triggers a practical idea for you.) > > >> > > > > >