Gentlemen, what is the status of the progress on this front? Is it in a
"good enough" state for the 2.15.0 release?

On Thu, Sep 23, 2021 at 6:29 PM Carter Kozak <cko...@ckozak.net> wrote:

> Thanks, Ralph!
>
> You're correct that the direct encoderes are required for GC-free logging,
> some of our consumers rely on that
> for niche (low latency) workloads where it's important to understand
> precisely where a pause may occur.
> GC performance has become much better over the last several years, and the
> throughput of the allocation-
> heavy workflow has increased significantly, however there are a few things
> we should keep in mind:
>
> 1. Java 8 performs better with direct encoders (although afaict the
> performance isn't any better than java 9+,
>     string.getBytes just has several new optimizations
> 2. If this[1] change is merged into openjdk, the direct encoders may once
> again be the best-performing option,
>     I'm planning to put together a local build for comparison later.
> 3. My testing has primarily targetted UTF-8 as it's the only charset we
> target at work, and the default charset
>     on most modern servers. The jdk has a great deal of optimization which
> special cases UTF-8 for this reason,
>     especially from string.getBytes. Other charsets are likely to perform
> differently.
>
> I'm in favor of changing the default value of "direct.encoders" based on
> the java version, for example
> Java 8 should continue to use them, while versions after compact-strings
> was introduced are better
> off without direct encoders until the jdk can be improved using something
> like the linked PR.
>
> 1. https://github.com/openjdk/jdk/pull/5621#issuecomment-925413767
>
> -ck
>
> On Thu, Sep 23, 2021, at 11:51, Ralph Goers wrote:
> > I ran Ceki’s benchmarks against 2.14.1, 2.15.0-SNAPSHOT and Carter’s
> branch. I ran them with 16 threads on my
> > MacBook Pro and ran each test twice since the tests show some
> variability from time to time.
> >
> > I can draw 2 conclusions from this.
> > 1. The improvements Carter has made to 2.15.0 have already made a
> noticeable improvement.
> > 2. His new changes show a definite improvement on top of the 2.15.0
> improvements when enable direct encoders is false.
> >
> > These results definitely make me wonder if we should just remove the
> direct encoders logic at least in master. From what
> > I can tell it makes things complicated and makes no noticeable impact on
> performance in 2.15.0 even before Carter’s
> > changes. Are they required for gc-free? Can we make direct.encoders =
> false the default if it isn’t now?
> >
> > I apologize if the performance results don’t look good in your email.
> They are formatted with a fixed-width font but it is
> > likely they will be messed up once they reach your mail client.
> >
> > Ralph
> >
> > 2.14.1 - enable direct encoders = false
> >
> > Benchmark                                        Mode  Cnt    Score
> Error  Units
> > AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1221.995 ±
> 150.876  ops/ms
> > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1495.072 ±
> 41.715  ops/ms
> > AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1529.807 ±
> 55.876  ops/ms
> > FileAppenderBenchmark.log4j1File                thrpt  10   681.690 ±
> 17.631  ops/ms
> > FileAppenderBenchmark.log4j2File                thrpt  10  1134.543 ±
> 35.065  ops/ms
> > FileAppenderBenchmark.logbackFile               thrpt  10  1546.811 ±
> 95.721  ops/ms
> >
> > Benchmark                                        Mode  Cnt    Score
> Error  Units
> > AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1103.526 ±
> 78.201  ops/ms
> > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1402.669 ±
> 144.374  ops/ms
> > AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1475.666 ±
> 87.885  ops/ms
> > FileAppenderBenchmark.log4j1File                thrpt  10   582.841 ±
> 57.156  ops/ms
> > FileAppenderBenchmark.log4j2File                thrpt  10   860.587 ±
> 185.170  ops/ms
> > FileAppenderBenchmark.logbackFile               thrpt  10  1560.206 ±
> 323.229  ops/ms
> >
> > 2.14.1 - enable direct encoders = true
> >
> > Benchmark                                        Mode  Cnt    Score
> Error  Units
> > AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1250.808 ±
> 116.848  ops/ms
> > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1564.926 ±
> 146.195  ops/ms
> > AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1703.731 ±
> 183.304  ops/ms
> > FileAppenderBenchmark.log4j1File                thrpt  10  736.189 ±
>  40.798  ops/ms
> > FileAppenderBenchmark.log4j2File                thrpt  10  1070.958 ±
> 97.173  ops/ms
> > FileAppenderBenchmark.logbackFile               thrpt  10  1350.507 ±
> 225.104  ops/ms
> >
> > Benchmark                                        Mode  Cnt    Score
> Error  Units
> > AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1218.639 ±
> 141.343  ops/ms
> > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1525.081 ±
> 112.058  ops/ms
> > AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1724.925 ±
> 169.680  ops/ms
> > FileAppenderBenchmark.log4j1File                thrpt  10   680.453 ±
> 32.330  ops/ms
> > FileAppenderBenchmark.log4j2File                thrpt  10  1054.517 ±
> 52.147  ops/ms
> > FileAppenderBenchmark.logbackFile               thrpt  10  1353.180 ±
> 224.415  ops/ms
> >
> > 2.15.0-SNAPSHOT - enable direct encoders = false
> >
> > Benchmark                                        Mode  Cnt    Score
> Error  Units
> > AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1121.838 ±
> 112.026  ops/ms
> > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1606.602 ±
> 217.776  ops/ms
> > AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1607.875 ±
> 226.656  ops/ms
> > FileAppenderBenchmark.log4j1File                thrpt  10  694.809 ±
> 33.380   ops/ms
> > FileAppenderBenchmark.log4j2File                thrpt  10  1500.557 ±
> 402.779  ops/ms
> > FileAppenderBenchmark.logbackFile               thrpt  10  1387.306 ±
> 213.081  ops/ms
> >
> > Benchmark                                        Mode  Cnt    Score
> Error  Units
> > AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1199.591 ±
> 149.157 ops/ms
> > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1744.558 ±
> 190.732 ops/ms
> > AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1718.862 ±
> 142.328 ops/ms
> > FileAppenderBenchmark.log4j1File                thrpt  10   604.785 ±
> 78.884  ops/ms
> > FileAppenderBenchmark.log4j2File                thrpt  10  1604.844 ±
> 712.807 ops/ms
> > FileAppenderBenchmark.logbackFile               thrpt  10  1578.756 ±
> 109.343 ops/ms
> >
> > 2.15.0-SNAPSHOT - enable direct encoders = true
> >
> > Benchmark                                        Mode  Cnt    Score
> Error  Units
> > AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1244.029 ±
> 61.842  ops/ms
> > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1564.244 ±
> 263.322  ops/ms
> > AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1648.783 ±
> 217.936  ops/ms
> > FileAppenderBenchmark.log4j1File                thrpt  10  717.024 ±
> 43.631   ops/ms
> > FileAppenderBenchmark.log4j2File                thrpt  10  1513.402 ±
> 438.311  ops/ms
> > FileAppenderBenchmark.logbackFile               thrpt  10  1425.745 ±
> 231.201  ops/ms
> >
> > Benchmark                                        Mode  Cnt    Score
> Error  Units
> > AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1207.924 ±
> 136.763  ops/ms
> > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1514.119 ±
> 98.725  ops/ms
> > AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1620.834 ±
> 238.498  ops/ms
> > FileAppenderBenchmark.log4j1File                thrpt  10   716.294 ±
> 65.527  ops/ms
> > FileAppenderBenchmark.log4j2File                thrpt  10  1523.430 ±
> 387.178  ops/ms
> > FileAppenderBenchmark.logbackFile               thrpt  10  1402.748 ±
> 312.713  ops/ms
> >
> > Carter 2.15.0-SNAPSHOT - enable direct encoders = false
> >
> > Benchmark                                        Mode  Cnt    Score
> Error  Units
> > AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1351.481 ±
> 68.356  ops/ms
> > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  2048.458 ±
> 118.329  ops/ms
> > AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1762.185 ±
> 155.733  ops/ms
> > FileAppenderBenchmark.log4j1File                thrpt  10   693.222 ±
> 94.051  ops/ms
> > FileAppenderBenchmark.log4j2File                thrpt  10  1707.576 ±
> 299.891  ops/ms
> > FileAppenderBenchmark.logbackFile               thrpt  10  1439.624 ±
> 60.957  ops/ms
> >
> > Benchmark                                        Mode  Cnt    Score
> Error  Units
> > AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1230.511 ±
> 111.351  ops/ms
> > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1950.005 ±
> 166.843  ops/ms
> > AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1783.227 ±
> 53.359  ops/ms
> > FileAppenderBenchmark.log4j1File                thrpt  10   702.814 ±
> 38.263  ops/ms
> > FileAppenderBenchmark.log4j2File                thrpt  10  1738.084 ±
> 344.135  ops/ms
> > FileAppenderBenchmark.logbackFile               thrpt  10  1463.187 ±
> 158.787  ops/ms
> >
> > Carter 2.15.0-SNAPSHOT - enable direct encoders = true
> >
> > Benchmark                                        Mode  Cnt    Score
> Error  Units
> > AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1212.045 ±
> 165.745  ops/ms
> > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1671.374 ±
> 190.449  ops/ms
> > AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1637.298 ±
> 169.134  ops/ms
> > FileAppenderBenchmark.log4j1File                thrpt  10   708.106 ±
> 30.035  ops/ms
> > FileAppenderBenchmark.log4j2File                thrpt  10  1523.321 ±
> 429.586  ops/ms
> > FileAppenderBenchmark.logbackFile               thrpt  10  1418.236 ±
> 233.527  ops/ms
> >
> > Benchmark                                        Mode  Cnt    Score
> Error  Units
> > AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1189.572 ±
> 117.298  ops/ms
> > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1499.633 ±
> 186.132  ops/ms
> > AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1631.038 ±
> 214.951  ops/ms
> > FileAppenderBenchmark.log4j1File                thrpt  10   697.329 ±
> 48.757  ops/ms
> > FileAppenderBenchmark.log4j2File                thrpt  10  1542.492 ±
> 395.759  ops/ms
> > FileAppenderBenchmark.logbackFile               thrpt  10  1321.256 ±
> 148.364  ops/ms
> >
> >
> > > On Sep 22, 2021, at 12:30 AM, Volkan Yazıcı <vol...@yazi.ci> wrote:
> > >
> > > That's great news! Then I will be looking forward to your signal for
> > > putting a ribbon onto 2.15.0.
> > >
> > > I have followed the conversation with Claes Redestad from Oracle on
> Twitter
> > > <https://twitter.com/carter_kozak/status/1433798391604162561>. My
> > > conclusion was also that there apparently is no way to make
> CharsetEncoder
> > > beat .toString().getBytes() in Java 9+, until a CE specialization gets
> > > introduced similar to what has been done in .toString().getBytes().
> > > Nevertheless, I was also (naively?) thinking about a similar branching
> > > strategy like the one you proposed as a temporary solution: preJava9 ?
> > > useCE() : useStringGetBytes(). Would you be able to get this done?
> > >
> > > Another idea I thought of yesterday evening was to introduce our own
> > > hand-written StringBuilder- or char[]-to-byte[] encoders for common
> cases,
> > > i.e., ASCII and UTF-8. What do you think?
> > >
> > >
> > > On Wed, Sep 22, 2021 at 3:48 AM Carter Kozak <cko...@ckozak.net>
> wrote:
> > >
> > >> Thanks, Volkan!
> > >>
> > >> Rerunning the benchmarks on my branch (specifically the PatternLayout
> > >> benchmarks in log4j-perf) produced much better improvements than I had
> > >> anticipated. Beyond that, the date format cache invalidation problem
> > >> resulted in a substantial speedup. I agree that it would be helpful
> to get
> > >> a release out the door once this is merged.
> > >>
> > >> Re getBytes vs CharsetEncoder, I don't want to use the unsafe hack I
> put
> > >> together in my benchmark project, that was just for experimentation
> :-)
> > >> Future java releases (or changes in minor patch releases) could cause
> it to
> > >> fail in frightening ways. We may be better off recommending the
> getBytes
> > >> approach for now on some java versions (possibly by changing our
> default on
> > >> java 9+).
> > >> Claes has a potential change[1] which appears to buy us a great deal
> of
> > >> performance in future Javas (assuming it is merged) and we may be
> able to
> > >> engage for additional encoding APIs, for example something like this
> could
> > >> avoid allocations and additional buffers:
> > >>
> > >> /** Returns the number of characters encoded to destination, or -1 if
> more
> > >> space is needed (equivalent to CoderResult.OVERFLOW) */
> > >> int CharSetEncoder.encode(charsequence, inputStart, inputLimit, byte[]
> > >> destination, int destOffset, int destLimit)
> > >> (I haven't put a great deal of thought into this API and it's getting
> > >> late, so pardon any terrible ideas!)
> > >>
> > >> 1. https://github.com/openjdk/jdk/pull/5621
> > >>
> > >> -ck
> > >>
> > >> On Tue, Sep 21, 2021, at 15:31, Volkan Yazıcı wrote:
> > >>> First and foremost, fantastic job Carter!
> > >>>
> > >>> For #573, I see that Gary and Ralph have already shared some
> remarks. I
> > >>> would appreciate it if we can get this merged and cut the ribbon for
> > >> 2.15.0
> > >>> release.
> > >>>
> > >>> Regarding
> `StringBuilder#toString().getBytes()`-vs-`CharsetEncoder`...
> > >> That
> > >>> is a tough one. In your hack branch there, I am not sure if using
> > >> `Unsafe`
> > >>> is a future-proof path forward. I was trying to wrap my mind around
> > >> Daniel
> > >>> Sun's fast-reflection <
> https://github.com/danielsun1106/fast-reflection>
> > >>> (for the records, I couldn't) and was triggered by his ASM usage
> there. I
> > >>> was curious if we could do something similar via ASM to hack
> > >>> `CharsetEncoder`? (I am probably talking nonsense, I hope it anyway
> > >>> triggers a practical idea for you.)
> > >>
> >
> >
>

Reply via email to