Sorry I missed replying to your question, Ralph!

> Did you add the no-op appender to Ceki’s project? Or are you using our 
> NullAppender? I have 
> my doubts about using that NullAppender as it does nothing - not even render 
> the Layout, so 
> it may get completely optimized away.

I added a no-op appender locally and ran the benchmark, and used the logback 
null appender, which likewise does not read event values. I agree that this 
test would be completely invalid using synchronous logging; however, with the 
asynchronous approach I don't believe the JVM can optimize away the 
application-thread pieces. There's definitely some ambiguity in the data -- I 
also tested using a wrapper around a file appender with a 1/1000 probability 
of writing a given event to the file appender (sketched below) and saw 
similar results, so I believe the test was at least somewhat valid.
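
The sampling wrapper was along these lines -- a minimal sketch, with an 
illustrative class name and the 1-in-1000 check spelled out; not the exact 
code I ran:

import java.util.concurrent.ThreadLocalRandom;

import org.apache.logging.log4j.core.Appender;
import org.apache.logging.log4j.core.LogEvent;
import org.apache.logging.log4j.core.appender.AbstractAppender;
import org.apache.logging.log4j.core.config.Property;

// Forwards roughly 1 in 1000 events to the real file appender so the
// layout and I/O path stay live and cannot be optimized away entirely.
final class SamplingAppender extends AbstractAppender {
    private final Appender delegate;

    SamplingAppender(String name, Appender delegate) {
        super(name, null, null, true, Property.EMPTY_ARRAY);
        this.delegate = delegate;
    }

    @Override
    public void append(LogEvent event) {
        if (ThreadLocalRandom.current().nextInt(1000) == 0) {
            delegate.append(event);
        }
    }
}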

-ck

On Thu, Aug 26, 2021, at 10:29, Carter Kozak wrote:
> I also tried that with similar results, which leads me to believe we're 
> spending too much time copying between buffers.
> 
> We've proven that log4j can get LogEvents to the appender very quickly and
> efficiently.
> 
> Once we hit PatternLayout we write data to a StringBuilder. 
> AbstractStringBuilder used to store a char array in Java 8, however with 
> JEP 254 compact strings it was updated (similarly to String) to use a 
> byte array plus a byte representing the encoding.
> We encode the StringBuilder value into a ByteBufferDestination 
> (Heap)ByteBuffer in StringBuilderEncoder using the following steps (a 
> rough sketch follows after the list):
> 1. copy the StringBuilder value into a CharBuffer (this would have been a 
> direct copy from char[] to char[] in Java 8, but requires conversion from 
> byte[]+coder now)
> 2. encode data from the CharBuffer to the StringBuilderEncoder ByteBuffer
> 3. copy the contents of the StringBuilderEncoder ByteBuffer to the 
> ByteBufferDestination ByteBuffer (which is also always a heap buffer as 
> far as I can tell)
> 4. write the ByteBufferDestination to the FileOutputStream/RandomAccessFile 
> by extracting the heap byte array.
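> 
> As a rough illustration of those four copies in plain NIO terms (a hedged 
> sketch -- names are mine, not the actual StringBuilderEncoder code):
> 
> import java.io.FileOutputStream;
> import java.io.IOException;
> import java.nio.ByteBuffer;
> import java.nio.CharBuffer;
> import java.nio.charset.CharsetEncoder;
> import java.nio.charset.StandardCharsets;
> 
> final class CopyChainSketch {
>     static void write(StringBuilder text, FileOutputStream out) throws IOException {
>         CharsetEncoder utf8 = StandardCharsets.UTF_8.newEncoder();
>         // 1. StringBuilder (byte[] + coder since JEP 254) -> char[] copy
>         CharBuffer chars = CharBuffer.allocate(text.length());
>         chars.append(text);
>         chars.flip();
>         // 2. encode the chars into the encoder-owned heap ByteBuffer
>         ByteBuffer encoded = ByteBuffer.allocate(text.length() * 4);
>         utf8.encode(chars, encoded, true);
>         encoded.flip();
>         // 3. copy into the ByteBufferDestination's heap ByteBuffer
>         ByteBuffer destination = ByteBuffer.allocate(encoded.remaining());
>         destination.put(encoded);
>         destination.flip();
>         // 4. extract the heap array and hand the bytes to the stream
>         out.write(destination.array(), destination.position(), destination.remaining());
>     }
> }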
> 
> We're copying bytes around a lot, and I think older JVMs would have 
> optimized out a bit of the copying because the types exactly matched.
> It's unclear why we need to copy through two separate ByteBuffers from 
> StringBuilderEncoder into ByteBufferDestination.
> I've generally had better I/O performance using direct byte buffers than 
> heap buffers, although that experience largely comes from work on web 
> servers, and I haven't tested whether it holds true for file I/O. If we 
> operate directly upon a FileChannel, we can write direct buffers straight 
> to files. The DirectByteBuffer memory is guaranteed to be contiguous, so 
> it should have better performance around copying and manipulation as well.
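> 
> For example, something like the following should work for writing a 
> direct buffer straight to a file (a hedged sketch, not an existing log4j2 
> code path):
> 
> import java.io.IOException;
> import java.nio.ByteBuffer;
> import java.nio.channels.FileChannel;
> import java.nio.charset.StandardCharsets;
> import java.nio.file.Paths;
> import java.nio.file.StandardOpenOption;
> 
> final class DirectBufferWriteSketch {
>     public static void main(String[] args) throws IOException {
>         // Reusable direct buffer, sized like a typical appender buffer.
>         ByteBuffer direct = ByteBuffer.allocateDirect(256 * 1024);
>         try (FileChannel channel = FileChannel.open(Paths.get("app.log"),
>                 StandardOpenOption.CREATE,
>                 StandardOpenOption.WRITE,
>                 StandardOpenOption.APPEND)) {
>             direct.put("hello world\n".getBytes(StandardCharsets.UTF_8));
>             direct.flip();
>             // The channel drains the direct buffer without an extra heap copy.
>             while (direct.hasRemaining()) {
>                 channel.write(direct);
>             }
>             direct.clear();
>         }
>     }
> }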
> 
> -ck
> 
> On Thu, Aug 26, 2021, at 10:00, Volkan Yazıcı wrote:
> > Ralph, maybe a dumb idea, but... can't we simply write to /dev/null to
> > avoid the hardware's influence on the test results?
> > 
> > On Wed, Aug 25, 2021 at 8:05 PM Ralph Goers <[email protected]>
> > wrote:
> > 
> > > Did you add the no-op appender to Ceki’s project? Or are you using our
> > > NullAppender? I have
> > > my doubts about using that NullAppender as it does nothing - not even
> > > render the Layout, so
> > > it may get completely optimized away.
> > >
> > > I’d still like to know what kind of disks Remko did his original testing
> > > on. The page says he got 18.495 million messages per second. Each message
> > > would be about 130 characters long from what I can tell. That means the
> > > destination has to be able to support at least about 2.4 GB per second
> > > (18.495 million msgs/sec × ~130 bytes ≈ 2.4 GB/sec).
> > >
> > > In Ceki’s test the message is only 23 characters long (vs 100 in
> > > Remko’s). That means the formatted log line is only going to be about 60
> > > characters long. For me the async test is writing at about 1700 ops/ms,
> > > which equates to roughly 102 MB/sec (1,700,000 events/sec × ~60 bytes),
> > > yet I am still seeing the queue backed up. So how we are writing must be
> > > terribly inefficient. Logback is getting about 2000 ops/ms, which is
> > > still only about 120 MB/sec.
> > >
> > > Your test below would be generating about 197 MB/sec.
> > >
> > > One thing I did notice that made a big difference is that Ceki has Logback
> > > configured to
> > > use a buffer size of 256K while Log4j2 uses the default of 8K. Setting
> > > Log4j2 to 256K
> > > considerably improved the performance.
> > >
> > >
> > > Ralph
> > >
> > > > On Aug 25, 2021, at 8:42 AM, Carter Kozak <[email protected]> wrote:
> > > >
> > > >> If we set the appender performance aside, am I right to conclude that
> > > >> the entire complex async framework built atop Disruptor, with all its
> > > >> whistles and bells, is yielding a diminishing performance gain compared
> > > >> to a simple off-the-shelf blocking queue?
> > > >
> > > > I haven't seen any data that would give me such an impression -- the
> > > > file appender itself is slower than the thread producing events, which
> > > > appears to limit our throughput. Even then, log4j2 does much better
> > > > than logback in my testing on a Linux workstation with 20+ producer
> > > > threads.
> > > > Any time the queue is constantly full, performance will suffer (this
> > > > applies to standard blocking queues as well as the disruptor; however,
> > > > the disruptor will fare worse when the queue is full).
> > > > The results using a single thread are roughly reproducible on the
> > > > non-async test, so I think the problem is somewhere between the layout
> > > > and the appender/manager.
> > > >
> > > > Results running locally with no-op appender implementations:
> > > >
> > > > 1 thread:
> > > >
> > > > Benchmark                                        Mode  Cnt     Score     Error   Units
> > > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt    4  3285.743 ±  427.963  ops/ms
> > > > AsyncWithFileAppenderBenchmark.logbackFile      thrpt    4  2383.532 ±  136.034  ops/ms
> > > >
> > > > 20 threads:
> > > >
> > > > Benchmark                                        Mode  Cnt     Score     Error   Units
> > > > AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt    4  1177.746 ±  919.560  ops/ms
> > > > AsyncWithFileAppenderBenchmark.logbackFile      thrpt    4   602.614 ±   47.485  ops/ms
> > >
> > >
> > >
> > 
> 
