[ https://issues.apache.org/jira/browse/LOG4J2-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963955#comment-15963955 ]
Roman Leventov commented on LOG4J2-928:
---------------------------------------

I've implemented a mostly lock-free MemoryMappedFileManager. The change is only local for now, because it depends on LOG4J2-1874. You can assign this issue to me.

Comments on some questions raised above:

> Needs the Unsafe. java.nio.ByteBuffer only provides relative, not absolute
> bulk put operations.

Unsafe is not needed; however, without it the data has to be copied from the local buffer to the MappedByteBuffer in a presumably slow byte-by-byte manner:

{code}
private void copyToMappedBuffer(final byte[] bytes, int offset, int length, int bufferOffset) {
    MappedByteBuffer mappedBuffer = this.mappedBuffer;
    // We cannot use sun.misc.Unsafe, and we do not want to wrap bytes in a ByteBuffer
    // or duplicate mappedBuffer (either would create garbage), so copying byte by byte
    // with the absolute put(int, byte) is the only option left.
    for (int i = 0; i < length; i++) {
        mappedBuffer.put(bufferOffset + i, bytes[offset + i]);
    }
}
{code}

> TBD: How many files/buffers to use? ...

I agree that it is not needed, since Log4j is responsible only for the appending logic. The change doesn't introduce background threads and preserves the current MemoryMappedFileManager's 32-MB buffer rotation logic. The actual rotation (and unmapping of the previous buffer) is done exclusively by one of the regular writer threads, which "wins" this right via a compare-and-swap update of a variable.

> Lock-free synchronous sub-microsecond appender
> ----------------------------------------------
>
>                 Key: LOG4J2-928
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-928
>             Project: Log4j 2
>          Issue Type: New Feature
>          Components: Appenders
>            Reporter: Remko Popma
>
> _(This is a work in progress.)_
> *Goal*
> It should be possible to create synchronous file appenders with (nearly) the
> same performance as async loggers.
> *Background*
> The key to the async loggers performance is the lock-free queue provided by
> the LMAX Disruptor.
> Multiple threads can add events to this queue without
> contending on a lock. This means throughput scales linearly with the number
> of threads: more threads = more throughput.
> With a lock-based design, on the other hand, performance maxes out with a
> single thread logging. Adding more threads does not help. In fact, total
> logging throughput goes down slightly when multiple threads are logging (see
> the Async Appenders in the [async performance
> comparison|http://logging.apache.org/log4j/2.x/manual/async.html#Asynchronous_Throughput_Comparison_with_Other_Logging_Packages]).
> Lock contention means that multiple threads together end up logging slower
> than a single thread.
> *Currently only async loggers are lock-free*
> Log4j2 provides good performance with async loggers, but this approach has
> several drawbacks:
> * dependency on external LMAX disruptor library
> * possibility of data loss: log events that have been put on the queue but
> not flushed to disk yet may be lost in the event of an application crash
> This ticket proposes a new feature to address these issues.
> *Proposal: a lock-free synchronous appender*
> For a single-threaded application the current MemoryMappedFileAppender has
> performance comparable to Async Loggers (TODO: perf test).
> However, the current implementation uses locks to control concurrency, and
> suffers from lock contention in multi-threaded scenarios.
> For inspiration for a lock-free solution, we can look at
> [Aeron|https://github.com/real-logic/Aeron], specifically Aeron's design for
> Log Buffers. Martin Thompson's September 2014 Strangeloop
> [presentation|https://www.youtube.com/watch?v=tM4YskS94b0] gives details on
> the design (especially the section 16:45-23:30 is relevant).
> The way this works, is that instead of using locks, concurrency is handled
> with a protocol where threads "reserve" blocks of memory atomically. Each
> thread (having serialized the log event) knows how many bytes it wants to
> write.
> It then atomically moves the buffer tail pointer by that many bytes
> using a CAS operation. After the tail has been moved, the thread is free to
> write the message payload bytes to the area of the buffer that it just
> reserved, without needing to worry about other threads. Between threads, the
> only point of contention is the tail pointer, which is similar to the
> disruptor. We can reasonably expect performance to scale linearly with the
> number of threads, like async loggers.
> *Still needs work*
> This looks promising, but there are a few snags.
> # Needs the Unsafe. {{java.nio.ByteBuffer}} only provides relative, not
> absolute bulk put operations. That is, it only allows appending byte arrays
> at the current cursor location, not at some user-specified absolute location.
> The above design requires random access to be thread-safe. Aeron works around
> this by using {{sun.misc.Unsafe}}. Users should be aware of this so they can
> decide on whether the performance gain is worth the risk. Also, this may make
> the OSGi folks unhappy (see LOG4J2-238 discussion)... Not sure how serious we
> are about making Log4j2 work on OSGi, but perhaps it is possible to mark the
> package for this feature as optional in the OSGi manifest. An alternative may
> be to put this appender in a separate module.
> # TBD: How many files/buffers to use? In his presentation Martin mentions
> that using a single large memory mapped file will cause a lot of page faults,
> page cache churn, and unspecified VM issues. He recommends cycling between
> three smaller buffers, one active (currently written to), one dirty (full,
> now being processed by a background thread) and one clean (to swap in when
> the active buffer becomes full). I am not sure if the page fault problem will
> occur for our use case: a Log4j appender is append-only, and there is no
> separate thread or process reading this data at the same time.
> If it does,
> and we decide on a similar design with three smaller buffers, we still need
> to work out if these can be three different mapped regions in the same log
> file, or if it is better to use a separate temporary file and copy from the
> temporary file to the target log file in a background thread. I would prefer
> to have a single file. Note that even with a single file we may want a
> background thread for mapping a new region at every swap and occasionally
> extending the file when the EOF is reached.
> Feedback welcome. I intend to update this ticket as I learn more.
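
For readers following the thread, the reservation protocol from the issue description (each writer advances the tail pointer with a CAS, then copies its bytes into the region it reserved using absolute puts) could look roughly like the sketch below. This is only an illustration, not the code from the pending change: the class name TailReservingBuffer and its methods are hypothetical, a heap ByteBuffer stands in for the MappedByteBuffer, and buffer rotation is left out (a full buffer simply rejects the write).

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of CAS-based region reservation; not Log4j code.
class TailReservingBuffer {
    // A heap buffer stands in for the MappedByteBuffer used by the real manager.
    private final ByteBuffer buffer;
    // Offset of the first unreserved byte; the only state writers contend on.
    private final AtomicInteger tail = new AtomicInteger();

    TailReservingBuffer(int capacity) {
        this.buffer = ByteBuffer.allocate(capacity);
    }

    // Reserves length bytes and returns the start offset of the reserved
    // region, or -1 if the request is negative or does not fit.
    int reserve(int length) {
        while (true) {
            int current = tail.get();
            if (length < 0 || length > buffer.capacity() - current) {
                return -1; // a real appender would rotate to a new region here
            }
            if (tail.compareAndSet(current, current + length)) {
                return current; // this thread now owns [current, current + length)
            }
            // Another thread moved the tail first; retry with the new value.
        }
    }

    // Reserves space for bytes and copies them in with absolute puts, which
    // never touch the buffer's shared position. Returns false if full.
    boolean append(byte[] bytes) {
        int start = reserve(bytes.length);
        if (start < 0) {
            return false;
        }
        for (int i = 0; i < bytes.length; i++) {
            buffer.put(start + i, bytes[i]);
        }
        return true;
    }

    int tail() {
        return tail.get();
    }

    byte get(int index) {
        return buffer.get(index);
    }

    public static void main(String[] args) throws InterruptedException {
        TailReservingBuffer log = new TailReservingBuffer(4096);
        Thread[] writers = new Thread[4];
        for (int t = 0; t < writers.length; t++) {
            final byte id = (byte) ('A' + t);
            writers[t] = new Thread(() -> {
                byte[] msg = {id, id, id, id};
                for (int i = 0; i < 100; i++) {
                    log.append(msg);
                }
            });
            writers[t].start();
        }
        for (Thread w : writers) {
            w.join();
        }
        System.out.println("bytes reserved: " + log.tail());
    }
}
```

Because each writer only writes inside the range it reserved, the writers never overlap and the tail pointer is the single point of contention. The sketch relies on absolute puts to disjoint offsets not interfering with each other, which is the same assumption the byte-by-byte copy in the comment above makes.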