[
https://issues.apache.org/jira/browse/LOG4J2-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142933#comment-14142933
]
Remko Popma commented on LOG4J2-506:
------------------------------------
I am more and more leaning towards using java.io.DataOutput rather than a
custom interface.
# The contract is not really an issue since we can simply follow the
java.io.DataOutput encoding contract in our implementation. (I slightly prefer
the ByteBuffer encoding for NaN, but I can live with the DataOutput encoding.)
Performance impact should be zero. The way ByteBuffer converts endian-ness
(Integer/Long.reverseBytes()) gets optimized by the JIT to an intrinsic which
is blazingly fast.
# DataOutput is widely used in the JDK and in vendor products, which makes it
easier for people to adopt BinaryMessage.
> Binary logging interface
> ------------------------
>
> Key: LOG4J2-506
> URL: https://issues.apache.org/jira/browse/LOG4J2-506
> Project: Log4j 2
> Issue Type: New Feature
> Components: Core, Layouts
> Reporter: Gregg Donovan
> Assignee: Remko Popma
> Priority: Minor
> Labels: binary
>
> From
> [discussion|https://groups.google.com/forum/#!topic/mechanical-sympathy/7HdRCacJCDk]
> with Remko Pompa on the mechanical-sympathy list:
> {quote}We -- Etsy.com -- have a request logging system that we've begun to
> outgrow due to increased log volume and increased size of the messages we
> need to log. The messages are Thrift-encoded search requests from 1-100K in
> size. We use the logs to feed our homegrown traffic replay tools for offline
> performance tests and to reproduce issues found in production (GC, perf,
> etc.).
> For our next gen request logging system we would like the following:
> Low contention -- Minimal lock contention despite log writes from many (10s
> to 100s) application threads.
> Async writes -- File I/O should be outside of the actual request path.
> Binary logging -- Our current system base64 encodes the payload as a String
> and logs it via log4j 1.2. ;-) This creates more garbage than we would like
> and makes the data written twice as large. It was good enough for a while,
> but now we need something more efficient...
> Compression -- Either per-message or per block of messages would be great.
> Incurring large periodic CPU penalties from logrotate + gzip, as we do now,
> is less than ideal.
> Rotation -- Ideally based on file size, but time-based would also do.
> {quote}
> Remko:
> {quote}
> You can do this with log4j-2.0. The current version (log4j-2.0-beta-9) out of
> the box gives you 3 out of 5: Async Loggers use the Disruptor under the hood
> -> low contention & async writes, and RollingRandomAccessFileAppender (the
> fastest appender) can do rotation. This appender can also optionally do
> compression in a separate thread, but this happens on roll-over only, not
> continuously. (Still, I would guess that this is more CPU-efficient than
> using logrotate + gzip in a separate process...)
> The binary logging can be achieved with a bit of custom code.
> First, your objects would need to provide some way to turn themselves into
> bytes. I doubt that you want to use java standard serialization :-), so how
> about this as an interface to log4j that your objects would implement:
> {quote}
> {code:java}
> public interface BinaryMessage extends
> org.apache.logging.log4j.message.Message {
> int getLength(); // return byte count
> void writeInto(ByteBuffer buffer); // write length bytes from the current
> buffer position
> }
> {code}
> {quote}
> Log4j2 Loggers already have methods that take a Message argument (instead of
> a String), so in your application code you could just write
> {quote}
> {code:java}
> Logger logger = LogManager.getLogger(MyClass.class); // usually cached in
> static field
> logger.info(myObject); // where myObject implements BinaryMessage
> {code}
> {quote}
> Second, in the log4j2 framework, the Layout is responsible for converting a
> LogEvent into bytes, so you need a BinaryLayout that recognizes
> BinaryMessages. The layout could optionally output the timestamp (long) and
> log level (byte), followed by the message size (int) and the message bytes
> (byte[]).
> Using the log4j2 plugin model, an implementation could look something like
> below.
> {quote}
> {code:java}
> @Plugin(name = "BinaryLayout", category = "Core", elementType = "layout",
> printObject = true)
> public final class BinaryLayout extends AbstractLayout<LogEvent> {
> private static enum TimestampStrategy {
> WRITE {
> public void write(LogEvent event, ByteBuffer buffer) {
> buffer.putLong(event.getMillis());
> }
> },
> DONT {
> public void write(LogEvent event, ByteBuffer buffer) {
> }
> };
> abstract void write(LogEvent event, ByteBuffer buffer);
> }
> private static enum LevelStrategy {
> WRITE {
> public void write(LogEvent event, ByteBuffer buffer) {
> buffer.put((byte) event.getLevel().intLevel());
> }
> },
> DONT {
> public void write(LogEvent event, ByteBuffer buffer) {
> }
> };
> abstract void write(LogEvent event, ByteBuffer buffer);
> }
> private final TimestampStrategy timestamp;
> private final LevelStrategy level;
> private final int prefixLength;
> public BinaryLayout(boolean timestamp, boolean level) {
> this.timestamp = timestamp ? TimestampStrategy.WRITE :
> TimestampStrategy.DONT;
> this.level = level ? LevelStrategy.WRITE : LevelStrategy.DONT;
> final int stampBytes = timestamp ? 8 : 0;
> final int levelBytes = level ? 1 : 0;
> prefixLength = 4 + stampBytes + levelBytes;
> }
> @Override
> public byte[] toByteArray(LogEvent event) {
> final BinaryMessage msg = (BinaryMessage) event.getMessage();
> final byte[] result = new byte[msg.getLength() + prefixLength];
> final ByteBuffer buffer = ByteBuffer.wrap(result);
> timestamp.write(event, buffer);
> level.write(event, buffer);
> buffer.putInt(msg.getLength());
> msg.writeInto(buffer);
> return result;
> }
> @Override
> public LogEvent toSerializable(LogEvent event) {
> return event;
> }
> @Override
> public String getContentType() {
> return "application/octet-stream";
> }
> @Override
> public Map<String, String> getContentFormat() {
> return new HashMap<String, String>();
> }
> @PluginFactory
> public static BinaryLayout createLayout(
> @PluginAttribute("writeTimestamp") final String writeTimestamp,
> @PluginAttribute("writeLevel") final String writeLevel) {
> final boolean timestamp = Booleans.parseBoolean(writeTimestamp,
> false);
> final boolean level = Booleans.parseBoolean(writeLevel, false);
> return new BinaryLayout(timestamp, level);
> }
> }
> {code}
> {quote}
> Unfortunately the log4j API does not allow us to be completely garbage-free,
> you still need to allocate the byte[] array and ByteBuffer for each event...
> (The LogEvents are reused Disruptor ring buffer slots if you make all loggers
> AsyncLoggers.)
> Still, this should be a lot better (garbage-wise) than base64-encoding a
> String, and then turning that String into bytes again...
> Here is the log4j2.xml configuration that would put it all together (modify
> the "packages" attribute at the top to match your package):
> (Don't forget to set system property
> -DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
> to make all loggers AsyncLoggers.)
> {quote}
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <Configuration status="WARN" packages="package.of.binary.layout.class">
> <Appenders>
> <RollingRandomAccessFile name="rraf" fileName="binary.log"
>
> filePattern="logs/$${date:yyyy-MM}/app-%d{MM-dd-yyyy}-%i.log.gz"
> append="false" immediateFlush="false">
> <BinaryLayout writeTimestamp="true" writeLevel="true" />
> <Policies>
> <TimeBasedTriggeringPolicy />
> <SizeBasedTriggeringPolicy size="250 MB"/>
> </Policies>
> </RollingRandomAccessFile>
> </Appenders>
>
> <Loggers>
> <Root level="trace" includeLocation="false">
> <AppenderRef ref="rraf"/>
> </Root>
> </Loggers>
> </Configuration>
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]