[ 
https://issues.apache.org/jira/browse/LOG4J2-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remko Popma updated LOG4J2-1274:
--------------------------------
    Description: 
*Problem*
The current Layout API does not make it easy for implementors to avoid creating 
temporary objects. Especially these methods:
{code}
byte[] toByteArray(LogEvent);
T toSerializable(LogEvent);
{code}

The byte array returned from {{toByteArray(LogEvent)}} cannot be re-used 
between log events, since the caller cannot know how many bytes a partially 
filled array contains.

In practice, all Layout implementations in Log4j 2 except SerializedLayout 
implement the {{StringLayout}} subinterface. This means that the 
{{toSerializable()}} method needs to return a new String object for every log 
event.

*Forces*
I am interested in reducing or even eliminating the allocation of temporary 
objects for text-based layouts. Many of these use (and re-use) a StringBuilder 
to build a text representation of the current log event. Once this text 
representation is built, it needs to be converted to bytes that the Appender 
can consume. I am aware of two ways in the JDK to convert text to bytes:
* the various {{String#getBytes}} methods - these all allocate a new byte array 
for each invocation
* the underlying {{java.nio.charset.CharsetEncoder}} used internally by String 
- especially method {{CoderResult encode(CharBuffer in, ByteBuffer out, boolean 
endOfInput)}} which converts characters to bytes without object allocation.

The last method is interesting because this gives us an opportunity to also 
reduce the amount of copying by directly supplying the ByteBuffer buffer used 
by RandomAccessFileAppender, or the MappedByteBuffer of the 
MemoryMappedFileAppender.

The resulting API needs to support the fact that implementations may need to 
call {{CharsetEncoder#encode}} multiple times:
* The ByteBuffer may not have enough remaining space to hold all the data; 
{{CharsetEncoder#encode}} returns {{CoderResult.OVERFLOW}} to signal this so 
the caller can consume the contents and reset/clear the buffer before 
continuing.
* The CharBuffer may not be large enough to hold the full text representation 
of the log event. Again, {{CharsetEncoder#encode}} may need to be invoked 
multiple times.

*Proposal*
(Thinking out loud here, I'm open to suggestions.)
It may be sufficient for the layout interface to have a single method where 
callers can specify the destination StringBuilder.
{code}
public interface TextLayout {
    /**
     * Creates a text representation of the specified log event
     * and writes it into the specified StringBuilder.
     * <p>
     * Implementations are free to return a new StringBuilder if they can
     * detect in advance that the specified StringBuilder is too small.
     */
    StringBuilder toText(LogEvent e, StringBuilder destination);
}
{code}

Appenders that use ByteBuffers can implement this interface:
{code}
public interface BinaryDestination {
    ByteBuffer getByteBuffer();

    /**
     * Consumes the buffer content and clear()s it.
     */
    void drain(ByteBuffer buf);
}
{code}

Then Appender code can look like this, delegating all the work to a helper:
{code}
// appender
public void append(final LogEvent event) {
    if (getLayout() instanceof TextLayout) {
        TextEncoderHelper helper = ... // get cached helper
        helper.appendWithoutAllocation(event, (TextLayout) getLayout(), 
(BinaryDestination) this);
    } else {
        // current append logic
        ...
    }
}
{code}

Helper contains re-usable code for moving the text into a CharBuffer, and for 
repeatedly calling {{CharsetEncoder#encode}}. 
{code}
public class TextEncoderHelper {
    TextEncoderHelper(Charset charset) {}    

    void appendWithoutAllocation(LogEvent event, TextLayout textLayout, 
BinaryDestination destination) {
        StringBuilder sb = textLayout.toText(event, getCachedStringBuilder());
        
        ByteBuffer byteBuf = destination.getByteBuffer();
        CharBuffer charBuf = getCachedCharBuffer();
        charBuf.reset();
        int start = 0;
        int todoChars = sb.length();
        do {
            int copied = copy(sb, start, charBuf);
            start += copied;
            todoChars -= copied;
            boolean endOfInput = todoChars <= 0;
    
            charBuf.flip();
            CodeResult result;
            do {
                result = charsetEncoder.encode(charBuf, byteBuf, endOfInput);
                if (result == CodeResult.OVERFLOW) { // byteBuf full
                    destination.drain(byteBuf); // consumes contents and clears 
the byte buffer
                }
            } while (result == CodeResult.OVERFLOW);
        } while (!endOfInput);
    }
    
    /**
     * Copies characters from the StringBuilder into the CharBuffer,
     * starting at the specified offset and ending when either all
     * characters have been copied or when the CharBuffer is full.
     *
     * @return the number of characters that were copied
     */
    int copy(StringBuilder source, int offset, CharBuffer destination) {}
}
{code}


  was:
*Problem*
The current Layout API does not make it easy for implementors to avoid creating 
temporary objects. Especially these methods:
{code}
byte[] toByteArray(LogEvent);
T toSerializable(LogEvent);
{code}

The byte array returned from {{toByteArray(LogEvent)}} cannot be re-used 
between log events, since the caller cannot know how many bytes a partially 
filled array contains.

In practice, all Layout implementations in Log4j 2 except SerializedLayout 
implement the {{StringLayout}} subinterface. This means that the 
{{toSerializable()}} method needs to return a new String object for every log 
event.

*Forces*
I am interested in reducing or even eliminating the allocation of temporary 
objects for text-based layouts. Many of these use (and re-use) a StringBuilder 
to build a text representation of the current log event. Once this text 
representation is built, it needs to be converted to bytes that the Appender 
can consume. I am aware of two ways in the JDK to convert text to bytes:
* the various {{String#getBytes}} methods - these all allocate a new byte array 
for each invocation
* the underlying {{java.nio.charset.CharsetEncoder}} used internally by String 
- especially method {{CoderResult encode(CharBuffer in, ByteBuffer out, boolean 
endOfInput)}} which converts characters to bytes without object allocation.

The last method is interesting because this gives us an opportunity to also 
reduce the amount of copying by directly supplying the ByteBuffer buffer used 
by RandomAccessFileAppender, or the MappedByteBuffer of the 
MemoryMappedFileAppender.

The resulting API needs to support the fact that implementations may need to 
call {{CharsetEncoder#encode}} multiple times:
* The ByteBuffer may not have enough remaining space to hold all the data; 
{{CharsetEncoder#encode}} returns {{CoderResult.OVERFLOW}} to signal this so 
the caller can consume the contents and reset/clear the buffer before 
continuing.
* The CharBuffer may not be large enough to hold the full text representation 
of the log event. Again, {{CharsetEncoder#encode}} may need to be invoked 
multiple times.

*Proposal*
(Thinking out loud here, I'm open to suggestions.)
It may be sufficient for the layout interface to have a single method where 
callers can specify the destination StringBuilder.
{code}
public interface TextLayout {
    /**
     * Creates a text representation of the specified log event
     * and writes it into the specified StringBuilder.
     * <p>
     * Implementations are free to return a new StringBuilder if they can
     * detect in advance that the specified StringBuilder is too small.
     */
    StringBuilder toText(LogEvent e, StringBuilder destination);
}
{code}

Appenders that use ByteBuffers can implement this interface:
{code}
public interface BinaryDestination {
    ByteBuffer getByteBuffer();
    void drain(ByteBuffer buf);
}
{code}

Then Appender code can look like this, delegating all the work to a helper:
{code}
// appender
public void append(final LogEvent event) {
    if (getLayout() instanceof TextLayout) {
        TextEncoderHelper helper = ... // get cached helper
        helper.appendWithoutAllocation(event, (TextLayout) getLayout(), 
(BinaryDestination) this);
    } else {
        // current append logic
        ...
    }
}
{code}

Helper contains re-usable code for moving the text into a CharBuffer, and for 
repeatedly calling {{CharsetEncoder#encode}}. 
{code}
public class TextEncoderHelper {
    TextEncoderHelper(Charset charset) {}    

    void appendWithoutAllocation(LogEvent event, TextLayout textLayout, 
BinaryDestination destination) {
        StringBuilder sb = textLayout.toText(event, getCachedStringBuilder());
        
        ByteBuffer byteBuf = destination.getByteBuffer();
        CharBuffer charBuf = getCachedCharBuffer();
        charBuf.reset();
        int start = 0;
        int todoChars = sb.length();
        do {
            int copied = copy(sb, start, charBuf);
            start += copied;
            todoChars -= copied;
            boolean endOfInput = todoChars <= 0;
    
            charBuf.flip();
            CodeResult result;
            do {
                result = charsetEncoder.encode(charBuf, byteBuf, endOfInput);
                if (result == CodeResult.OVERFLOW) { // byteBuf full
                    destination.drain(byteBuf); // resets the byte buffer
                }
            } while (result == CodeResult.OVERFLOW);
        } while (!endOfInput);
    }
    
    /**
     * Copies characters from the StringBuilder into the CharBuffer,
     * starting at the specified offset and ending when either all
     * characters have been copied or when the CharBuffer is full.
     *
     * @return the number of characters that were copied
     */
    int copy(StringBuilder source, int offset, CharBuffer destination) {}
}
{code}



> Layout improvements to enable avoiding temporary object allocation
> ------------------------------------------------------------------
>
>                 Key: LOG4J2-1274
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-1274
>             Project: Log4j 2
>          Issue Type: Improvement
>          Components: Layouts
>    Affects Versions: 2.5
>            Reporter: Remko Popma
>
> *Problem*
> The current Layout API does not make it easy for implementors to avoid 
> creating temporary objects. Especially these methods:
> {code}
> byte[] toByteArray(LogEvent);
> T toSerializable(LogEvent);
> {code}
> The byte array returned from {{toByteArray(LogEvent)}} cannot be re-used 
> between log events, since the caller cannot know how many bytes a partially 
> filled array contains.
> In practice, all Layout implementations in Log4j 2 except SerializedLayout 
> implement the {{StringLayout}} subinterface. This means that the 
> {{toSerializable()}} method needs to return a new String object for every log 
> event.
> *Forces*
> I am interested in reducing or even eliminating the allocation of temporary 
> objects for text-based layouts. Many of these use (and re-use) a 
> StringBuilder to build a text representation of the current log event. Once 
> this text representation is built, it needs to be converted to bytes that the 
> Appender can consume. I am aware of two ways in the JDK to convert text to 
> bytes:
> * the various {{String#getBytes}} methods - these all allocate a new byte 
> array for each invocation
> * the underlying {{java.nio.charset.CharsetEncoder}} used internally by 
> String - especially method {{CoderResult encode(CharBuffer in, ByteBuffer 
> out, boolean endOfInput)}} which converts characters to bytes without object 
> allocation.
> The last method is interesting because this gives us an opportunity to also 
> reduce the amount of copying by directly supplying the ByteBuffer buffer used 
> by RandomAccessFileAppender, or the MappedByteBuffer of the 
> MemoryMappedFileAppender.
> The resulting API needs to support the fact that implementations may need to 
> call {{CharsetEncoder#encode}} multiple times:
> * The ByteBuffer may not have enough remaining space to hold all the data; 
> {{CharsetEncoder#encode}} returns {{CoderResult.OVERFLOW}} to signal this so 
> the caller can consume the contents and reset/clear the buffer before 
> continuing.
> * The CharBuffer may not be large enough to hold the full text representation 
> of the log event. Again, {{CharsetEncoder#encode}} may need to be invoked 
> multiple times.
> *Proposal*
> (Thinking out loud here, I'm open to suggestions.)
> It may be sufficient for the layout interface to have a single method where 
> callers can specify the destination StringBuilder.
> {code}
> public interface TextLayout {
>     /**
>      * Creates a text representation of the specified log event
>      * and writes it into the specified StringBuilder.
>      * <p>
>      * Implementations are free to return a new StringBuilder if they can
>      * detect in advance that the specified StringBuilder is too small.
>      */
>     StringBuilder toText(LogEvent e, StringBuilder destination);
> }
> {code}
> Appenders that use ByteBuffers can implement this interface:
> {code}
> public interface BinaryDestination {
>     ByteBuffer getByteBuffer();
>     /**
>      * Consumes the buffer content and clear()s it.
>      */
>     void drain(ByteBuffer buf);
> }
> {code}
> Then Appender code can look like this, delegating all the work to a helper:
> {code}
> // appender
> public void append(final LogEvent event) {
>     if (getLayout() instanceof TextLayout) {
>         TextEncoderHelper helper = ... // get cached helper
>         helper.appendWithoutAllocation(event, (TextLayout) getLayout(), 
> (BinaryDestination) this);
>     } else {
>         // current append logic
>         ...
>     }
> }
> {code}
> Helper contains re-usable code for moving the text into a CharBuffer, and for 
> repeatedly calling {{CharsetEncoder#encode}}. 
> {code}
> public class TextEncoderHelper {
>     TextEncoderHelper(Charset charset) {}    
>     void appendWithoutAllocation(LogEvent event, TextLayout textLayout, 
> BinaryDestination destination) {
>         StringBuilder sb = textLayout.toText(event, getCachedStringBuilder());
>         
>         ByteBuffer byteBuf = destination.getByteBuffer();
>         CharBuffer charBuf = getCachedCharBuffer();
>         charBuf.reset();
>         int start = 0;
>         int todoChars = sb.length();
>         do {
>             int copied = copy(sb, start, charBuf);
>             start += copied;
>             todoChars -= copied;
>             boolean endOfInput = todoChars <= 0;
>     
>             charBuf.flip();
>             CodeResult result;
>             do {
>                 result = charsetEncoder.encode(charBuf, byteBuf, endOfInput);
>                 if (result == CodeResult.OVERFLOW) { // byteBuf full
>                     destination.drain(byteBuf); // consumes contents and 
> clears the byte buffer
>                 }
>             } while (result == CodeResult.OVERFLOW);
>         } while (!endOfInput);
>     }
>     
>     /**
>      * Copies characters from the StringBuilder into the CharBuffer,
>      * starting at the specified offset and ending when either all
>      * characters have been copied or when the CharBuffer is full.
>      *
>      * @return the number of characters that were copied
>      */
>     int copy(StringBuilder source, int offset, CharBuffer destination) {}
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: log4j-dev-unsubscr...@logging.apache.org
For additional commands, e-mail: log4j-dev-h...@logging.apache.org

Reply via email to