[ 
https://issues.apache.org/jira/browse/CODEC-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Herbert resolved CODEC-301.
--------------------------------
    Fix Version/s: 1.16
       Resolution: Fixed

In git master.

Commit: 6b67d6f093a821ede0b393f260b407d035289e07

> BaseNCodec: Reduce byte[] allocations by reusing buffers
> --------------------------------------------------------
>
>                 Key: CODEC-301
>                 URL: https://issues.apache.org/jira/browse/CODEC-301
>             Project: Commons Codec
>          Issue Type: Improvement
>    Affects Versions: 1.15
>            Reporter: Alex Herbert
>            Priority: Minor
>             Fix For: 1.16
>
>
> BaseNCodec encodes or decodes the input bytes into a byte[] buffer stored 
> in a Context. The buffers are constantly reallocated when the codecs are 
> used via a BaseNCodecInputStream. 
> The Context buffer is set to null to signal that no more bytes are 
> available, which forces a reallocation for the next chunk of input from the 
> stream. The underlying stream is also read using a single-use byte[] 
> allocated inside the read loop:
> {code:java}
>         while (readLen == 0) {
>             if (!baseNCodec.hasData(context)) {
>                 // *****
>                 // This should be allocated once!
>                 // *****
>                 final byte[] buf = new byte[doEncode ? 4096 : 8192];
>                 final int c = in.read(buf);
>                 if (doEncode) {
>                     baseNCodec.encode(buf, 0, c, context);
>                 } else {
>                     baseNCodec.decode(buf, 0, c, context);
>                 }
>             }
>             readLen = baseNCodec.readResults(array, offset, len, context);
>         }
> {code}
> The code can be changed to hold a single class-level buffer for reading the 
> underlying input stream. BaseNCodec can also be changed so that it does not 
> set the Context buffer to null as a signal, which allows the 
> BaseNCodecInputStream to reuse it. This requires the check for available 
> bytes to use the Context position markers instead, for example (old code 
> commented out for reference):
> {code:java}
>     int available(final Context context) {  // package protected for access from I/O streams
>         return hasData(context) ? context.pos - context.readPos : 0;
>         //return context.buffer != null ? context.pos - context.readPos : 0;
>     }
>     boolean hasData(final Context context) {  // package protected for access from I/O streams
>         return context.pos > context.readPos;
>         //return context.buffer != null;
>     }
>     int readResults(final byte[] b, final int bPos, final int bAvail, final Context context) {
>         if (hasData(context)) {
>         //if (context.buffer != null) {
>             final int len = Math.min(available(context), bAvail);
>             System.arraycopy(context.buffer, context.readPos, b, bPos, len);
>             context.readPos += len;
>             if (context.readPos >= context.pos) {
>                 // All data read.
>                 // Reset markers so hasData() will return false, and this method can return -1,
>                 // but do not set buffer to null to allow reuse.
>                 context.pos = context.readPos = 0;
>             //    context.buffer = null; // so hasData() will return false, and this method can return -1
>             }
>             return len;
>         }
>         return context.eof ? EOF : 0;
>     }
> {code}
> This change was suggested by Alexander Pinske.
> The reuse of byte buffers reduces byte[] allocations from 280MB to <4MB when 
> reading a 133MB base64 stream.
> Measured with JFR, see [https://github.com/apinske/playground-io].
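
The buffer reuse described in the issue can be illustrated with a minimal, standalone sketch. This is not the code from the commit: the names BufferReuseSketch, Context, PassThroughStream and drain are hypothetical, the buffer sizes are arbitrary, and the codec step is replaced by an identity copy so the example needs nothing beyond the JDK. It only shows the two ideas from the description: a read buffer held at the class level, and position markers (rather than a null buffer) signalling that no data is pending.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class BufferReuseSketch {

    /** Hypothetical stand-in for BaseNCodec.Context: one buffer plus two markers. */
    static final class Context {
        final byte[] buffer = new byte[8192]; // allocated once, never set to null
        int pos;      // end of valid data in buffer
        int readPos;  // start of unread data in buffer
        boolean eof;

        boolean hasData() {
            return pos > readPos;
        }

        int available() {
            return hasData() ? pos - readPos : 0;
        }

        int readResults(final byte[] b, final int bPos, final int bAvail) {
            if (hasData()) {
                final int len = Math.min(available(), bAvail);
                System.arraycopy(buffer, readPos, b, bPos, len);
                readPos += len;
                if (readPos >= pos) {
                    // Drained: reset the markers and keep the array for the next chunk.
                    pos = readPos = 0;
                }
                return len;
            }
            return eof ? -1 : 0;
        }
    }

    /** Hypothetical stream whose "codec" is an identity copy (assumption for brevity). */
    static final class PassThroughStream extends InputStream {
        private final InputStream in;
        private final byte[] buf = new byte[4096]; // class-level read buffer
        private final byte[] single = new byte[1];
        private final Context context = new Context();

        PassThroughStream(final InputStream in) {
            this.in = in;
        }

        @Override
        public int read() throws IOException {
            final int n = read(single, 0, 1);
            return n < 0 ? -1 : single[0] & 0xff;
        }

        @Override
        public int read(final byte[] array, final int offset, final int len) throws IOException {
            if (len == 0) {
                return 0;
            }
            int readLen = 0;
            while (readLen == 0) {
                if (!context.hasData()) {
                    final int c = in.read(buf); // same array on every iteration
                    if (c < 0) {
                        context.eof = true;
                    } else {
                        // Identity "encode": markers are both 0 here, so write from the start.
                        System.arraycopy(buf, 0, context.buffer, 0, c);
                        context.pos = c;
                    }
                }
                readLen = context.readResults(array, offset, len);
            }
            return readLen;
        }
    }

    /** Reads the whole input through PassThroughStream in 1000-byte chunks. */
    static byte[] drain(final byte[] input) {
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        final byte[] chunk = new byte[1000];
        try (InputStream s = new PassThroughStream(new ByteArrayInputStream(input))) {
            int n;
            while ((n = s.read(chunk, 0, chunk.length)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(final String[] args) {
        final byte[] data = new byte[10000];
        Arrays.fill(data, (byte) 42);
        System.out.println(Arrays.equals(data, drain(data)));
    }
}
```

In this sketch the only arrays allocated per stream are the two fixed buffers and the one-byte scratch array, regardless of how many chunks are read. A real codec would write a variable amount of output per input chunk, so the Context buffer would need to grow when required; that part is omitted here.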



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
