[
https://issues.apache.org/jira/browse/CODEC-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Herbert resolved CODEC-301.
--------------------------------
Fix Version/s: 1.16
Resolution: Fixed
In git master.
Commit: 6b67d6f093a821ede0b393f260b407d035289e07
> BaseNCodec: Reduce byte[] allocations by reusing buffers
> --------------------------------------------------------
>
> Key: CODEC-301
> URL: https://issues.apache.org/jira/browse/CODEC-301
> Project: Commons Codec
> Issue Type: Improvement
> Affects Versions: 1.15
> Reporter: Alex Herbert
> Priority: Minor
> Fix For: 1.16
>
>
> BaseNCodec will encode or decode the input bytes into a byte[] buffer stored
> in a Context. The buffers are constantly reallocated when using the codecs
> via a BaseNCodecInputStream.
> The Context buffer is set to null to signal that no more bytes are available,
> which forces a reallocation for the next chunk of input from the stream. The
> underlying stream is also read using a single-use byte[] allocated inside the
> read loop:
> {code:java}
> while (readLen == 0) {
>     if (!baseNCodec.hasData(context)) {
>         // *****
>         // This should be allocated once!
>         // *****
>         final byte[] buf = new byte[doEncode ? 4096 : 8192];
>         final int c = in.read(buf);
>         if (doEncode) {
>             baseNCodec.encode(buf, 0, c, context);
>         } else {
>             baseNCodec.decode(buf, 0, c, context);
>         }
>     }
>     readLen = baseNCodec.readResults(array, offset, len, context);
> }
> {code}
> The code can be changed to hold a single class-level buffer for reading the
> underlying input stream. BaseNCodec can also be changed so that it no longer
> sets the Context buffer to null as a signal; the buffer can then be reused by
> the BaseNCodecInputStream. This requires updating the check for available
> bytes to use the position markers in BaseNCodec, for example (old code
> commented out for reference):
> {code:java}
> int available(final Context context) { // package protected for access from I/O streams
>     return hasData(context) ? context.pos - context.readPos : 0;
>     //return context.buffer != null ? context.pos - context.readPos : 0;
> }
> boolean hasData(final Context context) { // package protected for access from I/O streams
>     return context.pos > context.readPos;
>     //return context.buffer != null;
> }
> int readResults(final byte[] b, final int bPos, final int bAvail, final Context context) {
>     if (hasData(context)) {
>     //if (context.buffer != null) {
>         final int len = Math.min(available(context), bAvail);
>         System.arraycopy(context.buffer, context.readPos, b, bPos, len);
>         context.readPos += len;
>         if (context.readPos >= context.pos) {
>             // All data read.
>             // Reset markers so hasData() will return false, and this method can return -1,
>             // but do not set buffer to null to allow reuse.
>             context.pos = context.readPos = 0;
>             // context.buffer = null; // so hasData() will return false, and this method can return -1
>         }
>         return len;
>     }
>     return context.eof ? EOF : 0;
> }
> {code}
> This change was suggested by Alexander Pinske.
> The reuse of byte buffers reduces byte[] allocations from 280MB to <4MB when
> reading a 133MB base64 stream.
> Measured with JFR, see [https://github.com/apinske/playground-io].
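For readers without the Commons Codec source to hand, the two ideas in the description (a read buffer allocated once per stream, and position markers instead of a null buffer as the "no data" signal) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the committed change: the class name ReusingStream, its fields, and the drain() helper are hypothetical, and an identity copy stands in for the encode/decode step.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for illustration only; not the Commons Codec implementation.
class ReusingStream extends InputStream {
    private final InputStream in;
    // Allocated once per stream rather than once per pass through the read loop.
    private final byte[] readBuf = new byte[8192];
    // Output buffer kept for the life of the stream; it is never set to null.
    private final byte[] outBuf = new byte[8192];
    private final byte[] single = new byte[1];
    private int pos;      // end of the valid data in outBuf
    private int readPos;  // next byte of outBuf to hand to the caller
    private boolean eof;

    ReusingStream(final InputStream in) {
        this.in = in;
    }

    // Data is available when the markers differ, not when the buffer is non-null.
    private boolean hasData() {
        return pos > readPos;
    }

    @Override
    public int read() throws IOException {
        final int n = read(single, 0, 1);
        return n == -1 ? -1 : single[0] & 0xFF;
    }

    // Argument bounds checks are omitted to keep the sketch short.
    @Override
    public int read(final byte[] b, final int off, final int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (!hasData()) {
            if (eof) {
                return -1;
            }
            final int c = in.read(readBuf);
            if (c < 0) {
                eof = true;
                return -1;
            }
            // Identity copy stands in for the encode/decode step (an assumption).
            System.arraycopy(readBuf, 0, outBuf, 0, c);
            readPos = 0;
            pos = c;
        }
        final int n = Math.min(pos - readPos, len);
        System.arraycopy(outBuf, readPos, b, off, n);
        readPos += n;
        if (readPos >= pos) {
            // All data read: reset the markers so hasData() returns false,
            // but keep outBuf so the next chunk can reuse it.
            pos = readPos = 0;
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    // Helper for the sketch: copies input through the stream in chunks.
    // chunkSize is assumed to be positive.
    static byte[] drain(final byte[] input, final int chunkSize) {
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        final byte[] chunk = new byte[chunkSize];
        try (ReusingStream s = new ReusingStream(new ByteArrayInputStream(input))) {
            int n;
            while ((n = s.read(chunk, 0, chunk.length)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(final String[] args) {
        final byte[] copy = drain(new byte[20000], 1000);
        System.out.println(copy.length + " bytes copied");
    }
}
```

However many chunks pass through, the stream in this sketch allocates its buffers only at construction, which is the effect the issue is after.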