Hi Sherman,

I found updates in http://cr.openjdk.java.net/~sherman/8164278/webrev.04

That looks fine.

Typo:  line 1008: "neve"


On 2/7/2018 10:32 PM, Xueming Shen wrote:
Hi Roger,

Given the concern of the possible incompatible behavior change of over reading bytes from the underlying stream. I decided to give up last proposed changes for DecInputStream for now. With some "minor" cleanup and tuning I still have about 10%+ improvement with various input size sampling. Let's address the decoder stream in separate rfe later, if desired.

The updated webrev and the jmh results are at


Last version of webrev and corresponding jmh result can be found at


On 2/5/18, 6:00 PM, Xueming Shen wrote:

Please help review the change for  JDK-8164278.

issue: https://bugs.openjdk.java.net/browse/JDK-8164278
webrev: http://cr.openjdk.java.net/~sherman/8164278/webrev

jmh.src: http://cr.openjdk.java.net/~sherman/8164278/Base64BM.java
jmh.result: http://cr.openjdk.java.net/~sherman/8164278/base64.bm

    Adopted the "similar" optimization approach we took in Base64.Encoder.encode0()     to add a "fast path" to decode a block of 4-byte units together (current implementation     decodes one single byte per while/loop. The jmh benchmark result indicates a big speed     boost  (those decodeArray/Mime/Url results, from 30% to 2 times faster, depends on
    input size).

    It appears encode0() was fully optimized in 1.8. Can't get it faster :-) Tried to use     Unsafe.getLong/putLong instead of byte by byte access. But it appears the 8-byte     "vectorization" does not bring us enough speed up, the performance is the same as the
    current one. See encode00() at

    If my memory serves me right, the current implementation was under the assumption that     the underlying output stream probably is buffered (if invoker cares). It would be a redundant     if EncOutputStream buffers bytes again. It appears this is a wrong assumption. It is much     slower to write 4 bytes separately, compared to bundle them together in a byte[4] and write     into underlying, even the underlying output stream is a ByteArrayOutputStream.     Again, the proposed change is to add a fast path loop, as we do in encode0(), to decode/     write a block of 3-byte->4-byte unites. It appears this fast loop can help the compiler to
    optimize away some boundary checks, therefor is much faster.
    The jmh result Base64BM.encodeOS suggests the new implementation is almost 4 times faster
    and is almost the same as java.mail's stream encoder.

    Same as the approach we take for decode0(), to add a fast path decode block of 4-byte unites
    The jmh result Base64BM.decodeOS (the name probably should be decodeIS, for InputStream,     but anyway...) shows the proposed one is 4 times faster than the existing impl and double
    the  java.mail (Base64BM.decodeOS_javamail) implementation.

    However, there is a side-effect of adding a buffering mechanism into DecInputStream. The     current implementation read bytes from the underlying stream one by one, it never reads     more bytes than it needs, which means it should/is supposed to just stop at the last byte
    that it needs to decode, when there is "=" present in the stream.
    With buffering, it's possible more bytes (after "=", which indicates "end of base64 stream") might     be read/consumed in and buffered.  A concern? if this is indeed a concern, the only alternative     might be to add a separate method to support this "faster-buffered-decoder"?


Reply via email to