RE: RFR(M): 8248188: [PATCH] Add HotSpotIntrinsicCandidate and API for Base64 decoding

Doerr, Martin Tue, 25 Aug 2020 06:38:58 -0700

Hi Corey,

thanks for proposing this change. I have comments and suggestions regarding 
various files.



Base64.java

This is the only file which needs another review from core-libs-dev.
First of all, I like the idea to use a HotSpotIntrinsicCandidate which can 
consume as many bytes as the implementation wants.

Comment before decodeBlock:
Let's be precise: "should process a multiple of four" => "must process a 
multiple of four"

> If any illegal base64 bytes are encountered in the source by the
> intrinsic, the intrinsic can return a data length of zero or any
> number of bytes before the place where the illegal base64 byte
> was encountered.
I think this has a drawback. Somebody may use a debugger and want to stop when 
throwing IllegalArgumentException. He should see the position which matches the 
Java implementation.

Please note that the comment indentation differs from other comments.

decode0: Final "else" after return is redundant.


stubGenerator_ppc.cpp

"__vector" breaks AIX build!
Does it work on Big Endian linux with old gcc (we require 7.3.1, now)?
Please either support Big Endian properly or #ifdef it out.
What exactly does it on linux?
I remember that we had tried such prefixes but were not satisfied. I think it 
didn't enforce 16 Byte alignment if I remember correctly.

Attention: C2 does no longer convert int/bool to 64 bit values (since 
JDK-8086069). So the argument registers for offset, length and isURL may 
contain garbage in the higher bits.

You may want to use load_const_optimized which produces shorter code.

You may want to use __ align(32) to align unrolled_loop_start.

I'll review the algorithm in detail when I find more time.


assembler_ppc.hpp
assembler_ppc.inline.hpp
vm_version_ppc.cpp
vm_version_ppc.hpp
Please rebase. Parts of the change were pushed as part of 8248190: Enable 
Power10 system and implement new byte-reverse instructions


vmSymbols.hpp
Indentation looks odd at the end.


library_call.cpp
Good. Indentation style of the call parameters differs from encodeBlock.


runtime.cpp
Good.


aotCodeHeap.cpp
vmSymbols.cpp
shenandoahSupport.cpp
vmStructs_jvmci.cpp
shenandoahSupport.cpp
escape.cpp
runtime.hpp
stubRoutines.cpp
stubRoutines.hpp
vmStructs.cpp
Good and trivial.


Tests:
I think we should have JTREG tests to check for regressions in the future.

Best regards,
Martin


> -----Original Message-----
> From: Corey Ashford <cjash...@linux.ibm.com>
> Sent: Mittwoch, 19. August 2020 20:11
> To: Michihiro Horie <ho...@jp.ibm.com>
> Cc: hotspot-compiler-...@openjdk.java.net; core-libs-dev <core-libs-
> d...@openjdk.java.net>; Kazunori Ogata <oga...@jp.ibm.com>;
> jos...@br.ibm.com; Doerr, Martin <martin.do...@sap.com>
> Subject: Re: RFR(M): 8248188: [PATCH] Add HotSpotIntrinsicCandidate and
> API for Base64 decoding
> 
> Michihiro Horie posted up a new iteration of this webrev for me.  This
> time the webrev includes a complete implementation of the intrinsic for
> Power9 and Power10.
> 
> You can find it here:
> http://cr.openjdk.java.net/~mhorie/8248188/webrev.02/
> 
> Changes in webrev.02 vs. webrev.01:
> 
>    * The method header for the intrinsic in the Base64 code has been
> rewritten using the Javadoc style.  The clarity of the comments has been
> improved and some verbosity has been removed.  There are no additional
> functional changes to Base64.java.
> 
>    * The code needed to martial and check the intrinsic parameters has
> been added, using the base64 encodeBlock intrinsic as a guideline.
> 
>    * A complete intrinsic implementation for Power9 and Power10 is included.
> 
>    * Adds some Power9 and Power10 assembler instructions needed by the
> intrinsic which hadn't been defined before.
> 
> The intrinsic implementation in this patch accelerates the decoding of
> large blocks of base64 data by a factor of about 3.5X on Power9.
> 
> I'm attaching two Java test cases I am using for testing and
> benchmarking.  The TestBase64_VB encodes and decodes randomly-sized
> buffers of random data and checks that original data matches the
> encoded-then-decoded data.  TestBase64Errors encodes a 48K block of
> random bytes, then corrupts each byte of the encoded data, one at a
> time, checking to see if the decoder catches the illegal byte.
> 
> Any comments/suggestions would be appreciated.
> 
> Thanks,
> 
> - Corey
> 
> On 7/27/20 6:49 PM, Corey Ashford wrote:
> > Michihiro Horie uploaded a new revision of the Base64 decodeBlock
> > intrinsic API for me:
> >
> > http://cr.openjdk.java.net/~mhorie/8248188/webrev.01/
> >
> > It has the following changes with respect to the original one posted:
> >
> >   * In the event of encountering a non-base64 character, instead of
> > having a separate error code of -1, the intrinsic can now just return
> > either 0, or the number of data bytes produced up to the point where the
> > illegal base64 character was encountered.  This reduces the number of
> > special cases, and also provides a way to speed up the process of
> > finding the bad character by the slower, pure-Java algorithm.
> >
> >   * The isMIME boolean is removed from the API for two reasons:
> >     - The current API is not sufficient to handle the isMIME case,
> > because there isn't a strict relationship between the number of input
> > bytes and the number of output bytes, because there can be an arbitrary
> > number of non-base64 characters in the source.
> >     - If an intrinsic only implements the (isMIME == false) case as ours
> > does, it will always return 0 bytes processed, which will slightly slow
> > down the normal path of processing an (isMIME == true) instantiation.
> >     - We considered adding a separate hotspot candidate for the (isMIME
> > == true) case, but since we don't have an intrinsic implementation to
> > test that, we decided to leave it as a future optimization.
> >
> > Comments and suggestions are welcome.  Thanks for your consideration.
> >
> > - Corey
> >
> > On 6/23/20 6:23 PM, Michihiro Horie wrote:
> >> Hi Corey,
> >>
> >> Following is the issue I created.
> >> https://bugs.openjdk.java.net/browse/JDK-8248188
> >>
> >> I will upload a webrev when you're ready as we talked in private.
> >>
> >> Best regards,
> >> Michihiro
> >>
> >> Inactive hide details for "Corey Ashford" ---2020/06/24
> >> 09:40:10---Currently in java.util.Base64, there is a
> >> HotSpotIntrinsicCa"Corey Ashford" ---2020/06/24 09:40:10---Currently
> >> in java.util.Base64, there is a HotSpotIntrinsicCandidate and API for
> >> encodeBlock, but no
> >>
> >> From: "Corey Ashford" <cjash...@linux.ibm.com>
> >> To: "hotspot-compiler-...@openjdk.java.net"
> >> <hotspot-compiler-...@openjdk.java.net>,
> >> "ppc-aix-port-...@openjdk.java.net" <ppc-aix-port-
> d...@openjdk.java.net>
> >> Cc: Michihiro Horie/Japan/IBM@IBMJP, Kazunori
> Ogata/Japan/IBM@IBMJP,
> >> jos...@br.ibm.com
> >> Date: 2020/06/24 09:40
> >> Subject: RFR(S): [PATCH] Add HotSpotIntrinsicCandidate and API for
> >> Base64 decoding
> >>
> >> ------------------------------------------------------------------------
> >>
> >>
> >>
> >> Currently in java.util.Base64, there is a HotSpotIntrinsicCandidate and
> >> API for encodeBlock, but none for decoding.  This means that only
> >> encoding gets acceleration from the underlying CPU's vector hardware.
> >>
> >> I'd like to propose adding a new intrinsic for decodeBlock.  The
> >> considerations I have for this new intrinsic's API:
> >>
> >>   * Don't make any assumptions about the underlying capability of the
> >> hardware.  For example, do not impose any specific block size
> >> granularity.
> >>
> >>   * Don't assume the underlying intrinsic can handle isMIME or isURL
> >> modes, but also let them decide if they will process the data regardless
> >> of the settings of the two booleans.
> >>
> >>   * Any remaining data that is not processed by the intrinsic will be
> >> processed by the pure Java implementation.  This allows the intrinsic to
> >> process whatever block sizes it's good at without the complexity of
> >> handling the end fragments.
> >>
> >>   * If any illegal character is discovered in the decoding process, the
> >> intrinsic will simply return -1, instead of requiring it to throw a
> >> proper exception from the context of the intrinsic.  In the event of
> >> getting a -1 returned from the intrinsic, the Java Base64 library code
> >> simply calls the pure Java implementation to have it find the error and
> >> properly throw an exception.  This is a performance trade-off in the
> >> case of an error (which I expect to be very rare).
> >>
> >>   * One thought I have for a further optimization (not implemented in
> >> the current patch), is that when the intrinsic decides not to process a
> >> block because of some combination of isURL and isMIME settings it
> >> doesn't handle, it could return extra bits in the return code, encoded
> >> as a negative number.  For example:
> >>
> >> Illegal_Base64_char   = 0b001;
> >> isMIME_unsupported    = 0b010;
> >> isURL_unsupported     = 0b100;
> >>
> >> These can be OR'd together as needed and then negated (flip the sign).
> >> The Base64 library code could then cache these flags, so it will know
> >> not to call the intrinsic again when another decodeBlock is requested
> >> but with an unsupported mode.  This will save the performance hit of
> >> calling the intrinsic when it is guaranteed to fail.
> >>
> >> I've tested the attached patch with an actual intrinsic coded up for
> >> Power9/Power10, but those runtime intrinsics and arch-specific patches
> >> aren't attached today.  I want to get some consensus on the
> >> library-level intrinsic API first.
> >>
> >> Also attached is a simple test case to test that the new intrinsic API
> >> doesn't break anything.
> >>
> >> I'm open to any comments about this.
> >>
> >> Thanks for your consideration,
> >>
> >> - Corey
> >>
> >>
> >> Corey Ashford
> >> IBM Systems, Linux Technology Center, OpenJDK team
> >> cjashfor at us dot ibm dot com
> >> [attachment "decodeBlock_api-20200623.patch" deleted by Michihiro
> >> Horie/Japan/IBM] [attachment "TestBase64.java" deleted by Michihiro
> >> Horie/Japan/IBM]
> >>
> >>
> >

RE: RFR(M): 8248188: [PATCH] Add HotSpotIntrinsicCandidate and API for Base64 decoding

Reply via email to