Hi Corey, thanks for proposing this change. I have comments and suggestions regarding various files.
Base64.java This is the only file which needs another review from core-libs-dev. First of all, I like the idea to use a HotSpotIntrinsicCandidate which can consume as many bytes as the implementation wants. Comment before decodeBlock: Let's be precise: "should process a multiple of four" => "must process a multiple of four" > If any illegal base64 bytes are encountered in the source by the > intrinsic, the intrinsic can return a data length of zero or any > number of bytes before the place where the illegal base64 byte > was encountered. I think this has a drawback. Somebody may use a debugger and want to stop when throwing IllegalArgumentException. He should see the position which matches the Java implementation. Please note that the comment indentation differs from other comments. decode0: Final "else" after return is redundant. stubGenerator_ppc.cpp "__vector" breaks AIX build! Does it work on Big Endian linux with old gcc (we require 7.3.1, now)? Please either support Big Endian properly or #ifdef it out. What exactly does it on linux? I remember that we had tried such prefixes but were not satisfied. I think it didn't enforce 16 Byte alignment if I remember correctly. Attention: C2 does no longer convert int/bool to 64 bit values (since JDK-8086069). So the argument registers for offset, length and isURL may contain garbage in the higher bits. You may want to use load_const_optimized which produces shorter code. You may want to use __ align(32) to align unrolled_loop_start. I'll review the algorithm in detail when I find more time. assembler_ppc.hpp assembler_ppc.inline.hpp vm_version_ppc.cpp vm_version_ppc.hpp Please rebase. Parts of the change were pushed as part of 8248190: Enable Power10 system and implement new byte-reverse instructions vmSymbols.hpp Indentation looks odd at the end. library_call.cpp Good. Indentation style of the call parameters differs from encodeBlock. runtime.cpp Good. aotCodeHeap.cpp vmSymbols.cpp shenandoahSupport.cpp vmStructs_jvmci.cpp shenandoahSupport.cpp escape.cpp runtime.hpp stubRoutines.cpp stubRoutines.hpp vmStructs.cpp Good and trivial. Tests: I think we should have JTREG tests to check for regressions in the future. Best regards, Martin > -----Original Message----- > From: Corey Ashford <cjash...@linux.ibm.com> > Sent: Mittwoch, 19. August 2020 20:11 > To: Michihiro Horie <ho...@jp.ibm.com> > Cc: hotspot-compiler-...@openjdk.java.net; core-libs-dev <core-libs- > d...@openjdk.java.net>; Kazunori Ogata <oga...@jp.ibm.com>; > jos...@br.ibm.com; Doerr, Martin <martin.do...@sap.com> > Subject: Re: RFR(M): 8248188: [PATCH] Add HotSpotIntrinsicCandidate and > API for Base64 decoding > > Michihiro Horie posted up a new iteration of this webrev for me. This > time the webrev includes a complete implementation of the intrinsic for > Power9 and Power10. > > You can find it here: > http://cr.openjdk.java.net/~mhorie/8248188/webrev.02/ > > Changes in webrev.02 vs. webrev.01: > > * The method header for the intrinsic in the Base64 code has been > rewritten using the Javadoc style. The clarity of the comments has been > improved and some verbosity has been removed. There are no additional > functional changes to Base64.java. > > * The code needed to martial and check the intrinsic parameters has > been added, using the base64 encodeBlock intrinsic as a guideline. > > * A complete intrinsic implementation for Power9 and Power10 is included. > > * Adds some Power9 and Power10 assembler instructions needed by the > intrinsic which hadn't been defined before. > > The intrinsic implementation in this patch accelerates the decoding of > large blocks of base64 data by a factor of about 3.5X on Power9. > > I'm attaching two Java test cases I am using for testing and > benchmarking. The TestBase64_VB encodes and decodes randomly-sized > buffers of random data and checks that original data matches the > encoded-then-decoded data. TestBase64Errors encodes a 48K block of > random bytes, then corrupts each byte of the encoded data, one at a > time, checking to see if the decoder catches the illegal byte. > > Any comments/suggestions would be appreciated. > > Thanks, > > - Corey > > On 7/27/20 6:49 PM, Corey Ashford wrote: > > Michihiro Horie uploaded a new revision of the Base64 decodeBlock > > intrinsic API for me: > > > > http://cr.openjdk.java.net/~mhorie/8248188/webrev.01/ > > > > It has the following changes with respect to the original one posted: > > > > * In the event of encountering a non-base64 character, instead of > > having a separate error code of -1, the intrinsic can now just return > > either 0, or the number of data bytes produced up to the point where the > > illegal base64 character was encountered. This reduces the number of > > special cases, and also provides a way to speed up the process of > > finding the bad character by the slower, pure-Java algorithm. > > > > * The isMIME boolean is removed from the API for two reasons: > > - The current API is not sufficient to handle the isMIME case, > > because there isn't a strict relationship between the number of input > > bytes and the number of output bytes, because there can be an arbitrary > > number of non-base64 characters in the source. > > - If an intrinsic only implements the (isMIME == false) case as ours > > does, it will always return 0 bytes processed, which will slightly slow > > down the normal path of processing an (isMIME == true) instantiation. > > - We considered adding a separate hotspot candidate for the (isMIME > > == true) case, but since we don't have an intrinsic implementation to > > test that, we decided to leave it as a future optimization. > > > > Comments and suggestions are welcome. Thanks for your consideration. > > > > - Corey > > > > On 6/23/20 6:23 PM, Michihiro Horie wrote: > >> Hi Corey, > >> > >> Following is the issue I created. > >> https://bugs.openjdk.java.net/browse/JDK-8248188 > >> > >> I will upload a webrev when you're ready as we talked in private. > >> > >> Best regards, > >> Michihiro > >> > >> Inactive hide details for "Corey Ashford" ---2020/06/24 > >> 09:40:10---Currently in java.util.Base64, there is a > >> HotSpotIntrinsicCa"Corey Ashford" ---2020/06/24 09:40:10---Currently > >> in java.util.Base64, there is a HotSpotIntrinsicCandidate and API for > >> encodeBlock, but no > >> > >> From: "Corey Ashford" <cjash...@linux.ibm.com> > >> To: "hotspot-compiler-...@openjdk.java.net" > >> <hotspot-compiler-...@openjdk.java.net>, > >> "ppc-aix-port-...@openjdk.java.net" <ppc-aix-port- > d...@openjdk.java.net> > >> Cc: Michihiro Horie/Japan/IBM@IBMJP, Kazunori > Ogata/Japan/IBM@IBMJP, > >> jos...@br.ibm.com > >> Date: 2020/06/24 09:40 > >> Subject: RFR(S): [PATCH] Add HotSpotIntrinsicCandidate and API for > >> Base64 decoding > >> > >> ------------------------------------------------------------------------ > >> > >> > >> > >> Currently in java.util.Base64, there is a HotSpotIntrinsicCandidate and > >> API for encodeBlock, but none for decoding. This means that only > >> encoding gets acceleration from the underlying CPU's vector hardware. > >> > >> I'd like to propose adding a new intrinsic for decodeBlock. The > >> considerations I have for this new intrinsic's API: > >> > >> * Don't make any assumptions about the underlying capability of the > >> hardware. For example, do not impose any specific block size > >> granularity. > >> > >> * Don't assume the underlying intrinsic can handle isMIME or isURL > >> modes, but also let them decide if they will process the data regardless > >> of the settings of the two booleans. > >> > >> * Any remaining data that is not processed by the intrinsic will be > >> processed by the pure Java implementation. This allows the intrinsic to > >> process whatever block sizes it's good at without the complexity of > >> handling the end fragments. > >> > >> * If any illegal character is discovered in the decoding process, the > >> intrinsic will simply return -1, instead of requiring it to throw a > >> proper exception from the context of the intrinsic. In the event of > >> getting a -1 returned from the intrinsic, the Java Base64 library code > >> simply calls the pure Java implementation to have it find the error and > >> properly throw an exception. This is a performance trade-off in the > >> case of an error (which I expect to be very rare). > >> > >> * One thought I have for a further optimization (not implemented in > >> the current patch), is that when the intrinsic decides not to process a > >> block because of some combination of isURL and isMIME settings it > >> doesn't handle, it could return extra bits in the return code, encoded > >> as a negative number. For example: > >> > >> Illegal_Base64_char = 0b001; > >> isMIME_unsupported = 0b010; > >> isURL_unsupported = 0b100; > >> > >> These can be OR'd together as needed and then negated (flip the sign). > >> The Base64 library code could then cache these flags, so it will know > >> not to call the intrinsic again when another decodeBlock is requested > >> but with an unsupported mode. This will save the performance hit of > >> calling the intrinsic when it is guaranteed to fail. > >> > >> I've tested the attached patch with an actual intrinsic coded up for > >> Power9/Power10, but those runtime intrinsics and arch-specific patches > >> aren't attached today. I want to get some consensus on the > >> library-level intrinsic API first. > >> > >> Also attached is a simple test case to test that the new intrinsic API > >> doesn't break anything. > >> > >> I'm open to any comments about this. > >> > >> Thanks for your consideration, > >> > >> - Corey > >> > >> > >> Corey Ashford > >> IBM Systems, Linux Technology Center, OpenJDK team > >> cjashfor at us dot ibm dot com > >> [attachment "decodeBlock_api-20200623.patch" deleted by Michihiro > >> Horie/Japan/IBM] [attachment "TestBase64.java" deleted by Michihiro > >> Horie/Japan/IBM] > >> > >> > >