Hi,

A couple of other thoughts.

Assuming that the performance improvement is significant and
the structure is being put in place to be able to inline on PPC,
we should also consider adding the hand coding the range comparisons for x64 and other ISAs.
I don't know whether or how much avoiding the memory indirection is worth
(vs complexity).

Though it targets only Latin1 (and ASCII), extending the approach to
toUpper and toLower methods might also yield useful performance improvements.

In the native function naming and comments, it may be useful to emphasis that it
only applies to Latin1 so it does not get mis-applied.

Also, would similar performance gain be seen by implementing the
getType(ch) method as an intrinsic?  It would hard code more literals
shared with the java layer but might bring an advantage to more cases.

Regards, Roger

On 11/29/2018 09:52 AM, Roger Riggs wrote:
Hi Michihiro,

I may have missed the earliest part of this thread.
Have you run performance tests on these functions to ensure no regression
on other platforms and to note the improvements for PPC.

There are a few JMH tests for upper and lower in the jmh-jdk-microbenchmarks repo. [1]
src/main/java/org/openjdk/bench/java/lang/StringUpperLower.java

It would be good to contribute a micro benchmark for the digits case.

Thanks, Roger

p.s. If you get an email bounced message from sending to core-libs-dev@openjdk.java.net
please subscribe to it or your emails may be quarantined.

[1] http://hg.openjdk.java.net/code-tools/jmh-jdk-microbenchmarks


On 11/29/2018 01:31 AM, Michihiro Horie wrote:

Vladimir,Roger,
Thank you so much for your responses.

Vladimir,
Regarding the hotspot change,I’m new in changing around library_call.cpp,so your comments are very helpful. I will fix my code,especially inline_character_compare()would be like:

+bool LibraryCallKit::inline_character_compare(vmIntrinsics::ID id){
+ RegionNode* old_rgn = control()->as_Region();
+ RegionNode* new_rgn = new RegionNode(1);
+ record_for_igvn(new_rgn);
+
+ // Swap current RegionNode with new one
+ Node* new_ctrl = old_rgn->in(1);
+ old_rgn->del_req(1);
+ new_rgn->add_req(new_ctrl);
+
+ set_control(_gvn.transform(new_rgn));
+
+ // argument(0)is receiver
+ Node* codePoint = argument(1);
+ Node* n = NULL;
+
+ switch (id){
+ case vmIntrinsics::_isDigit_c : n = new DigitCNode(control(),codePoint);break; + case vmIntrinsics::_isLowerCase_c : n = new LowerCaseCNode(control(),codePoint);break; + case vmIntrinsics::_isUpperCase_c : n = new UpperCaseCNode(control(),codePoint);break; + case vmIntrinsics::_isWhitespace_c : n = new WhitespaceCNode(control(),codePoint);break;
+ default:
+ fatal_unexpected_iid(id);
+ }
+
+ set_result(_gvn.transform(n));
+ return true;
+}

Microbenchmark showed ~40% improvements.

Roger,
I would wait foryour review,thanks. I understood the discussion had not been recognized from people in core-libs-dev,and checked it was not actually archived there….

Best regards,
--
Michihiro,
IBM Research - Tokyo

Inactive hide details for Roger Riggs ---2018/11/29 11:21:26---Hi, I'll look at it on Thursday.Roger Riggs ---2018/11/29 11:21:26---Hi, I'll look at it on Thursday.

From: Roger Riggs <roger.ri...@oracle.com>
To: Vladimir Kozlov <vladimir.koz...@oracle.com>, Michihiro Horie <ho...@jp.ibm.com>, core-libs-dev@openjdk.java.net Cc: volker.simo...@sap.com, hotspot-compiler-...@openjdk.java.net, martin.do...@sap.com
Date: 2018/11/29 11:21
Subject: Re: RFR: 8213754: PPC64: Add Intrinsics for isDigit/isLowerCase/isUpperCase/isWhitespace

------------------------------------------------------------------------



Hi,

I'll look at it on Thursday.

In spite of it looking like the email was sent to core-lib-dev, I have
not seen it before
and its not in the core-libs-dev archive.

$.02, Roger

On 11/28/18 7:15 PM, Vladimir Kozlov wrote:
> You still need review from core-libs.
>
> About your hotspot changes. You need to add intrinsics to
> is_disabled_by_flags() to be able switch them off.
>
> Note, match_rule_supported() calls in
> C2Compiler::is_intrinsic_supported() executed before intrinsics in C2
> are registered. So they will not be available if they are not
> supported. match_rule_supported() checks in
> LibraryCallKit::inline_character_compare() are not needed.
>
> I don't get what you code in
> LibraryCallKit::inline_character_compare() is doing. Why you check
> Region node at the beginning and why you add Region and Phi at the end
> if you have only one path?
> You are creating code for whole method which should return boolean result
>
> +    boolean isDigit(int ch) {
> +      return getType(ch) == Character.DECIMAL_DIGIT_NUMBER;
> +    }
>
> but your code produce TypeInt::CHAR. why?
>
> An finally. Did you compare code generated by default and with you
> changes? what improvement you see?
>
> Thanks,
> Vladimir
>
> On 11/28/18 3:07 PM, Michihiro Horie wrote:
>> Is there any response from core-libs-dev?
>>
>> Vladimir,
>> Could you give your suggestion on how to proceed?
>>
>>
>> Best regards,
>> --
>> Michihiro,
>> IBM Research - Tokyo
>>
>>
>> ----- Original message -----
>> From: "Michihiro Horie" <ho...@jp.ibm.com>
>> Sent by: "hotspot-compiler-dev"
>> <hotspot-compiler-dev-boun...@openjdk.java.net>
>> To: "Doerr, Martin" <martin.do...@sap.com>
>> Cc: "Simonis, Volker" <volker.simo...@sap.com>,
>> "hotspot-compiler-...@openjdk.java.net"<hotspot-compiler-...@openjdk.java.net>,
>> "core-libs-dev@openjdk.java.net" <core-libs-dev@openjdk.java.net>
>> Subject: RE: 8213754: PPC64: Add Intrinsics for
>> isDigit/isLowerCase/isUpperCase/isWhitespace
>> Date: Thu, Nov 22, 2018 11:28 AM
>>
>> Hi Martin,
>>
>> Yes, we should wait for the feedback on class library change.
>>
>>  >Btw. I think you can further simplify the code in library_call.cpp
>> (no phi). But we can discuss details when thedirection regarding the
>> java classes is clear.
>> Thank you for pointing out it, I think I understand how to simplify
>> code.
>>
>>  >Hi Michi,
>>  >
>>  >On 11/20/2018 02:52 PM, Michihiro Horie wrote:
>>  >> >Please note that we don’t have a machine, yet. So other people
>> will have to test.
>>  >> I think Gustavo can help testing this change when its' ready.
>>  >Sure :)
>>  >
>>  >Best regards,
>>  >Gustavo
>> Thank you, Gustavo.
>>
>>
>> Best regards,
>> --
>> Michihiro,
>> IBM Research - Tokyo
>>
>> Inactive hide details for "Doerr, Martin" ---2018/11/22 01:33:34---Hi
>> Michihiro, thanks. This proposal makes the javacode much"Doerr,
>> Martin" ---2018/11/22 01:33:34---Hi Michihiro, thanks. This proposal
>> makes the java code much moreintrinsics friendly.
>>
>> From: "Doerr, Martin" <martin.do...@sap.com>
>> To: Michihiro Horie <ho...@jp.ibm.com>,
>> "core-libs-dev@openjdk.java.net" <core-libs-dev@openjdk.java.net>
>> Cc: "hotspot-compiler-...@openjdk.java.net"
>> <hotspot-compiler-...@openjdk.java.net>, "Simonis,
>> Volker"<volker.simo...@sap.com>, Gustavo Romero
>> <grom...@linux.vnet.ibm.com>
>> Date: 2018/11/22 01:33
>> Subject: RE: 8213754: PPC64: Add Intrinsics for
>> isDigit/isLowerCase/isUpperCase/isWhitespace
>>
>> ------------------------------------------------------------------------------------------------------------------------
>>
>>
>>
>>
>> Hi Michihiro,
>>
>> thanks. This proposal makes the java code much more intrinsics friendly.
>> We should wait for feedback from the core lib folks. Maybe they have
>> some requirements or other proposals.
>>
>> I think these intrinsics should be interesting for Oracle, Intel and
>> others, too.
>>
>> Btw. I think you can further simplify the code in library_call.cpp
>> (no phi). But we can discuss details when thedirection regarding the
>> java classes is clear.
>>
>> Best regards,
>> Martin
>>
>> *
>> **From:*Michihiro Horie <ho...@jp.ibm.com>*
>> **Sent:*Mittwoch, 21. November 2018 17:14*
>> **To:*core-libs-dev@openjdk.java.net; Doerr, Martin
>> <martin.do...@sap.com>*
>> **Cc:*hotspot-compiler-...@openjdk.java.net; Simonis, Volker
>> <volker.simo...@sap.com>; Gustavo Romero<grom...@linux.vnet.ibm.com>*
>> **Subject:*RE: 8213754: PPC64: Add Intrinsics for
>> isDigit/isLowerCase/isUpperCase/isWhitespace
>>
>> Hi Martin,
>>
>> I send this RFR to core-libs-dev too, because it changes the class
>> library.
>>
>> Through trial and error, I separated Latin1 block as in the _
>> __https://urldefense.proofpoint.com/v2/url?u=http-3A__cr.openjdk.java.net_-7Emhorie_8213754_webrev.01-5F&d=DwIDaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=oecsIpYF-cifqq2i1JEH0Q&m=K5GMqTdhkNMaMnGAqGucn9CsZllScUGvHO427jiTC5E&s=brFCGq8BK0Q6ICnXLNCUF0nI8J5LjSjhtiWAZlhjPb4&e=
>> <http://cr.openjdk.java.net/%7Emhorie/8213754/webrev.01>_/_
>>
>> I followed the coding way of Character.isWhitespace() that invokes
>> each ChracterData’s isWhitespace() to refactorisDigit(),
>> isLowerCase(), and isUpperCase().
>>
>> I think this change is also useful on x86, using STTNI.
>>
>>
>> Best regards,
>> --
>> Michihiro,
>> IBM Research - Tokyo
>>
>>
>> ----- Original message -----
>> From: "Michihiro Horie" <_ho...@jp.ibm.com_ <mailto:ho...@jp.ibm.com>>
>> Sent by: "hotspot-compiler-dev"
>> <_hotspot-compiler-dev-boun...@openjdk.java.net_<mailto:hotspot-compiler-dev-boun...@openjdk.java.net>>
>> To: "Doerr, Martin" <_martin.doerr@sap.com_
>> <mailto:martin.do...@sap.com>>
>> Cc: "Simonis, Volker" <_volker.simonis@sap.com_
>> <mailto:volker.simo...@sap.com>>,
>> "_ppc-aix-port-...@openjdk.java.net_<mailto:ppc-aix-port-...@openjdk.java.net>" >> <_ppc-aix-port-...@openjdk.java.net_<mailto:ppc-aix-port-...@openjdk.java.net>>, >> "_hotspot-compiler-...@openjdk.java.net_<mailto:hotspot-compiler-...@openjdk.java.net>" >> <_hotspot-compiler-...@openjdk.java.net_<mailto:hotspot-compiler-...@openjdk.java.net>>
>>
>> Subject: RE: 8213754: PPC64: Add Intrinsics for
>> isDigit/isLowerCase/isUpperCase/isWhitespace
>> Date: Wed, Nov 21, 2018 1:53 AM
>>
>> Hi Martin,
>>
>> Thank you for giving your helpful comments. I did not recognize
>> “generate_method_call_static” prevents anyoptimizations, but I now
>> checked it actually degraded the performance, thanks.
>>
>>  >Please note that we don’t have a machine, yet. So other people will
>> have to test.
>> I think Gustavo can help testing this change when its' ready.
>>
>>  >Would it be possible to introduce more fine-grained intrinsics such
>> that the “slow” path is outside of them?
>>  >
>>  >Maybe you can factor out as in the following example?
>>  >if (latin1) return isLatin1Digit(codePoint);
>>  >with isLatin1Digit as HotSpotIntrinsicCandidate.
>> Thanks for an example, please let me try to separate the Latin block
>> from other blocks for some time.
>>
>>
>> Best regards,
>> --
>> Michihiro,
>> IBM Research - Tokyo
>>
>> Inactive hide details for "Doerr, Martin" ---2018/11/20 01:55:27---Hi
>> Michihiro, first of all, thanks for working onPower9 opt"Doerr,
>> Martin" ---2018/11/20 01:55:27---Hi Michihiro, first of all, thanks
>> for working on Power9optimizations. Please note that we don't ha
>>
>> From: "Doerr, Martin" <_martin.doerr@sap.com_
>> <mailto:martin.do...@sap.com>>
>> To: Michihiro Horie <_ho...@jp.ibm.com_ <mailto:ho...@jp.ibm.com>>,
>> "_hotspot-compiler-...@openjdk.java.net_<mailto:hotspot-compiler-...@openjdk.java.net>" >> <_hotspot-compiler-...@openjdk.java.net_<mailto:hotspot-compiler-...@openjdk.java.net>>, >> "_ppc-aix-port-...@openjdk.java.net_<mailto:ppc-aix-port-...@openjdk.java.net>"
>> <_ppc-aix-port-...@openjdk.java.net_
>> <mailto:ppc-aix-port-...@openjdk.java.net>>
>> Cc: "Simonis, Volker" <_volker.simonis@sap.com_
>> <mailto:volker.simo...@sap.com>>, "Lindenmaier,
>> Goetz"<_goetz.lindenmaier@sap.com_
>> <mailto:goetz.lindenma...@sap.com>>, Gustavo Romero
>> <_grom...@linux.vnet.ibm.com_<mailto:grom...@linux.vnet.ibm.com>>
>> Date: 2018/11/20 01:55
>> Subject: RE: 8213754: PPC64: Add Intrinsics for
>> isDigit/isLowerCase/isUpperCase/isWhitespace
>>
>> ------------------------------------------------------------------------------------------------------------------------
>>
>>
>>
>>
>>
>> Hi Michihiro,
>>
>> first of all, thanks for working on Power9 optimizations. Please note
>> that we don’t have a machine, yet. So other peoplewill have to test.
>>
>> I think it may be problematic to insert a slow path by
>> “generate_method_call_static”. This may be a performancedisadvantage
>> for some users of other encodings because your intrinsics prevent
>> inlining and further optimizations.
>> Would it be possible to introduce more fine-grained intrinsics such
>> that the “slow” path is outside of them?
>>
>> Maybe you can factor out as in the following example?
>> if (latin1) return isLatin1Digit(codePoint);
>> with isLatin1Digit as HotSpotIntrinsicCandidate.
>>
>> I can’t judge if this is needed, but I think this should be discussed
>> first before going into the details.
>>
>> Best regards,
>> Martin
>>
>> *
>> **From:*Michihiro Horie <_ho...@jp.ibm.com_ <mailto:ho...@jp.ibm.com>>*
>> **Sent:*Freitag, 16. November 2018 12:53*
>> **To:*_hotspot-compiler-...@openjdk.java.net_
>> <mailto:hotspot-compiler-...@openjdk.java.net>;_ppc-aix-port-...@openjdk.java.net_
>> <mailto:ppc-aix-port-...@openjdk.java.net>*
>> **Cc:*Doerr, Martin <_martin.doerr@sap.com_
>> <mailto:martin.do...@sap.com>>; Simonis, Volker
>> <_volker.simonis@sap.com_<mailto:volker.simo...@sap.com>>;
>> Lindenmaier, Goetz <_goetz.lindenmaier@sap.com_
>> <mailto:goetz.lindenma...@sap.com>>;Gustavo Romero
>> <_grom...@linux.vnet.ibm.com_ <mailto:grom...@linux.vnet.ibm.com>>*
>> **Subject:*RFR: 8213754: PPC64: Add Intrinsics for
>> isDigit/isLowerCase/isUpperCase/isWhitespace
>>
>> Dear all,
>>
>> Would you please review following change?
>>
>> Bug: _https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.openjdk.java.net_browse_JDK-2D8213754-5F&d=DwIDaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=oecsIpYF-cifqq2i1JEH0Q&m=K5GMqTdhkNMaMnGAqGucn9CsZllScUGvHO427jiTC5E&s=GdBakB_GPQKO3fiM6o8w1OpCp76EtuynzAOZaE-OoQQ&e= >> Webrev: _https://urldefense.proofpoint.com/v2/url?u=http-3A__cr.openjdk.java.net_-7Emhorie_8213754_webrev.00-5F&d=DwIDaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=oecsIpYF-cifqq2i1JEH0Q&m=K5GMqTdhkNMaMnGAqGucn9CsZllScUGvHO427jiTC5E&s=RqwTJXnDC8QoRx_IsFshdedeLrlfJU110wDJcjzjw2c&e=
>> <http://cr.openjdk.java.net/%7Emhorie/8213754/webrev.00>
>>
>> This change includes the intrinsics of Character isDigit,
>> isLowerCase, isUpperCase, and isWhitespace to support theLatin1 block
>> using POWER9’s instructions cmprb and cmpeqb. The cmprb enables to
>> compare a character with 1 or 2 rangedbytes, while the cmpeqb
>> compares one with 1 to 8 values. Simple micro benchmark attached
>> showed improvements by 20-40%.
>> /
>> //(See attached file: Latin1Test.java)/
>>
>>
>> Best regards,
>> --
>> Michihiro,
>> IBM Research - Tokyo
>>
>>







Reply via email to