Re: RFR: 8196331: Optimize Character.digit for latin1 input

2018-02-02 Thread Ulf Zibis
Hi Claes, sorry for the delay, I didn't catch you message before, because you CC'd me which causes that my email filter misses the ListID tag to catch. Thanks for the update, times change with the compact strings feature. Another approach could be to access the String constant via Unsafe. -U

Re: RFR: 8196331: Optimize Character.digit for latin1 input

2018-01-30 Thread Paul Sandoz
OK! A surprising increase in code size. Thanks for checking. Paul. > On Jan 30, 2018, at 2:40 AM, Claes Redestad wrote: > > The ASM is harder than usual to follow and compare since everything is > inlined aggressively, but it seems that since CharacterDataLatin1 is only > invoked for 0 <= ch <

Re: RFR: 8196331: Optimize Character.digit for latin1 input

2018-01-30 Thread Xueming Shen
+1 On Jan 30, 2018, at 2:40 AM, Claes Redestad wrote: The ASM is harder than usual to follow and compare since everything is inlined aggressively, but it seems that since CharacterDataLatin1 is only invoked for 0 <= ch < 256 (invariant established in CharacterData.of(int ch)) then the compiler i

Re: RFR: 8196331: Optimize Character.digit for latin1 input

2018-01-30 Thread Claes Redestad
The ASM is harder than usual to follow and compare since everything is inlined aggressively, but it seems that since CharacterDataLatin1 is only invoked for 0 <= ch < 256 (invariant established in CharacterData.of(int ch)) then the compiler is able to elide bounds check entirely when the byte[]

Re: RFR: 8196331: Optimize Character.digit for latin1 input

2018-01-30 Thread Claes Redestad
Hi, not sure what you're suggesting, exactly, but it seems Charset coders generate String literals which are then unpacked to char[]/byte[]'s in static blocks. While constructing arrays from String literals might have been a startup optimization in the past, it is now mainly a workaround to m

Re: RFR: 8196331: Optimize Character.digit for latin1 input

2018-01-29 Thread Ulf Zibis
Hi, you may can construct the lookup table as a static String constant to slim down the footprint and treat it as a byte[] as we do in the Charset coders. -Ulf Am 29.01.2018 um 22:04 schrieb Claes Redestad: I'll experiment and analyze a bit more tomorrow to see if I can find a good explana

Re: RFR: 8196331: Optimize Character.digit for latin1 input

2018-01-29 Thread Claes Redestad
Right, I can't really explain why, but the effect is very visible and reproducible in microbenchmarks. Differences in generated ASM might be indicating that the inlining behavior changes enough to shift the result around; maybe a job for @ForceInline? I'll experiment and analyze a bit more tomorr

Re: RFR: 8196331: Optimize Character.digit for latin1 input

2018-01-29 Thread Paul Sandoz
Smaller in only the upper bound? I would an explicit upper bounds check would dominate that of the bounds check for the array itself. Paul. > On Jan 29, 2018, at 11:37 AM, Claes Redestad > wrote: > > I ran with a smaller byte[] initially and saw about a 10% improvement from > removing the br

Re: RFR: 8196331: Optimize Character.digit for latin1 input

2018-01-29 Thread Claes Redestad
I ran with a smaller byte[] initially and saw about a 10% improvement from removing the branch, which is why I felt the superfluous bytes were motivated. /Claes Paul Sandoz skrev: (29 januari 2018 19:14:44 CET) > > >> On Jan 29, 2018, at 9:44 AM, Martin Buchholz >wrote: >> >> Thanks. I might

Re: RFR: 8196331: Optimize Character.digit for latin1 input

2018-01-29 Thread Paul Sandoz
> On Jan 29, 2018, at 9:44 AM, Martin Buchholz wrote: > > Thanks. I might try to shrink the size of the static array, by only > storing values between '0' and 'z', at the cost of slightly greater lookup > costs for each char. > I was wondering the same, or just clip the end of the array to’z

Re: RFR: 8196331: Optimize Character.digit for latin1 input

2018-01-29 Thread Martin Buchholz
Thanks. I might try to shrink the size of the static array, by only storing values between '0' and 'z', at the cost of slightly greater lookup costs for each char. On Mon, Jan 29, 2018 at 3:15 AM, Claes Redestad wrote: > Hi, > > for the latin1 block of CharacterData we can improve the > digit(i

RFR: 8196331: Optimize Character.digit for latin1 input

2018-01-29 Thread Claes Redestad
Hi, for the latin1 block of CharacterData we can improve the digit(int, int) implementation by providing an optimized lookup table. This improves microbenchmarks exercising Integer.parseInt, Long.parseLong and UUID.fromString etc by around 50%for typical inputs. Webrev: http://cr.openjdk.java.ne