+1

On Jan 30, 2018, at 2:40 AM, Claes Redestad <claes.redes...@oracle.com> wrote:

The ASM is harder than usual to follow and compare since everything is
inlined aggressively, but it seems that since CharacterDataLatin1 is only
invoked for 0 <= ch < 256 (invariant established in CharacterData.of(int ch))
then the compiler is able to elide bounds check entirely when the byte[] is
also 256 elements.

Shrinking the array adds more branches and grows the compiled code size
for UUID.fromString from 751 to 1341 bytes, so it seems that even from a
footprint perspective then keeping the array at 256 elements is a win. :-)

/Claes

> On 2018-01-29 22:04, Claes Redestad wrote:
> Right, I can't really explain why, but the effect is very visible and
> reproducible in microbenchmarks. Differences in generated ASM might
> be indicating that the inlining behavior changes enough to shift the
> result around; maybe a job for @ForceInline?
> 
> I'll experiment and analyze a bit more tomorrow to see if I can find a
> good explanation for the observed difference and/or a solution that
> allows us to slim down the lookup array.
> 
> /Claes
> 
>> On 2018-01-29 20:38, Paul Sandoz wrote:
>> Smaller in only the upper bound? I would an explicit upper bounds check 
>> would dominate that of the bounds check for the array itself.
>> 
>> Paul.
>> 
>>> On Jan 29, 2018, at 11:37 AM, Claes Redestad <claes.redes...@oracle.com 
>>> <mailto:claes.redes...@oracle.com>> wrote:
>>> 
>>> I ran with a smaller byte[] initially and saw about a 10% improvement from 
>>> removing the branch, which is why I felt the superfluous bytes were 
>>> motivated.
>>> 
>>> /Claes
>>> 
>>> Paul Sandoz <paul.san...@oracle.com <mailto:paul.san...@oracle.com>> skrev: 
>>> (29 januari 2018 19:14:44 CET)
>>> 
>>> 
>>>         On Jan 29, 2018, at 9:44 AM, Martin Buchholz
>>>         <marti...@google.com <mailto:marti...@google.com>> wrote:
>>>         Thanks. I might try to shrink the size of the static array,
>>>         by only storing values between '0' and 'z', at the cost of
>>>         slightly greater lookup costs for each char.
>>> 
>>>     I was wondering the same, or just clip the end of the array to’z'
>>> 
>>>     if (ch <= ‘z’ && radix …) { // Might even fold the upper bounds check 
>>> for DIGITS
>>>        value = DIGITS[ch];
>>>        ...
>>>     }
>>> 
>>>     Paul.
>>> 
>>>         On Mon, Jan 29, 2018 at 3:15 AM, Claes Redestad
>>>         <claes.redes...@oracle.com
>>>         <mailto:claes.redes...@oracle.com>> wrote:
>>> 
>>>             Hi, for the latin1 block of CharacterData we can improve
>>>             the digit(int, int) implementation by providing an
>>>             optimized lookup table. This improves microbenchmarks
>>>             exercising Integer.parseInt, Long.parseLong and
>>>             UUID.fromString etc by around 50%for typical inputs.
>>>             Webrev:
>>> http://cr.openjdk.java.net/~redestad/8196331/open.00/
>>> <http://cr.openjdk.java.net/%7Eredestad/8196331/open.00/>
>>>             Bug: https://bugs.openjdk.java.net/browse/JDK-8196331 The
>>>             lookup array is pre-calculated to minimize startup impact
>>>             (adds 1,027 bytecodes executed during initialization) /Claes
>>> 
>>> 
>>> -- 
>>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>> 
> 


Reply via email to