On Mon, 18 Sep 2023 13:51:06 GMT, Chen Liang <li...@openjdk.org> wrote:

>> In the improvement of @cl4es PR #15591, the advantages of non-lookup-table 
>> were discussed.
>> 
>> But if the input is byte[], using lookup table can improve performance.
>> 
>> For HexFormat#formatHex(Appendable, byte[]) and HexFormat#formatHex(byte[]), 
>> If the length of byte[] is larger, the performance of table lookup will be 
>> improved more obviously.
>
> src/java.base/share/classes/java/util/HexFormat.java line 664:
> 
>> 662:     public char toHighHexDigit(int value) {
>> 663:         value = (value >> 4) & 0xf;
>> 664:         return (char) ((value < 10 ? '0' : hexBase) + value);
> 
> These changes seem to be the reason `toHexDigitsLong` consistently sees a 
> slight performance drop. Can any professional engineer explain why the old 
> version, which would be like `(char) (value < 10 ? '0' + value : hexBase + 
> value)`, is slightly faster?
> 
> Since the effect of this change `hexBase` is not clear, I recommend dropping 
> this for now (remove the field and changes to the 2 methods).

Yes, please drop this. Possibly re-attempting it in a future PR, though I'm a 
bit skeptical of the potential win here.

I did test something like `hexBase + value` in #15591 to a similar effect, so 
opted for just storing `digitCase`. My best guess as to why are that the JIT is 
simply better at (or more eager to) eliminating untaken paths than it is at 
constant fold the field load. Or the branch isn't eliminated and we're seeing 
the effect of branch prediction. I'd recommend analyzing the asm generated (via 
e.g. `-prof perfasm`) to understand this in depth

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15768#discussion_r1328842647

Reply via email to