On Wed, 14 Oct 2020 21:29:55 GMT, CoreyAshford 
<github.com+51754783+coreyashf...@openjdk.org> wrote:

>> Hi Corey,
>> 
>>> Are you thinking of a case where that produces a higher iteration count?
>> Sorry for the confusion. This is also fine:
>> length = sl - sp - 12
>> i = length / block_size
>> if (i <= 0) return 0
>> 
>> But I still wonder why we should use 2 branches. Why not
>> srawi_
>> ble(CCR0, return_zero)
>> ?
>> 
>>> Ah, I should have checked the calling conventions. I thought all of the CR* 
>>> regs are volatile. I will fix that.
>> Actually, we do save and restore all CRs, so it’s not a real problem with 
>> the current implementation. But I prefer
>> staying closer to the elf ABI as long as there’s no good reason to do it 
>> differently.
>>> Your original comment said "2nd review", so I thought you meant you need to 
>>> review it again after the changes.
>> We usually require at least 2 reviews by different people for all 
>> non-trivial changes. And I don’t consider the PPC64
>> part as trivial. In addition to that, I’m not familiar with Power 10.
>> Best regards,
>> Martin
>> 
>> 
>> From: CoreyAshford <notificati...@github.com>
>> Sent: Dienstag, 13. Oktober 2020 22:59
>> To: openjdk/jdk <j...@noreply.github.com>
>> Cc: Doerr, Martin <martin.do...@sap.com>; Mention 
>> <ment...@noreply.github.com>
>> Subject: Re: [openjdk/jdk] 8248188: Add IntrinsicCandidate and API for 
>> Base64 decoding (#293)
>> 
>> 
>> Hi Corey, thanks for taking some stuff out of the “too short” path. There 
>> may be a performance regression when decoding
>> many short arrays because of the stub call overhead and the usage of the 
>> slower part of the Java implementation. We
>> could do it a little better in many cases to compute the maximum possible 
>> iteration count i: i = (sl - sp) / block_size
>> if (i * block_size > sl - 12) i-- if (i <= 0) return 0 What do you think?  
>> Are you thinking of a case where that
>> produces a higher iteration count? It looks effectively the same to me.  I 
>> don’t think branch prediction hints are
>> helpful for the “too short” check.
>> My thinking is that most of the time when the intrinsic is called, it will 
>> not take the early exit, but I suppose when
>> it is processing a sub-block_size buffer, it will return early every time. I 
>> will remove the hints.
>> And we should better use CCR1 instead of CCR2 which is specified as 
>> non-volatile.
>> 
>> Ah, I should have checked the calling conventions. I thought all of the CR* 
>> regs are volatile. I will fix that.
>> 
>> Did you already find a 2nd reviewer for the PPC64 part?
>> 
>> Your original comment said "2nd review", so I thought you meant you need to 
>> review it again after the changes. So, no,
>> I haven't looked for or found a second reviewer. Any suggestions? The folks 
>> on the team here have been busy with other
>> work.  Btw, I'm off today, so I will push commits to the above-mentioned 
>> issues tomorrow.
>> 
>> —
>> You are receiving this because you were mentioned.
>> Reply to this email directly, view it on 
>> GitHub<https://github.com/openjdk/jdk/pull/293#issuecomment-708005545>, or
>> unsubscribe<https://github.com/notifications/unsubscribe-auth/AKR64KC2K7IZFAPXVQIVYZDSKS5SLANCNFSM4RVHNW5Q>.
>
>> Hi Corey,
>> Are you thinking of a case where that produces a higher iteration count?
>> Sorry for the confusion. This is also fine: length = sl - sp - 12 i = length 
>> / block_size if (i <= 0) return 0 But I
>> still wonder why we should use 2 branches. Why not srawi_ ble(CCR0, 
>> return_zero) ?
> 
> You're right!  I originally thought that the `srawi.` was setting only the 
> Zero bit, but it sets others as well.
> 
>> Ah, I should have checked the calling conventions. I thought all of the CR* 
>> regs are volatile. I will fix that.
>> Actually, we do save and restore all CRs, so it’s not a real problem with 
>> the current implementation. But I prefer
>> staying closer to the elf ABI as long as there’s no good reason to do it 
>> differently.
> 
> Looks like I don't need that code at all now, but it's good to know for the 
> future; I have an encode intrinsic in the
> works.
>> Your original comment said "2nd review", so I thought you meant you need to 
>> review it again after the changes.
>> We usually require at least 2 reviews by different people for all 
>> non-trivial changes. And I don’t consider the PPC64
>> part as trivial. In addition to that, I’m not familiar with Power 10.
> 
> I received permission to request help from the GNU toolchain team here to 
> review it.  Due to family issues and work
> schedule on my end, it will be at least the middle of next week before I can 
> get a reviewer to have a look.
> Thanks for your continued patience and help.

Please update
[compiler/graalunit/HotspotTest.java](https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/graalunit/HotspotTest.java),
and add the intrinsic signature.

-------------

PR: https://git.openjdk.java.net/jdk/pull/293

Reply via email to