On Wed, 1 Feb 2023 14:20:05 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/AbstractMask.java
>>  line 236:
>> 
>>> 234:         } else if (offset >= limit) {
>>> 235:             return vectorSpecies().maskAll(false);
>>> 236:         } else if (limit - offset >= length()) {
>> 
>> Can you move this else if check at the top, this is the most general case 
>> appearing in the loop and hence two extra uncommon trap jumps before it for 
>> special cases may penalize this.
>
> Your patch re-organized the code to take care of following cases :-
> 1.   offset < 0                         : Still falling over to old code.
> 2.   offset >= limit                  : Special case optimized 
> 3.   (limit - offset) >= length  :  Most general case.
> 4.   Partial in range check       :  New intrinsic, existing java side 
> implementation in checkIndex0 is also composed of intrinsified calls,  
> infrequently taken paths will anyways lead to uncommon traps.
> 
> Have you tried introducing just case 3 (first) and then case 2 in existing 
> indexInRange implementation.

Thanks for the review @jatin-bhateja !

> Have you tried introducing just case 3 (first) and then case 2 in existing 
> indexInRange implementation.

Yes, I tried with this way as well, together with the performance testing 
compared with current version. The result is:
1. For cases that the array size is aligned with the vector size, the 
performance is almost the same with each other.
2. For cases that the array size is not aligned (which needs the masked 
operation), the performance with current version can improve about 5%~6% 
compared with the way as you suggested. But the improvement is limited to the 
cases with smaller vector size, whose tail loop size is bigger.

So, adding the new intrinsic has no side effect for me, except for the more 
complex java side code and one additional intrinsic.

-------------

PR: https://git.openjdk.org/jdk/pull/12064

Reply via email to