On Mon, 18 Apr 2022 05:09:33 GMT, Jie Fu <ji...@openjdk.org> wrote:

>>> However, just image that someone would like to optimize some code segments 
>>> of bytes/shorts `>>>`
>> 
>> Then that person can just use signed shift (`VectorOperators.ASHR`), right? 
>> Shifting on masked shift counts means that the shift count cannot be greater 
>> than 8 for bytes and 16 for shorts, which means that `(byte)(src[i] >>> 3)` 
>> is exactly the same as `(byte)(src[i] >> 3)`. Please correct me if I 
>> misunderstood something here.
>> 
>> Your proposed changes make unsigned shifts for subwords behave exactly the 
>> same as signed shifts, which is both redundant (we have 2 operations doing 
>> exactly the same thing) and inadequate (we lack the operation to do the 
>> proper unsigned shifts)
>> 
>> Thank you very much.
>
>> > However, just image that someone would like to optimize some code segments 
>> > of bytes/shorts `>>>`
>> 
>> Then that person can just use signed shift (`VectorOperators.ASHR`), right? 
>> Shifting on masked shift counts means that the shift count cannot be greater 
>> than 8 for bytes and 16 for shorts, which means that `(byte)(src[i] >>> 3)` 
>> is exactly the same as `(byte)(src[i] >> 3)`. Please correct me if I 
>> misunderstood something here.
> 
> Yes, you're right that's why I said `LSHR` can be replaced with `ASHR`.
> 
> However, not all the people are clever enough to do this source code level 
> replacement.
> To be honest, I would never think of that `>>>` can be auto-vectorized by 
> this idea proposed by https://github.com/openjdk/jdk/pull/7979 .
> 
>> 
>> Your proposed changes make unsigned shifts for subwords behave exactly the 
>> same as signed shifts, which is both redundant (we have 2 operations doing 
>> exactly the same thing) and inadequate (we lack the operation to do the 
>> proper unsigned shifts)
> 
> `LSHR` following the behavior of scalar `>>>` is very important for Java 
> developers to rewrite the code with vector api.
> Maybe, we should add one more operator to support what you called `the proper 
> unsigned shifts`, right?
> But that's another topic which can be done in a separate issue.

> @DamonFool
> 
> I think the issue is that these two cases of yours are not equal semantically.

Why?
According to the vector api doc, they should compute the same value when the 
shift_cnt is 3, right?

> 
> ```
>  13     public static void urshift(byte[] src, byte[] dst) {
>  14         for (int i = 0; i < src.length; i++) {
>  15             dst[i] = (byte)(src[i] >>> 3);
>  16         }
>  17     }
>  18 
>  19     public static void urshiftVector(byte[] src, byte[] dst) {
>  20         int i = 0;
>  21         for (; i < spec.loopBound(src.length); i +=spec.length()) {
>  22             var va = ByteVector.fromArray(spec, src, i);
>  23             var vb = va.lanewise(VectorOperators.LSHR, 3);
>  24             vb.intoArray(dst, i);
>  25         }
>  26 
>  27         for (; i < src.length; i++) {
>  28             dst[i] = (byte)(src[i] >>> 3);
>  29         }
>  30     }
> ```
> 
> Besides the unsigned shift, line15 also has a type conversion which is 
> missing in the vector api case. To get the equivalent result, one need to 
> cast the result explicitly at line24, e.g, 
> `((IntVector)vb.castShape(SPECISE_XXX, 0)).intoArray(idst, i);`

Since all the vector operations are already based on byte lane type, I don't 
think we need a `cast` operation here.
Can we solve this problem by inserting a cast operation?

-------------

PR: https://git.openjdk.java.net/jdk/pull/8276

Reply via email to