On Mon, 18 Apr 2022 05:14:25 GMT, Jie Fu <ji...@openjdk.org> wrote: >>> > However, just image that someone would like to optimize some code >>> > segments of bytes/shorts `>>>` >>> >>> Then that person can just use signed shift (`VectorOperators.ASHR`), right? >>> Shifting on masked shift counts means that the shift count cannot be >>> greater than 8 for bytes and 16 for shorts, which means that `(byte)(src[i] >>> >>> 3)` is exactly the same as `(byte)(src[i] >> 3)`. Please correct me if >>> I misunderstood something here. >> >> Yes, you're right that's why I said `LSHR` can be replaced with `ASHR`. >> >> However, not all the people are clever enough to do this source code level >> replacement. >> To be honest, I would never think of that `>>>` can be auto-vectorized by >> this idea proposed by https://github.com/openjdk/jdk/pull/7979 . >> >>> >>> Your proposed changes make unsigned shifts for subwords behave exactly the >>> same as signed shifts, which is both redundant (we have 2 operations doing >>> exactly the same thing) and inadequate (we lack the operation to do the >>> proper unsigned shifts) >> >> `LSHR` following the behavior of scalar `>>>` is very important for Java >> developers to rewrite the code with vector api. >> Maybe, we should add one more operator to support what you called `the >> proper unsigned shifts`, right? >> But that's another topic which can be done in a separate issue. > >> @DamonFool >> >> I think the issue is that these two cases of yours are not equal >> semantically. > > Why? > According to the vector api doc, they should compute the same value when the > shift_cnt is 3, right? > >> >> ``` >> 13 public static void urshift(byte[] src, byte[] dst) { >> 14 for (int i = 0; i < src.length; i++) { >> 15 dst[i] = (byte)(src[i] >>> 3); >> 16 } >> 17 } >> 18 >> 19 public static void urshiftVector(byte[] src, byte[] dst) { >> 20 int i = 0; >> 21 for (; i < spec.loopBound(src.length); i +=spec.length()) { >> 22 var va = ByteVector.fromArray(spec, src, i); >> 23 var vb = va.lanewise(VectorOperators.LSHR, 3); >> 24 vb.intoArray(dst, i); >> 25 } >> 26 >> 27 for (; i < src.length; i++) { >> 28 dst[i] = (byte)(src[i] >>> 3); >> 29 } >> 30 } >> ``` >> >> Besides the unsigned shift, line15 also has a type conversion which is >> missing in the vector api case. To get the equivalent result, one need to >> cast the result explicitly at line24, e.g, >> `((IntVector)vb.castShape(SPECISE_XXX, 0)).intoArray(idst, i);` > > Since all the vector operations are already based on byte lane type, I don't > think we need a `cast` operation here. > Can we solve this problem by inserting a cast operation?
@DamonFool `>>>` can not apply to sub-word type in Java. `(byte)(src[i] >>> 3)` is unsigned right shift in type of INT and transformed the result to BYTE. In vector api, it extends the `>>>` to sub-word type with the same semantic meaning like `iushr`[1], that is zero extending. > The vector api docs says it would compute a>>>(n&(ESIZE*8-1)). I think `>>>` has some extending meanings here. As I said above, no sub-word type for `>>>` in Java. [1] https://docs.oracle.com/javase/specs/jvms/se18/html/jvms-6.html#jvms-6.5.iushr ------------- PR: https://git.openjdk.java.net/jdk/pull/8276