On Sun, 17 Apr 2022 14:35:14 GMT, Jie Fu <ji...@openjdk.org> wrote: > Hi all, > > According to the Vector API doc, the `LSHR` operator computes > `a>>>(n&(ESIZE*8-1))`. > However, current implementation is incorrect for negative bytes/shorts. > > The background is that one of our customers try to vectorize `urshift` with > `urshiftVector` like the following. > > 13 public static void urshift(byte[] src, byte[] dst) { > 14 for (int i = 0; i < src.length; i++) { > 15 dst[i] = (byte)(src[i] >>> 3); > 16 } > 17 } > 18 > 19 public static void urshiftVector(byte[] src, byte[] dst) { > 20 int i = 0; > 21 for (; i < spec.loopBound(src.length); i +=spec.length()) { > 22 var va = ByteVector.fromArray(spec, src, i); > 23 var vb = va.lanewise(VectorOperators.LSHR, 3); > 24 vb.intoArray(dst, i); > 25 } > 26 > 27 for (; i < src.length; i++) { > 28 dst[i] = (byte)(src[i] >>> 3); > 29 } > 30 } > > > Unfortunately and to our surprise, code@line28 computes different results > with code@line23. > It took quite a long time to figure out this bug. > > The root cause is that current implemenation of Vector API can't compute the > unsigned right shift results as what is done for scalar `>>>` for negative > byte/short elements. > Actually, current implementation will do `(a & 0xFF) >>> (n & 7)` [1] for all > bytes, which is unable to compute the vectorized `>>>` for negative bytes. > So this seems unreasonable and unfriendly to Java developers. > It would be better to fix it. > > The key idea to support unsigned right shift of negative bytes/shorts is just > to replace the unsigned right shift operation with the signed right shift > operation. > This logic is: > - For byte elements, unsigned right shift is equal to signed right shift if > the shift_cnt <= 24. > - For short elements, unsigned right shift is equal to signed right shift if > the shift_cnt <= 16. > - For Vector API, the shift_cnt will be masked to shift_cnt <= 7 for bytes > and shift_cnt <= 15 for shorts. > > I just learned this idea from https://github.com/openjdk/jdk/pull/7979 . > And many thanks to @fg1417 . > > > Thanks. > Best regards, > Jie > > [1] > https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java#L935
Hi, According to JLS section 5.8, any operand in a numeric arithmetic context undergoes a promotion to int, long, float or double and the operation is only defined for values of the promoted types. This means that `>>>` is not defined for byte/short values and the real behaviour here is that `src[i]` get promoted to int by a signed cast before entering the unsigned shift operation. This is different from `VectorOperators.LSHR` which is defined for byte/short element types. The scalar code does not do a byte unsigned shift but an int unsigned shift between a promotion and a narrowing cast, the explicit cast `(byte)` notifies the user of this behaviour. Secondly, consistency is the key, having a byte unsigned shift promoted elements to int is not consistent, I can argue why aren't int elements being promoted to longs, or longs being promoted to the 128-bit integral type, too. Finally, as I have mentioned in #7979, this usage of unsigned shift seems more likely to be a bug than an intended behaviour, so we should not bother to optimise this pattern. Thank you very much. ------------- PR: https://git.openjdk.java.net/jdk/pull/8276