On Sun, 17 Apr 2022 14:35:14 GMT, Jie Fu <ji...@openjdk.org> wrote:

> Hi all,
> 
> According to the Vector API doc, the `LSHR` operator computes 
> `a>>>(n&(ESIZE*8-1))`.
> However, current implementation is incorrect for negative bytes/shorts.
> 
> The background is that one of our customers try to vectorize `urshift` with 
> `urshiftVector` like the following.
> 
>  13     public static void urshift(byte[] src, byte[] dst) {
>  14         for (int i = 0; i < src.length; i++) {
>  15             dst[i] = (byte)(src[i] >>> 3);
>  16         }
>  17     }
>  18 
>  19     public static void urshiftVector(byte[] src, byte[] dst) {
>  20         int i = 0;
>  21         for (; i < spec.loopBound(src.length); i +=spec.length()) {
>  22             var va = ByteVector.fromArray(spec, src, i);
>  23             var vb = va.lanewise(VectorOperators.LSHR, 3);
>  24             vb.intoArray(dst, i);
>  25         }
>  26 
>  27         for (; i < src.length; i++) {
>  28             dst[i] = (byte)(src[i] >>> 3);
>  29         }
>  30     }
> 
> 
> Unfortunately and to our surprise, code@line28 computes different results 
> with code@line23.
> It took quite a long time to figure out this bug.
> 
> The root cause is that current implemenation of Vector API can't compute the  
> unsigned right shift results as what is done for scalar `>>>` for negative 
> byte/short elements.
> Actually, current implementation will do `(a & 0xFF) >>> (n & 7)` [1] for all 
> bytes, which is unable to compute the vectorized `>>>` for negative bytes.
> So this seems unreasonable and unfriendly to Java developers.
> It would be better to fix it.
> 
> The key idea to support unsigned right shift of negative bytes/shorts is just 
> to replace the unsigned right shift operation with the signed right shift 
> operation.
> This logic is:
> - For byte elements, unsigned right shift is equal to signed right shift if 
> the shift_cnt <= 24.
> - For short elements, unsigned right shift is equal to signed right shift if 
> the shift_cnt <= 16.
> - For Vector API, the shift_cnt will be masked to shift_cnt <= 7 for bytes 
> and shift_cnt <= 15 for shorts.
> 
> I just learned this idea from https://github.com/openjdk/jdk/pull/7979 .
> And many thanks to @fg1417 .
> 
> 
> Thanks.
> Best regards,
> Jie
> 
> [1] 
> https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java#L935

Hi,

The `>>>` operator is not defined for subword types, what the code in line 28 
does vs what it is supposed to do is different, which is more likely the bug 
here. An unsigned shift should operate on subword types the same as it does on 
word and double-word types, which is to zero extend the value before shifting 
it rightwards.

Another argument would be that an unsigned shift operates on the unsigned 
types, and the signed cast exposes this misunderstanding regarding the 
operation.

Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/8276

Reply via email to