On Monday, 8 December 2014 at 16:32:50 UTC, Martin Nowak wrote:
I want to do bounds checking of 2 (4 on AVX2) ulongs (64-bit) at a time.

ulong2 vval = [v0, v1];
ulong2 vlow = [low, low];
ulong2 vhigh = [high, high];

int res = PMOVMSKB((vval >= vlow) & (vval < vhigh));

I figured out sort of a solution, but it seems way too complicated, because there is only signed comparison.

In scalar code I'd usually use this, which relies on unsigned wraparound to save one conditional:

immutable size = cast(ulong)(high - low);
if (cast(ulong)(v0 - low) < size) {}
if (cast(ulong)(v1 - low) < size) {}

instead of

if (v0 >= low && v0 < high) {}
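
For reference, the scalar wraparound trick can be sketched as a small D function. The name inBounds is hypothetical (it is not from the post), and the sketch assumes low <= high: when v is below low, the subtraction wraps around to a very large value, so the single comparison rejects it.

```d
// Hypothetical helper illustrating the single-compare bounds check.
// Assumes low <= high; the result is meaningless otherwise.
bool inBounds(ulong v, ulong low, ulong high) pure nothrow @nogc @safe
{
    // v < low  => v - low wraps to a huge value, so the compare fails.
    // v >= low => v - low is the offset into the range [low, high).
    return v - low < high - low;
}

unittest
{
    assert(inBounds(5, 5, 10));   // lower bound is inclusive
    assert(inBounds(9, 5, 10));
    assert(!inBounds(10, 5, 10)); // upper bound is exclusive
    assert(!inBounds(4, 5, 10));  // 4 - 5 wraps around
}
```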

Maybe this can be used on SIMD too (saturated sub or so)?

-Martin

Another solution, in SSE pseudo-code:

  values <- [vval0, vval1] // two values to bounds-check; would be four with AVX2

  low <- [vlow, vlow]
  high <- [vhigh, vhigh]

  subl <- values - low // using the PSUBQ instruction (SSE2)
  subh <- values - high // using the PSUBQ instruction (SSE2)
  // PANDN (SSE2) computes subh & ~subl: the sign bit is set only if subl >= 0 and subh < 0
  mask <- subh andnot subl

  // only logical shift is available for 64-bit integers until AVX-512
  // using PSRLQ to shift the sign bit down to bit 0 (SSE2)
  vresult <- mask >> 63 (for each qword)


Now each 64-bit word in vresult contains exactly 1 if the corresponding value is within bounds, and 0 otherwise.
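
To check the lane logic without touching intrinsics, here is a scalar model of the pseudo-code in plain D, with one loop iteration standing in for one 64-bit lane. It is only a sketch: checkLanes is a hypothetical name, no SIMD instructions are emitted, and the sign-bit test assumes low <= high with a range no wider than 2^63, since the subtractions are interpreted as signed differences.

```d
// Scalar model of the two SSE2 lanes; checkLanes is a hypothetical name.
// Assumes low <= high and high - low <= 2^63.
ulong[2] checkLanes(const ulong[2] values, ulong low, ulong high)
    pure nothrow @nogc @safe
{
    ulong[2] result;
    foreach (i, v; values)
    {
        immutable ulong subl = v - low;      // models PSUBQ
        immutable ulong subh = v - high;     // models PSUBQ
        immutable ulong mask = subh & ~subl; // models PANDN
        result[i] = mask >> 63;              // models PSRLQ by 63
    }
    return result;
}

unittest
{
    ulong[2] vals = [10, 20];
    immutable r = checkLanes(vals, 10, 20);
    assert(r[0] == 1); // 10 is inside [10, 20)
    assert(r[1] == 0); // 20 is outside, the upper bound is exclusive
}
```

With real SSE2 code the final shift could be dropped in favour of a movemask instruction on the sign bits, which yields the integer result the original question asks for.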
