Re: use ARM intrinsics in pg_lfind32() where available

John Naylor Fri, 26 Aug 2022 23:59:35 -0700

On Sat, Aug 27, 2022 at 1:24 AM Nathan Bossart <nathandboss...@gmail.com> wrote:
>
> Here is a rebased patch set that applies to HEAD.


0001:

 #define USE_NO_SIMD
 typedef uint64 Vector8;
+typedef uint64 Vector32;
 #endif

I don't foresee any use of emulating vector registers with uint64 if
they only hold two ints. I wonder if it'd be better if all vector32
functions were guarded with #ifndef USE_NO_SIMD. (I wonder if
declarations without definitions cause warnings...)
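
To illustrate the guarding idea, here is a minimal standalone sketch, not
the patch itself. It assumes the SIMD flavor is chosen from compiler macros,
and the names vector32_any_eq() and sketch_lfind32() are made up for this
example. The point is only that no Vector32 typedef and no vector32
functions exist when USE_NO_SIMD is defined, and the caller falls back to a
plain loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: select the SIMD flavor from compiler macros, for this sketch only. */
#if defined(__SSE2__)
#include <emmintrin.h>
#define USE_SSE2
typedef __m128i Vector32;
#elif defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define USE_NEON
typedef uint32x4_t Vector32;
#else
#define USE_NO_SIMD             /* no Vector32 typedef at all in this case */
#endif

#ifndef USE_NO_SIMD
/* Load four uint32 values from possibly unaligned memory. */
static inline Vector32
vector32_load(const uint32_t *p)
{
#ifdef USE_SSE2
    return _mm_loadu_si128((const __m128i *) p);
#else
    return vld1q_u32(p);
#endif
}

/* Copy c into all four lanes. */
static inline Vector32
vector32_broadcast(uint32_t c)
{
#ifdef USE_SSE2
    return _mm_set1_epi32((int) c);
#else
    return vdupq_n_u32(c);
#endif
}

/* Hypothetical helper: true if any lane of v1 equals the same lane of v2. */
static inline bool
vector32_any_eq(Vector32 v1, Vector32 v2)
{
#ifdef USE_SSE2
    return _mm_movemask_epi8(_mm_cmpeq_epi32(v1, v2)) != 0;
#else
    return vmaxvq_u32(vceqq_u32(v1, v2)) != 0;
#endif
}
#endif                          /* ! USE_NO_SIMD */

/* Hypothetical caller: vector path when available, scalar loop for the rest. */
static bool
sketch_lfind32(uint32_t key, const uint32_t *base, size_t nelem)
{
    size_t      i = 0;

#ifndef USE_NO_SIMD
    Vector32    keys = vector32_broadcast(key);

    for (; i + 4 <= nelem; i += 4)
    {
        if (vector32_any_eq(vector32_load(&base[i]), keys))
            return true;
    }
#endif
    /* Tail elements, or every element when no SIMD is available. */
    for (; i < nelem; i++)
    {
        if (base[i] == key)
            return true;
    }
    return false;
}
```

With this layout the non-SIMD build never sees a vector32 declaration, so
the question about declarations without definitions does not arise.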

+ * NB: This function assumes that each lane in the given vector either has all
+ * bits set or all bits zeroed, as it is mainly intended for use with
+ * operations that produce such vectors (e.g., vector32_eq()).  If this
+ * assumption is not true, this function's behavior is undefined.
+ */

Hmm?

Also, is_highbit_set() already uses the same intrinsic and has the
same intended effect, since we only care about the boolean result.
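
A scalar sketch of why the two checks agree on comparison results. This is
an illustration only: Lanes32 and the three functions below are made-up
names that emulate four 32-bit lanes with a plain array, not anything from
the patch. When every lane is all-ones or all-zeros, "is the high bit of
any byte set" and "is any lane nonzero" give the same boolean; they differ
only for arbitrary lane values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Four 32-bit lanes, emulated with a plain array for illustration. */
typedef struct
{
    uint32_t    lane[4];
} Lanes32;

/* Lane-wise equality: each result lane is all-ones or all-zeros. */
static Lanes32
lanes32_eq(Lanes32 a, Lanes32 b)
{
    Lanes32     r;

    for (int i = 0; i < 4; i++)
        r.lane[i] = (a.lane[i] == b.lane[i]) ? UINT32_MAX : 0;
    return r;
}

/* True if the high bit of any byte is set (what a byte-wise movemask sees). */
static bool
any_byte_highbit_set(Lanes32 v)
{
    for (int i = 0; i < 4; i++)
    {
        if (v.lane[i] & UINT32_C(0x80808080))
            return true;
    }
    return false;
}

/* True if any lane has any bit set. */
static bool
any_lane_nonzero(Lanes32 v)
{
    for (int i = 0; i < 4; i++)
    {
        if (v.lane[i] != 0)
            return true;
    }
    return false;
}
```

For the output of lanes32_eq() both predicates always return the same
value, so a separate function with an "undefined behavior" caveat buys
nothing over the high-bit check.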

0002:

-#elif defined(USE_SSE2)
+#elif defined(USE_SSE2) || defined(USE_NEON)

I think we can just say #else.

-#if defined(USE_SSE2)
- __m128i sub;
+#ifndef USE_NO_SIMD
+ Vector8 sub;

+#elif defined(USE_NEON)
+
+ /* use the same approach as the USE_SSE2 block above */
+ sub = vqsubq_u8(v, vector8_broadcast(c));
+ result = vector8_has_zero(sub);

I think we should invent a helper that does saturating subtraction and
call that, inlining the sub var so we don't need to mess with it
further.
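
Roughly what I have in mind, as a standalone sketch rather than a patch.
The helper name vector8_ussub() and the wrapper has_le16() are placeholders
I made up here, and the SIMD flavor is picked from compiler macros only to
keep the example self-contained. A byte is <= c exactly when the unsigned
saturating subtraction (byte - c) comes out as zero, so the caller needs no
local "sub" variable and no per-architecture branch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: select the SIMD flavor from compiler macros, for this sketch only. */
#if defined(__SSE2__)
#include <emmintrin.h>
#define USE_SSE2
typedef __m128i Vector8;
#elif defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define USE_NEON
typedef uint8x16_t Vector8;
#else
#define USE_NO_SIMD
#endif

#ifndef USE_NO_SIMD
/* Load 16 bytes from possibly unaligned memory. */
static inline Vector8
vector8_load(const uint8_t *p)
{
#ifdef USE_SSE2
    return _mm_loadu_si128((const __m128i *) p);
#else
    return vld1q_u8(p);
#endif
}

/* Copy c into all 16 lanes. */
static inline Vector8
vector8_broadcast(uint8_t c)
{
#ifdef USE_SSE2
    return _mm_set1_epi8((char) c);
#else
    return vdupq_n_u8(c);
#endif
}

/* True if any byte of v is zero. */
static inline bool
vector8_has_zero(Vector8 v)
{
#ifdef USE_SSE2
    return _mm_movemask_epi8(_mm_cmpeq_epi8(v, _mm_setzero_si128())) != 0;
#else
    return vmaxvq_u8(vceqq_u8(v, vdupq_n_u8(0))) != 0;
#endif
}

/* Placeholder name: unsigned saturating subtraction, clamping at zero. */
static inline Vector8
vector8_ussub(Vector8 v1, Vector8 v2)
{
#ifdef USE_SSE2
    return _mm_subs_epu8(v1, v2);
#else
    return vqsubq_u8(v1, v2);
#endif
}
#endif                          /* ! USE_NO_SIMD */

/* Hypothetical wrapper: does any of the 16 bytes at p compare <= c? */
static bool
has_le16(const uint8_t *p, uint8_t c)
{
#ifndef USE_NO_SIMD
    /* One expression for both SSE2 and NEON, with the helper inlined. */
    return vector8_has_zero(vector8_ussub(vector8_load(p),
                                          vector8_broadcast(c)));
#else
    for (int i = 0; i < 16; i++)
    {
        if (p[i] <= c)
            return true;
    }
    return false;
#endif
}
```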

Otherwise seems fine.

-- 
John Naylor
EDB: http://www.enterprisedb.com
