Here is an optimized kernel ffs(3) for arm64.

I blame dlg@ for making me scrutinize the POWER7 instruction set,
which led me to the clz instruction, which led me to ffs().  I
wanted to add this to libc, but then realized the futility because
the compiler already inlines its own optimized copy of ffs().  The
kernel, however, is built with -ffreestanding, which disables that
optimization.

Honestly, I'm not sure I can claim to have written this.  The elegant
version below is adapted from clang output, because the compiler
is smarter than I am.
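
For anyone not fluent in arm64, here is a rough C model of what the
four instructions in the patch compute.  It is only a sketch for
reasoning about the assembly, not proposed code: ffs_model, rbit32
and clz32 are illustrative names that do not exist in the tree, and
the loops stand in for single instructions.

```c
#include <stdint.h>

/* Reverse the bits of a 32-bit word; models "rbit w1, w0". */
static uint32_t
rbit32(uint32_t x)
{
	x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
	x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
	x = ((x >> 4) & 0x0f0f0f0fu) | ((x & 0x0f0f0f0fu) << 4);
	x = ((x >> 8) & 0x00ff00ffu) | ((x & 0x00ff00ffu) << 8);
	return (x >> 16) | (x << 16);
}

/* Count leading zeros; models "clz w1, w1", which yields 32 for 0. */
static int
clz32(uint32_t x)
{
	int n = 0;

	if (x == 0)
		return 32;
	while ((x & 0x80000000u) == 0) {
		x <<= 1;
		n++;
	}
	return n;
}

/* Hypothetical reference model of the patched ffs(); not in the tree. */
int
ffs_model(int mask)
{
	uint32_t m = (uint32_t)mask;
	/* Leading zeros of the reversed word == trailing zeros of m. */
	int tz = clz32(rbit32(m));

	/* "csinc w0, wzr, w1, eq": 0 if the input was 0, else tz + 1. */
	return m == 0 ? 0 : tz + 1;
}
```

The conditional at the end is the whole point of the csinc: ffs()
numbers bits from 1 and returns 0 for a zero argument, so the
trailing-zero count is incremented except in the zero case.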

More archs to come...

Comments?  OK?

Index: lib/libkern/arch/arm64/ffs.S
===================================================================
RCS file: lib/libkern/arch/arm64/ffs.S
diff -N lib/libkern/arch/arm64/ffs.S
--- /dev/null   1 Jan 1970 00:00:00 -0000
+++ lib/libkern/arch/arm64/ffs.S        8 Jun 2020 20:35:01 -0000
@@ -0,0 +1,17 @@
+/*     $OpenBSD$ */
+/*
+ * Written by Christian Weisgerber <[email protected]>.
+ * Public domain.
+ */
+
+#include <machine/asm.h>
+
+ENTRY(ffs)
+       RETGUARD_SETUP(ffs, x15)
+       rbit    w1, w0
+       clz     w1, w1
+       cmp     w0, #0
+       csinc   w0, wzr, w1, eq
+       RETGUARD_CHECK(ffs, x15)
+       ret
+END(ffs)
-- 
Christian "naddy" Weisgerber                          [email protected]
