Re: libkern: ffs() for arm64, powerpc, powerpc64

2020-06-10 Thread Mark Kettenis
> Date: Wed, 10 Jun 2020 20:08:31 +0200
> From: Christian Weisgerber 
> 
> Next try.
> Optimized versions for kernel ffs(3) on arm64, powerpc, powerpc64.
> 
> I have tested arm64; cwen@ has tested powerpc in userland.
> powerpc64 is copied from powerpc.
> 
> ok?

ok kettenis@

> Index: lib/libkern/arch/arm64/ffs.S
> ===
> RCS file: lib/libkern/arch/arm64/ffs.S
> diff -N lib/libkern/arch/arm64/ffs.S
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ lib/libkern/arch/arm64/ffs.S  10 Jun 2020 17:38:50 -
> @@ -0,0 +1,17 @@
> +/*   $OpenBSD$ */
> +/*
> + * Written by Christian Weisgerber .
> + * Public domain.
> + */
> + 
> +#include 
> +
> +ENTRY(ffs)
> + RETGUARD_SETUP(ffs, x15)
> + rbitw1, w0
> + clz w1, w1
> + cmp w0, wzr
> + csinc   w0, wzr, w1, eq
> + RETGUARD_CHECK(ffs, x15)
> + ret
> +END(ffs)
> Index: lib/libkern/arch/powerpc/ffs.S
> ===
> RCS file: lib/libkern/arch/powerpc/ffs.S
> diff -N lib/libkern/arch/powerpc/ffs.S
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ lib/libkern/arch/powerpc/ffs.S10 Jun 2020 17:39:02 -
> @@ -0,0 +1,15 @@
> +/*   $OpenBSD$ */
> +/*
> + * Written by Christian Weisgerber .
> + * Public domain.
> + */
> + 
> +#include 
> +
> +ENTRY(ffs)
> + neg %r4, %r3
> + and %r3, %r3, %r4
> + cntlzw  %r3, %r3
> + subfic  %r3, %r3, 32
> + blr
> +END(ffs)
> Index: lib/libkern/arch/powerpc64/ffs.S
> ===
> RCS file: lib/libkern/arch/powerpc64/ffs.S
> diff -N lib/libkern/arch/powerpc64/ffs.S
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ lib/libkern/arch/powerpc64/ffs.S  10 Jun 2020 17:39:06 -
> @@ -0,0 +1,15 @@
> +/*   $OpenBSD$ */
> +/*
> + * Written by Christian Weisgerber .
> + * Public domain.
> + */
> + 
> +#include 
> +
> +ENTRY(ffs)
> + neg %r4, %r3
> + and %r3, %r3, %r4
> + cntlzw  %r3, %r3
> + subfic  %r3, %r3, 32
> + blr
> +END(ffs)
> -- 
> Christian "naddy" Weisgerber  na...@mips.inka.de
> 
> 



Re: libkern: ffs() for arm64, powerpc, powerpc64

2020-06-10 Thread Christian Weisgerber
Next try.
Optimized versions for kernel ffs(3) on arm64, powerpc, powerpc64.

I have tested arm64; cwen@ has tested powerpc in userland.
powerpc64 is copied from powerpc.

ok?


Index: lib/libkern/arch/arm64/ffs.S
===
RCS file: lib/libkern/arch/arm64/ffs.S
diff -N lib/libkern/arch/arm64/ffs.S
--- /dev/null   1 Jan 1970 00:00:00 -
+++ lib/libkern/arch/arm64/ffs.S10 Jun 2020 17:38:50 -
@@ -0,0 +1,17 @@
+/* $OpenBSD$ */
+/*
+ * Written by Christian Weisgerber .
+ * Public domain.
+ */
+ 
+#include 
+
+ENTRY(ffs)
+   RETGUARD_SETUP(ffs, x15)
+   rbitw1, w0
+   clz w1, w1
+   cmp w0, wzr
+   csinc   w0, wzr, w1, eq
+   RETGUARD_CHECK(ffs, x15)
+   ret
+END(ffs)
Index: lib/libkern/arch/powerpc/ffs.S
===
RCS file: lib/libkern/arch/powerpc/ffs.S
diff -N lib/libkern/arch/powerpc/ffs.S
--- /dev/null   1 Jan 1970 00:00:00 -
+++ lib/libkern/arch/powerpc/ffs.S  10 Jun 2020 17:39:02 -
@@ -0,0 +1,15 @@
+/* $OpenBSD$ */
+/*
+ * Written by Christian Weisgerber .
+ * Public domain.
+ */
+ 
+#include 
+
+ENTRY(ffs)
+   neg %r4, %r3
+   and %r3, %r3, %r4
+   cntlzw  %r3, %r3
+   subfic  %r3, %r3, 32
+   blr
+END(ffs)
Index: lib/libkern/arch/powerpc64/ffs.S
===
RCS file: lib/libkern/arch/powerpc64/ffs.S
diff -N lib/libkern/arch/powerpc64/ffs.S
--- /dev/null   1 Jan 1970 00:00:00 -
+++ lib/libkern/arch/powerpc64/ffs.S10 Jun 2020 17:39:06 -
@@ -0,0 +1,15 @@
+/* $OpenBSD$ */
+/*
+ * Written by Christian Weisgerber .
+ * Public domain.
+ */
+ 
+#include 
+
+ENTRY(ffs)
+   neg %r4, %r3
+   and %r3, %r3, %r4
+   cntlzw  %r3, %r3
+   subfic  %r3, %r3, 32
+   blr
+END(ffs)
-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: libkern: ffs() for arm64, powerpc, powerpc64

2020-06-09 Thread Theo de Raadt
Mark Kettenis  wrote:

> > Date: Tue, 9 Jun 2020 23:43:50 +0200
> > From: Christian Weisgerber 
> > 
> > Mark Kettenis:
> > 
> > > Unfortunately that doesn't quite work.  At least in my build it
> > > doesn't pick up .c files in the linker/arch directories.
> > > 
> > > > Index: lib/libkern/arch/arm64/ffs.c
> > 
> > I was certain I had checked this, but indeed it doesn't work.
> > 
> > The Makefile rules are generated from sys/conf/files:
> > 
> > ...
> > file lib/libkern/arch/${MACHINE_ARCH}/ffs.S | lib/libkern/ffs.c
> > ...
> > 
> > We could extend this by a third file.  Ugly.
> > Alternatively I could go back to .S, sigh.
> 
> FWIW, .S would be perfectly fine with me.  This is writeonce, forget
> forevere code so maintaining .S files isn't a huge burden.

I agree.

The important thing that would do is keep the build/infrastructure
frameworks simpler.  Overcomplicating those risks "oh but which file
first" types of issues which always burn me first.



Re: libkern: ffs() for arm64, powerpc, powerpc64

2020-06-09 Thread Mark Kettenis
> Date: Tue, 9 Jun 2020 23:43:50 +0200
> From: Christian Weisgerber 
> 
> Mark Kettenis:
> 
> > Unfortunately that doesn't quite work.  At least in my build it
> > doesn't pick up .c files in the linker/arch directories.
> > 
> > > Index: lib/libkern/arch/arm64/ffs.c
> 
> I was certain I had checked this, but indeed it doesn't work.
> 
> The Makefile rules are generated from sys/conf/files:
> 
> ...
> file lib/libkern/arch/${MACHINE_ARCH}/ffs.S | lib/libkern/ffs.c
> ...
> 
> We could extend this by a third file.  Ugly.
> Alternatively I could go back to .S, sigh.

FWIW, .S would be perfectly fine with me.  This is writeonce, forget
forevere code so maintaining .S files isn't a huge burden.



Re: libkern: ffs() for arm64, powerpc, powerpc64

2020-06-09 Thread Christian Weisgerber
Mark Kettenis:

> Unfortunately that doesn't quite work.  At least in my build it
> doesn't pick up .c files in the linker/arch directories.
> 
> > Index: lib/libkern/arch/arm64/ffs.c

I was certain I had checked this, but indeed it doesn't work.

The Makefile rules are generated from sys/conf/files:

...
file lib/libkern/arch/${MACHINE_ARCH}/ffs.S | lib/libkern/ffs.c
...

We could extend this by a third file.  Ugly.
Alternatively I could go back to .S, sigh.

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: libkern: ffs() for arm64, powerpc, powerpc64

2020-06-09 Thread Mark Kettenis
> From: Christian Weisgerber 
> Date: Tue, 9 Jun 2020 18:39:45 - (UTC)
> 
> Here are optimized ffs(3) implementations for
> * arm64  (superseding the earlier ffs.S)
> * powerpc
> * powerpc64
> 
> arm64 tested by myself, powerpc tested by cwen@.
> 
> OK?
> 
> (Some other archs fell through.  sparc64 specifies the popc instruction
> that can be used for this, but UltraSPARC doesn't implement it.
> alpha's ctlz was only added late with the CIX extension.)

Unfortunately that doesn't quite work.  At least in my build it
doesn't pick up .c files in the linker/arch directories.

> Index: lib/libkern/arch/arm64/ffs.c
> ===
> RCS file: lib/libkern/arch/arm64/ffs.c
> diff -N lib/libkern/arch/arm64/ffs.c
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ lib/libkern/arch/arm64/ffs.c  9 Jun 2020 18:09:11 -
> @@ -0,0 +1,12 @@
> +/*   $OpenBSD$ */
> +/*
> + * Written by Christian Weisgerber .
> + * Public domain.
> + */
> +
> +int ffs(int x)
> +{
> + x = x & -x;
> + __asm volatile("clz %w0, %w0" : "+r" (x));
> + return (32 - x);
> +}
> Index: lib/libkern/arch/powerpc/ffs.c
> ===
> RCS file: lib/libkern/arch/powerpc/ffs.c
> diff -N lib/libkern/arch/powerpc/ffs.c
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ lib/libkern/arch/powerpc/ffs.c9 Jun 2020 18:10:20 -
> @@ -0,0 +1,12 @@
> +/*   $OpenBSD$ */
> +/*
> + * Written by Christian Weisgerber .
> + * Public domain.
> + */
> +
> +int ffs(int x)
> +{
> + x = x & -x;
> + __asm volatile("cntlzw %0, %0" : "+r" (x));
> + return (32 - x);
> +}
> Index: lib/libkern/arch/powerpc64/ffs.c
> ===
> RCS file: lib/libkern/arch/powerpc64/ffs.c
> diff -N lib/libkern/arch/powerpc64/ffs.c
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ lib/libkern/arch/powerpc64/ffs.c  9 Jun 2020 18:10:49 -
> @@ -0,0 +1,12 @@
> +/*   $OpenBSD$ */
> +/*
> + * Written by Christian Weisgerber .
> + * Public domain.
> + */
> +
> +int ffs(int x)
> +{
> + x = x & -x;
> + __asm volatile("cntlzw %0, %0" : "+r" (x));
> + return (32 - x);
> +}
> -- 
> Christian "naddy" Weisgerber  na...@mips.inka.de
> 
> 



Re: libkern: ffs() for arm64, powerpc, powerpc64

2020-06-09 Thread Christian Weisgerber
On 2020-06-09, Christian Weisgerber  wrote:

> Here are optimized ffs(3) implementations for
> * arm64  (superseding the earlier ffs.S)
> * powerpc
> * powerpc64

> +int ffs(int x)

Oops, I'm going to change that to

int
ffs(int x)

before commit.

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: libkern: ffs() for arm64

2020-06-08 Thread Joerg Sonnenberger
On Mon, Jun 08, 2020 at 11:16:59PM +0200, Christian Weisgerber wrote:
> I blame dlg@ for making me scrutinize the POWER7 instruction set,
> which led me to the clz instruction, which led me to ffs().  I
> wanted to add this to libc, but then realized the futility because
> the compiler already inlines its optimized copy of ffs().  However,
> this optimization is disabled for the kernel with -ffreestanding.

int ffs(int x) {
 return __builtin_ffs(x);
}

doesn't work?

Joerg



Re: libkern: ffs() for arm64

2020-06-08 Thread Theo de Raadt
Either works for me.

Honestly

- either one is hard to understand without referring to the
  docs (and then realizing power numbers bits backwards). 
- It is simple enough it has to work or there will be chaos

and I'm not that afraid of more poeple seeing RETGUARD stubs :) 

Christian Weisgerber  wrote:

> On 2020-06-08, Christian Weisgerber  wrote:
> 
> > More archs to come...
> 
> Style question:
> Since this mostly comes down to embedding a single special instruction
> in between normal C operations, I wonder whether I should just do .c
> with asm() instead of .S, and leave all the boilerplate to the
> compiler?  It would save me from reading up on calling conventions,
> more assembly syntax, etc.
> 
> E.g.:
> 
> int ffs(int x)
> {
> x = x & -x;
> __asm volatile("clz %0, %0" : "=r" (x));
> return (32 - x);
> }
> 
> -- 
> Christian "naddy" Weisgerber  na...@mips.inka.de
> 



Re: libkern: ffs() for arm64

2020-06-08 Thread Todd C . Miller
On Mon, 08 Jun 2020 22:35:42 -, Christian Weisgerber wrote:

> Style question:
> Since this mostly comes down to embedding a single special instruction
> in between normal C operations, I wonder whether I should just do .c
> with asm() instead of .S, and leave all the boilerplate to the
> compiler?  It would save me from reading up on calling conventions,
> more assembly syntax, etc.

I think using inline assembler is fine if it makes things easier.

 - todd



Re: libkern: ffs() for arm64

2020-06-08 Thread Christian Weisgerber
On 2020-06-08, Christian Weisgerber  wrote:

> More archs to come...

Style question:
Since this mostly comes down to embedding a single special instruction
in between normal C operations, I wonder whether I should just do .c
with asm() instead of .S, and leave all the boilerplate to the
compiler?  It would save me from reading up on calling conventions,
more assembly syntax, etc.

E.g.:

int ffs(int x)
{
x = x & -x;
__asm volatile("clz %0, %0" : "=r" (x));
return (32 - x);
}

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



libkern: ffs() for arm64

2020-06-08 Thread Christian Weisgerber
Here is an optimized kernel ffs(3) for arm64.

I blame dlg@ for making me scrutinize the POWER7 instruction set,
which led me to the clz instruction, which led me to ffs().  I
wanted to add this to libc, but then realized the futility because
the compiler already inlines its optimized copy of ffs().  However,
this optimization is disabled for the kernel with -ffreestanding.

Honestly, I'm not sure I can claim to have written this.  The elegant
version below is adapted from clang output, because the compiler
is smarter than I am.

More archs to come...

Comments?  OK?

Index: lib/libkern/arch/arm64/ffs.S
===
RCS file: lib/libkern/arch/arm64/ffs.S
diff -N lib/libkern/arch/arm64/ffs.S
--- /dev/null   1 Jan 1970 00:00:00 -
+++ lib/libkern/arch/arm64/ffs.S8 Jun 2020 20:35:01 -
@@ -0,0 +1,17 @@
+/* $OpenBSD$ */
+/*
+ * Written by Christian Weisgerber .
+ * Public domain.
+ */
+
+#include 
+
+ENTRY(ffs)
+   RETGUARD_SETUP(ffs, x15)
+   rbitw1, w0
+   clz w1, w1
+   cmp w0, #0
+   csinc   w0, wzr, w1, eq
+   RETGUARD_CHECK(ffs, x15)
+   ret
+END(ffs)
-- 
Christian "naddy" Weisgerber  na...@mips.inka.de