Re: macppc kernel and clang

2020-03-30 Thread George Koehler
On Mon, 30 Mar 2020 22:39:48 +0200 (CEST)
Mark Kettenis  wrote:

> P.S. Userland seems to be in good shape as well.  I built and rebuilt
>  the world with clang.  That was on a kernel built with gcc, so
>  I'm repeating that now with a kernel built with clang.

That's good.  My iMac G3 is too slow to build the base system, so it
has been running a macppc snapshot where I rebuilt kernel, libc, and
a few other things with clang.

I know that clang can't build bsd.rd unless I use crunchgen -M;
see my mails of 17 and 18 Mar.

I recently learned that libunwind doesn't work on my G3.  I built
(with base-clang) a small program to throw a C++ exception; it crashes
at an illegal instruction because libunwind uses altivec and my G3
doesn't have altivec.  (G4 and G5 cpus probably have altivec.)  I can
run clang; LLVM and clang almost never throw C++ exceptions.

I also noticed that some .S files in libc give a clang warning, but
I didn't find why.

--George



Re: macppc kernel and clang

2020-03-30 Thread Mark Kettenis
> Date: Sun, 29 Mar 2020 19:59:02 -0400
> From: George Koehler 
> 
> Here is a new diff for macppc's ofw_stack() problem, without using
> __attribute__((noinline)).  I use this diff to build and run a macppc
> kernel with clang.  It also works with gcc.
> 
> The kernel did 3 steps to prepare an Open Firmware call:
>   1. turn off interrupts (EE and RI in msr)
>   2. move the stack pointer %r1 to firmstk
>   3. switch to Open Firmware's pmap?
> 
> I don't understand these steps, but I tried to preserve all 3 steps as
> I shuffled the code.  The diff doesn't touch step 3.
> 
> The problem was at step 2: ofw_stack() copied the caller's stack frame
> to firmstk, and changed the caller's return address to ofw_back (which
> will restore the old %r1 and msr).  If clang inlines the caller into
> another function, then ofw_back would run too late.  I move step 2
> into openfirmware(), so there is no more copying a stack frame nor
> hijacking a return address.   (I claim that firmstk+NBPG-16 is a
> multiple of 16; that mtctr,bctrl is more idiomatic than mtlr,blrl.)
> 
> ofw_stack() becomes s = ofw_msr() and only does step 1, turning off EE
> and RI in msr.  ppc_mtmsr(s) restores the saved msr.  I don't use
> intr_disable() because it turns off only EE, not RI.  I changed
> OF_call_method*() to turn off EE (external interrupts) before they
> touch their static args.  Some functions, like OF_boot() and
> OF_quiesce(), seem unused, so I can't know if my changes are correct.
> 
> To build a kernel with clang, I do
> # make CC=clang COMPILER_VERSION=clang
> 
> Is this OK to commit?  Would it be better to use intr_disable() in
> OF_*() and turn off RI in ofwreal.S fwentry?

I think we could actually do everything in openfirm().  But lets worry
about that later.  This works and gets us moving forward.

ok kettenis@

P.S. Userland seems to be in good shape as well.  I built and rebuilt
 the world with clang.  That was on a kernel built with gcc, so
 I'm repeating that now with a kernel built with clang.

> Index: ofw_machdep.h
> ===
> RCS file: /cvs/src/sys/arch/macppc/macppc/ofw_machdep.h,v
> retrieving revision 1.9
> diff -u -p -r1.9 ofw_machdep.h
> --- ofw_machdep.h 7 Apr 2015 14:36:34 -   1.9
> +++ ofw_machdep.h 29 Mar 2020 16:16:27 -
> @@ -26,6 +26,9 @@
>   *
>   */
>  
> +#include 
> +#include 
> +
>  extern int cons_backlight_available;
>  
>  void ofwconprobe(void);
> @@ -49,3 +52,12 @@ void of_setbrightness(int);
>  void of_setcolors(const uint8_t *, unsigned int, unsigned int);
>  
>  void OF_quiesce(void);
> +
> +static inline uint32_t
> +ofw_msr(void)
> +{
> + uint32_t s = ppc_mfmsr();
> +
> + ppc_mtmsr(s & ~(PSL_EE|PSL_RI)); /* turn off interrupts */
> + return s;
> +}
> Index: ofwreal.S
> ===
> RCS file: /cvs/src/sys/arch/macppc/macppc/ofwreal.S,v
> retrieving revision 1.5
> diff -u -p -r1.5 ofwreal.S
> --- ofwreal.S 3 Sep 2019 14:37:22 -   1.5
> +++ ofwreal.S 29 Mar 2020 16:16:27 -
> @@ -355,96 +355,32 @@ _ENTRY(_C_LABEL(fwentry))
>   addi%r1,%r1,16
>   blr
>  
> +.lcomm   firmstk,NBPG,16
> +.comm_C_LABEL(OF_buf),NBPG
> +
>  /*
>   * OpenFirmware entry point
> + *
> + * Note: caller has to set the machine state register (msr)
> + * to be correct for OpenFirmware.
>   */
>  _ENTRY(_C_LABEL(openfirmware))
> - stwu%r1,-16(%r1)
> - mflr%r0 /* save return address */
> - stw %r0,20(%r1)
> + mflr%r0
> + stw %r0,4(%r1)  /* save return address */
> +
> + /* switch to OpenFirmware real mode stack */
> + lis %r7,firmstk+NBPG-16@ha
> + addi%r7,%r7,firmstk+NBPG-16@l
> + stw %r1,0(%r7)
> + mr  %r1,%r7
>  
>   lis %r4,fwcall@ha
>   lwz %r4,fwcall@l(%r4)
>  
> - mtlr%r4
> - blrl
> -
> - lwz %r0,20(%r1)
> - mtlr%r0
> - lwz %r1,0(%r1)
> - blr
> -
> -/*
> - * Switch to/from OpenFirmware real mode stack
> - *
> - * Note: has to be called as the very first thing in OpenFirmware interface 
> routines.
> - * E.g.:
> - * int
> - * OF_xxx(arg1, arg2)
> - * type arg1, arg2;
> - * {
> - *   static struct {
> - *   char *name;
> - *   int nargs;
> - *   int nreturns;
> - *   char *method;
> - *   int arg1;
> - *   int arg2;
> - *   int ret;
> - *   } args = {
> - *   "xxx",
> - *   2,
> - * 

Re: macppc kernel and clang

2020-03-29 Thread George Koehler
Here is a new diff for macppc's ofw_stack() problem, without using
__attribute__((noinline)).  I use this diff to build and run a macppc
kernel with clang.  It also works with gcc.

The kernel did 3 steps to prepare an Open Firmware call:
  1. turn off interrupts (EE and RI in msr)
  2. move the stack pointer %r1 to firmstk
  3. switch to Open Firmware's pmap?

I don't understand these steps, but I tried to preserve all 3 steps as
I shuffled the code.  The diff doesn't touch step 3.

The problem was at step 2: ofw_stack() copied the caller's stack frame
to firmstk, and changed the caller's return address to ofw_back (which
will restore the old %r1 and msr).  If clang inlines the caller into
another function, then ofw_back would run too late.  I move step 2
into openfirmware(), so there is no more copying a stack frame nor
hijacking a return address.   (I claim that firmstk+NBPG-16 is a
multiple of 16; that mtctr,bctrl is more idiomatic than mtlr,blrl.)

ofw_stack() becomes s = ofw_msr() and only does step 1, turning off EE
and RI in msr.  ppc_mtmsr(s) restores the saved msr.  I don't use
intr_disable() because it turns off only EE, not RI.  I changed
OF_call_method*() to turn off EE (external interrupts) before they
touch their static args.  Some functions, like OF_boot() and
OF_quiesce(), seem unused, so I can't know if my changes are correct.

To build a kernel with clang, I do
# make CC=clang COMPILER_VERSION=clang

Is this OK to commit?  Would it be better to use intr_disable() in
OF_*() and turn off RI in ofwreal.S fwentry?

Index: ofw_machdep.h
===
RCS file: /cvs/src/sys/arch/macppc/macppc/ofw_machdep.h,v
retrieving revision 1.9
diff -u -p -r1.9 ofw_machdep.h
--- ofw_machdep.h   7 Apr 2015 14:36:34 -   1.9
+++ ofw_machdep.h   29 Mar 2020 16:16:27 -
@@ -26,6 +26,9 @@
  *
  */
 
+#include 
+#include 
+
 extern int cons_backlight_available;
 
 void ofwconprobe(void);
@@ -49,3 +52,12 @@ void of_setbrightness(int);
 void of_setcolors(const uint8_t *, unsigned int, unsigned int);
 
 void OF_quiesce(void);
+
+static inline uint32_t
+ofw_msr(void)
+{
+   uint32_t s = ppc_mfmsr();
+
+   ppc_mtmsr(s & ~(PSL_EE|PSL_RI)); /* turn off interrupts */
+   return s;
+}
Index: ofwreal.S
===
RCS file: /cvs/src/sys/arch/macppc/macppc/ofwreal.S,v
retrieving revision 1.5
diff -u -p -r1.5 ofwreal.S
--- ofwreal.S   3 Sep 2019 14:37:22 -   1.5
+++ ofwreal.S   29 Mar 2020 16:16:27 -
@@ -355,96 +355,32 @@ _ENTRY(_C_LABEL(fwentry))
addi%r1,%r1,16
blr
 
+.lcomm firmstk,NBPG,16
+.comm  _C_LABEL(OF_buf),NBPG
+
 /*
  * OpenFirmware entry point
+ *
+ * Note: caller has to set the machine state register (msr)
+ * to be correct for OpenFirmware.
  */
 _ENTRY(_C_LABEL(openfirmware))
-   stwu%r1,-16(%r1)
-   mflr%r0 /* save return address */
-   stw %r0,20(%r1)
+   mflr%r0
+   stw %r0,4(%r1)  /* save return address */
+
+   /* switch to OpenFirmware real mode stack */
+   lis %r7,firmstk+NBPG-16@ha
+   addi%r7,%r7,firmstk+NBPG-16@l
+   stw %r1,0(%r7)
+   mr  %r1,%r7
 
lis %r4,fwcall@ha
lwz %r4,fwcall@l(%r4)
 
-   mtlr%r4
-   blrl
-
-   lwz %r0,20(%r1)
-   mtlr%r0
-   lwz %r1,0(%r1)
-   blr
-
-/*
- * Switch to/from OpenFirmware real mode stack
- *
- * Note: has to be called as the very first thing in OpenFirmware interface 
routines.
- * E.g.:
- * int
- * OF_xxx(arg1, arg2)
- * type arg1, arg2;
- * {
- * static struct {
- * char *name;
- * int nargs;
- * int nreturns;
- * char *method;
- * int arg1;
- * int arg2;
- * int ret;
- * } args = {
- * "xxx",
- * 2,
- * 1,
- * };
- *
- * ofw_stack();
- * args.arg1 = arg1;
- * args.arg2 = arg2;
- * if (openfirmware() < 0)
- * return -1;
- * return args.ret;
- * }
- */
-.lcomm firmstk,NBPG,16
-.comm  _C_LABEL(OF_buf),NBPG
-
-_ENTRY(_C_LABEL(ofw_stack))
-   mfmsr   %r8 /* turn off interrupts */
-   andi.   %r0,%r8,~(PSL_EE|PSL_RI)@l
-   mtmsr   %r0
-   stw %r8,4(%r1)  /* abuse return address slot */
-
-   lwz %r5,0(%r1)  /* get length of stack frame */
-   subf%r5,%r1,%r5
-
-   lis %r7,firmstk+NBPG-8@ha
-   addi%r7,%r7,firmstk+NBPG-8@l
-   li  %r6,0xf
-   andc%r7,%r7,%r6
-   lis %r6,ofw_back@ha
-   addi%r6,%r6,ofw_back@l
-   subf%r4,%r5,%r7 /* make room for stack frame on new 
stack */
-   stwu%r1,-16(%r7)
-   stw %r6,4(%r7)  /* setup return pointer */
+   mtctr   %r4
+   bctrl
 
- 

Re: macppc kernel and clang

2020-03-18 Thread George Koehler
On Tue, 17 Mar 2020 13:23:28 -0400
George Koehler  wrote:

> clang -static -L.  -nopie -o instbin instbin.o dd.lo ...
> /usr/bin/ld: dd.lo(.text+0x14): R_PPC_PLTREL24 reloc against local symbol
> 
> ...
> 
> Passing -M to crunchgen(8), as we do on {longsoon,octeon,sgi}, might
> work around the problem, but I haven't tried it, and I don't know
> whether this is the only problem.

crunchgen -M worked and I got a bsd.rd from clang, while I had the
%L0 and noinline stuff in /sys.  (I didn't build a release.  I did
need to build a few other things before bsd.rd.)

Index: Makefile
===
RCS file: /cvs/src/distrib/macppc/ramdisk/Makefile,v
retrieving revision 1.46
diff -u -p -r1.46 Makefile
--- Makefile3 May 2019 20:03:58 -   1.46
+++ Makefile19 Mar 2020 03:12:47 -
@@ -55,7 +55,7 @@ mr.fs: instbin
makefs ${MRMAKEFSARGS} $@ $@.d
 
 instbin.mk instbin.cache instbin.c: instbin.conf
-   crunchgen -E -D ${.CURDIR}/../../.. -L ${DESTDIR}/usr/lib \
+   crunchgen -E -M -D ${.CURDIR}/../../.. -L ${DESTDIR}/usr/lib \
-c instbin.c -e instbin -m instbin.mk instbin.conf
 
 instbin: instbin.mk instbin.cache instbin.c



Re: macppc kernel and clang

2020-03-17 Thread George Koehler
On Mon, 16 Mar 2020 19:13:13 -0600
"Theo de Raadt"  wrote:

> How are the bootblocks faring?
> 
> And userland?

ofwboot with clang works for me.

I failed to make bsd.rd (in src/distrib/macppc/ramdisk) with clang,
but I don't remember exactly what I did, so I might have been making
it the wrong way.  My build failed at

clang -static -L.  -nopie -o instbin instbin.o dd.lo ...
/usr/bin/ld: dd.lo(.text+0x14): R_PPC_PLTREL24 reloc against local symbol

I was able to make bsd.rd with gcc.

"$CC -fno-pie -c something.c; objdump -dlr something.o" shows that
gcc -fno-pie uses R_PPC_REL24, but clang -fno-pie uses R_PPC_PLTREL24,
for calls to external functions.  The symbols would be global until
crunchgen(8) hides them by marking them local.  R_PPC_REL24 would
branch directly to a function, and R_PPC_PLTREL24 would go through
the PLT (for a function in a shared library).  The compiler can't
know whether the function is in a shared lib; the linker must decide.

clang -fno-pie on amd64 uses R_X86_64_PLT32, so I guess that clang
-fno-pie on powerpc isn't wrong to use R_PPC_PLTREL24.

Passing -M to crunchgen(8), as we do on {longsoon,octeon,sgi}, might
work around the problem, but I haven't tried it, and I don't know
whether this is the only problem.

--George



Re: macppc kernel and clang

2020-03-17 Thread Mark Kettenis
> From: "Theo de Raadt" 
> Date: Mon, 16 Mar 2020 19:13:13 -0600
> 
> ok deraadt
> 
> Next up, to figure out the right plan for ofw.
> 
> Thank you so much for figuring out these two details.
> 
> How are the bootblocks faring?

I believe somebody already tested the bootloader.  Doesn't hurt to try
again though.

> And userland?

base was in pretty good shape already and I'm rebuilding now that the
ABI fix is in.  Haven't tried xenocara yet.

> George Koehler  wrote:
> 
> > On Mon, 16 Mar 2020 12:54:52 +0100 (CET)
> > Mark Kettenis  wrote:
> > 
> > > I had a look at what NetBSD does, and they use '%L0' instead of
> > > '%0+1'.  As far as I can tell this works on both gcc and clang.  The
> > > diff below produces a working kernel when building with gcc.  Not yet
> > > in a position to test a clang-built kernel myself yet.  But if this
> > > produces a working kernel with clang as well, I'd prefer this diff
> > > instead of yours.
> > 
> > Yes, the clang kernel is working with your %L0 diff and the noinline
> > stuff.  I now prefer your %L0 diff, ok gkoehler@
> > 
> > "mftb %L0" becomes "mftb ${0:L}" in LLVM IR (clang -S -emit-llvm),
> > then LLVM handles the 'L' in PPCAsmPrinter::PrintAsmOperand in
> > /usr/src/gnu/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
> > 
> > > Index: arch/powerpc/include/cpu.h
> > > ===
> > > RCS file: /cvs/src/sys/arch/powerpc/include/cpu.h,v
> > > retrieving revision 1.65
> > > diff -u -p -r1.65 cpu.h
> > > --- arch/powerpc/include/cpu.h23 Mar 2019 05:27:53 -  1.65
> > > +++ arch/powerpc/include/cpu.h16 Mar 2020 11:30:42 -
> > > @@ -336,7 +336,7 @@ ppc_mftb(void)
> > >   u_long scratch;
> > >   u_int64_t tb;
> > >  
> > > - __asm volatile ("1: mftbu %0; mftb %0+1; mftbu %1;"
> > > + __asm volatile ("1: mftbu %0; mftb %L0; mftbu %1;"
> > >   " cmpw 0,%0,%1; bne 1b" : "=r"(tb), "=r"(scratch));
> > >   return tb;
> > >  }
> > 
> > 
> > -- 
> > George Koehler 
> > 
> 



Re: macppc kernel and clang

2020-03-16 Thread Theo de Raadt
ok deraadt

Next up, to figure out the right plan for ofw.

Thank you so much for figuring out these two details.

How are the bootblocks faring?

And userland?

George Koehler  wrote:

> On Mon, 16 Mar 2020 12:54:52 +0100 (CET)
> Mark Kettenis  wrote:
> 
> > I had a look at what NetBSD does, and they use '%L0' instead of
> > '%0+1'.  As far as I can tell this works on both gcc and clang.  The
> > diff below produces a working kernel when building with gcc.  Not yet
> > in a position to test a clang-built kernel myself yet.  But if this
> > produces a working kernel with clang as well, I'd prefer this diff
> > instead of yours.
> 
> Yes, the clang kernel is working with your %L0 diff and the noinline
> stuff.  I now prefer your %L0 diff, ok gkoehler@
> 
> "mftb %L0" becomes "mftb ${0:L}" in LLVM IR (clang -S -emit-llvm),
> then LLVM handles the 'L' in PPCAsmPrinter::PrintAsmOperand in
> /usr/src/gnu/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
> 
> > Index: arch/powerpc/include/cpu.h
> > ===
> > RCS file: /cvs/src/sys/arch/powerpc/include/cpu.h,v
> > retrieving revision 1.65
> > diff -u -p -r1.65 cpu.h
> > --- arch/powerpc/include/cpu.h  23 Mar 2019 05:27:53 -  1.65
> > +++ arch/powerpc/include/cpu.h  16 Mar 2020 11:30:42 -
> > @@ -336,7 +336,7 @@ ppc_mftb(void)
> > u_long scratch;
> > u_int64_t tb;
> >  
> > -   __asm volatile ("1: mftbu %0; mftb %0+1; mftbu %1;"
> > +   __asm volatile ("1: mftbu %0; mftb %L0; mftbu %1;"
> > " cmpw 0,%0,%1; bne 1b" : "=r"(tb), "=r"(scratch));
> > return tb;
> >  }
> 
> 
> -- 
> George Koehler 
> 



Re: macppc kernel and clang

2020-03-16 Thread George Koehler
On Mon, 16 Mar 2020 12:54:52 +0100 (CET)
Mark Kettenis  wrote:

> I had a look at what NetBSD does, and they use '%L0' instead of
> '%0+1'.  As far as I can tell this works on both gcc and clang.  The
> diff below produces a working kernel when building with gcc.  Not yet
> in a position to test a clang-built kernel myself yet.  But if this
> produces a working kernel with clang as well, I'd prefer this diff
> instead of yours.

Yes, the clang kernel is working with your %L0 diff and the noinline
stuff.  I now prefer your %L0 diff, ok gkoehler@

"mftb %L0" becomes "mftb ${0:L}" in LLVM IR (clang -S -emit-llvm),
then LLVM handles the 'L' in PPCAsmPrinter::PrintAsmOperand in
/usr/src/gnu/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp

> Index: arch/powerpc/include/cpu.h
> ===
> RCS file: /cvs/src/sys/arch/powerpc/include/cpu.h,v
> retrieving revision 1.65
> diff -u -p -r1.65 cpu.h
> --- arch/powerpc/include/cpu.h23 Mar 2019 05:27:53 -  1.65
> +++ arch/powerpc/include/cpu.h16 Mar 2020 11:30:42 -
> @@ -336,7 +336,7 @@ ppc_mftb(void)
>   u_long scratch;
>   u_int64_t tb;
>  
> - __asm volatile ("1: mftbu %0; mftb %0+1; mftbu %1;"
> + __asm volatile ("1: mftbu %0; mftb %L0; mftbu %1;"
>   " cmpw 0,%0,%1; bne 1b" : "=r"(tb), "=r"(scratch));
>   return tb;
>  }


-- 
George Koehler 



Re: macppc kernel and clang

2020-03-16 Thread Mark Kettenis
> Date: Sat, 14 Mar 2020 23:33:37 -0400
> From: George Koehler 
> 
> Hello tech@ list!
> 
> With this diff, it becomes possible to build macppc kernel with
> base-clang.  The default compiler for macppc is base-gcc.
> 
> rgc  mailed the ppc@ list to report problems with
> clang kernel, and got a working kernel with clang -Oz (to stop clang
> inlining some functions in openfirm.c) and my modified ppc_mftb().
> In this diff, I use __attribute__((noinline)) so I don't need -Oz.
> I built my kernel with
> 
> # clang CC=clang COMPILER_VERSION=clang
> 
> because someone has already prepared the Makefile for macppc to pass
> different flags when COMPILER_VERSION is clang.  I booted my clang
> kernel and it worked well enough to build another kernel with the
> same diff, but with gcc.
> 
> In the second half of the diff, I change ppc_mftb(), because
> "mftb %0+1" doesn't always work in clang.  For example, clang can
> put "=r"(tb) in register pair r27:r29, but 27+1 isn't 29.  I don't
> know the syntax for the 2nd register of a pair, so I instead split
> "=r"(tb) into 2 variables.  We use ppc_mftb() as timecounter.
> 
> (gcc and clang emit different machine code for "mftbu" and "mftb", but
> both forms work for me.  "objdump -dlr obj/clock.o" ppc_mftb() would
> show "mftb" from gcc or "mfspr" from clang.  Power ISA 2.03 says that
> mfspr time base can't work on 601 or POWER3.  A few Old World Mac
> models had PowerPC 601, but we only run on New World models.)
> 
> Is the ppc_mftb() change by itself OK to commit?

I had a look at what NetBSD does, and they use '%L0' instead of
'%0+1'.  As far as I can tell this works on both gcc and clang.  The
diff below produces a working kernel when building with gcc.  Not yet
in a position to test a clang-built kernel myself yet.  But if this
produces a working kernel with clang as well, I'd prefer this diff
instead of yours.

> I am less sure about the first half of the diff, the noinline part.
> I don't like the design of this code.  These functions call ofw_stack()
> like
> 
> noinline int
> OF_peer(int phandle)
> {
>   static struct ... args = ...;
> 
> ofw_stack();
> args.phandle = phandle;
> if (openfirmware() == -1)
> return 0;
> return args.sibling;
> }
> 
> but ofw_stack() is assembly code that changes the cpu's msr, moves the
> stack pointer %r1 to a different stack, copies the stack frame of
> OF_peer() to this new stack, and hijacks the saved return address so
> that, when OF_peer() returns, it switches back to the old stack and
> restores the msr.  If clang inlines a function like OF_peer(), it no
> longer has its own stack frame, so the return-hijack stops working.
> 
> We don't have retguard on powerpc, but if OF_peer() would use
> retguard to protect its return address, then the return-hijack might
> become impossible.
> 
> Perhaps the code should look like
> 
>   msr = do_something_to_msr();
>   args.phandle = phandle;
>   if (openfirmware() == -1)
>   ret = 0;
>   else
>   ret = args.sibling;
>   ppc_mtmsr(msr);
>   return ret;
> 
> and openfirmware() should do the stack-switching, but I have not yet
> tried to make such a change.  (I suspect that we need the msr to
> disable interrupts and protect the static args.)

Interrupts have to be blocked while we're not on the normal kernel stack.

I'm still working out for myself how this all works.  NetBSD's code
stull uses ofw_stack(), so looking at their code doesn't help us.
FreeBSD seems to solve this in quite a different way, but tries to
support many different variations of firmware.

Cheers,

Mark


Index: arch/powerpc/include/cpu.h
===
RCS file: /cvs/src/sys/arch/powerpc/include/cpu.h,v
retrieving revision 1.65
diff -u -p -r1.65 cpu.h
--- arch/powerpc/include/cpu.h  23 Mar 2019 05:27:53 -  1.65
+++ arch/powerpc/include/cpu.h  16 Mar 2020 11:30:42 -
@@ -336,7 +336,7 @@ ppc_mftb(void)
u_long scratch;
u_int64_t tb;
 
-   __asm volatile ("1: mftbu %0; mftb %0+1; mftbu %1;"
+   __asm volatile ("1: mftbu %0; mftb %L0; mftbu %1;"
" cmpw 0,%0,%1; bne 1b" : "=r"(tb), "=r"(scratch));
return tb;
 }



Re: macppc kernel and clang

2020-03-15 Thread rgc
On Sat, Mar 14, 2020 at 11:33:37PM -0400, George Koehler wrote:
> After the diff, I put my dmesg; my iMac G3 isn't GENERIC.MP and
> doesn't have radeondrm.  --George
> 

i've been using your ppc_mftb() patch on both clang ang gcc compiled
kernel ... for SP though and there seems to be no issues

attached is Powerbook G4 w/ radeondrm ... using the patches and the
steps you provided (thus this kernel is not -Oz)


[ using 1137228 bytes of bsd ELF symbol table ]
console out [ATY,Jasper_A] console in [keyboard], using USB
using parent ATY,JasperParent:: memaddr b800, size 800 : consaddr 
b8008000 : ioaddr b002, size 2: width 1440 linebytes 1536 height 960 
depth 8
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2020 OpenBSD. All rights reserved.  https://www.OpenBSD.org

OpenBSD 6.6-current (GENERIC) #27: Sun Mar 15 16:50:21 JST 2020

r...@apbg4.my.domain:/home/rgc/work/clang-kern/src/sys/arch/macppc/compile/GENERIC
real mem = 1073741824 (1024MB)
avail mem = 1024315392 (976MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root: model PowerBook5,8
cpu0 at mainbus0: 7447A (Revision 0x105): 1666 MHz: 512KB L2 cache
mem0 at mainbus0
spdmem0 at mem0: 1GB DDR2 SDRAM non-parity PC2-5300CL5 SO-DIMM
memc0 at mainbus0: uni-n rev 0xd9
"hw-clock" at memc0 not configured
kiic0 at memc0 offset 0xf8001000
iic0 at kiic0
adt0 at iic0 addr 0x2e: adt7467 rev 0x71
lmtemp0 at iic0 addr 0x49: ds1775
mpcpcibr0 at mainbus0 pci: uni-north
pci0 at mpcpcibr0 bus 0
pchb0 at pci0 dev 11 function 0 "Apple Intrepid 2 AGP" rev 0x00
agp at pchb0 not configured
radeondrm0 at pci0 dev 16 function 0 "ATI Radeon Mobility M10" rev 0x00
drm0 at radeondrm0
radeondrm0: irq 48
mpcpcibr1 at mainbus0 pci: uni-north
pci1 at mpcpcibr1 bus 0
macobio0 at pci1 dev 23 function 0 "Apple Intrepid" rev 0x00
openpic0 at macobio0 offset 0x4: version 0x4614 feature 3f0302 LE
macgpio0 at macobio0 offset 0x50
"modem-reset" at macgpio0 offset 0x1d not configured
"accelerometer-1" at macgpio0 offset 0x13 not configured
"accelerometer-2" at macgpio0 offset 0x14 not configured
"amp-mute" at macgpio0 offset 0x20 not configured
"dig-hw-reset" at macgpio0 offset 0x26 not configured
"codec-error-irq" at macgpio0 offset 0x18 not configured
"combo-in-sense" at macgpio0 offset 0xb not configured
"combo-out-sense" at macgpio0 offset 0x12 not configured
"codec-input-data-mux" at macgpio0 offset 0x23 not configured
"codec-clock-mux" at macgpio0 offset 0x29 not configured
"hw-reset" at macgpio0 offset 0x1a not configured
"linein-detect" at macgpio0 offset 0xc not configured
"lineout-detect" at macgpio0 offset 0x17 not configured
"lineout-mute" at macgpio0 offset 0x1f not configured
dfs0 at macgpio0 offset 0x1b: speeds: 1666, 833 MHz
macgpio1 at macgpio0 offset 0x9: irq 47
"programmer-switch" at macgpio0 offset 0x11 not configured
"escc-legacy" at macobio0 offset 0x12000 not configured
zs0 at macobio0 offset 0x13000: irq 22,23
zstty0 at zs0 channel 0
zstty1 at zs0 channel 1
onyx0 at macobio0 offset 0x0: irq 30,1,2
"timer" at macobio0 offset 0x15000 not configured
adb0 at macobio0 offset 0x16000
apm0 at adb0: battery flags 0x7, 16% charged
piic0 at adb0
iic1 at piic0
"backlight" at macobio0 offset 0xf300 not configured
kiic1 at macobio0 offset 0x18000
iic2 at kiic1
"pcm3052" at iic2 addr 0x46 not configured
"cs8416" at iic2 addr 0x10 not configured
audio0 at onyx0
bwi0 at pci1 dev 17 function 0 "Broadcom BCM4318" rev 0x02: irq 52, address 
00:14:51:7e:3b:4c
cbb0 at pci1 dev 20 function 0 "TI PCI1510 CardBus" rev 0x00: irq 53
ohci0 at pci1 dev 21 function 0 "NEC USB" rev 0x43: irq 54, version 1.0
ohci1 at pci1 dev 21 function 1 "NEC USB" rev 0x43: irq 54, version 1.0
ehci0 at pci1 dev 21 function 2 "NEC USB" rev 0x04: irq 54
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 configuration 1 interface 0 "NEC EHCI root hub" rev 2.00/1.00 
addr 1
cardslot0 at cbb0 slot 0 flags 0
cardbus0 at cardslot0: bus 1 device 0 cacheline 0x8, lattimer 0x20
pcmcia0 at cardslot0
usb1 at ohci0: USB revision 1.0
uhub1 at usb1 configuration 1 interface 0 "NEC OHCI root hub" rev 1.00/1.00 
addr 1
usb2 at ohci1: USB revision 1.0
uhub2 at usb2 configuration 1 interface 0 "NEC OHCI root hub" rev 1.00/1.00 
addr 1
mpcpcibr2 at mainbus0 pci: uni-north
pci2 at mpcpcibr2 bus 0
kauaiata0 at pci2 dev 13 function 0 "Apple Intrepid 2 ATA" rev 0x00
wdc0 at kauaiata0 irq 39: DMA
wd0 at wdc0 channel 0 drive 0: 
wd0: 16-sector PIO, LBA48, 244198MB, 500118192 sectors
atapiscsi0 at wdc0 channel 0 drive 1
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0:  removable
wd0(wdc0:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 4
cd0(wdc0:0:1): using PIO mode 4, DMA mode 2, Ultra-DMA mode 4
"Apple Intrepid 2 FireWire" rev 0x00 at pci2 dev 14 function 0 not configured
gem0 at pci2 dev 15 function 0 "Apple Intrepid 2 GMAC" rev 0x00: irq 41, 
address 00:14:51:1d:d1:5e
brgphy0 at gem0 phy 0: 

macppc kernel and clang

2020-03-14 Thread George Koehler
Hello tech@ list!

With this diff, it becomes possible to build macppc kernel with
base-clang.  The default compiler for macppc is base-gcc.

rgc  mailed the ppc@ list to report problems with
clang kernel, and got a working kernel with clang -Oz (to stop clang
inlining some functions in openfirm.c) and my modified ppc_mftb().
In this diff, I use __attribute__((noinline)) so I don't need -Oz.
I built my kernel with

# clang CC=clang COMPILER_VERSION=clang

because someone has already prepared the Makefile for macppc to pass
different flags when COMPILER_VERSION is clang.  I booted my clang
kernel and it worked well enough to build another kernel with the
same diff, but with gcc.

In the second half of the diff, I change ppc_mftb(), because
"mftb %0+1" doesn't always work in clang.  For example, clang can
put "=r"(tb) in register pair r27:r29, but 27+1 isn't 29.  I don't
know the syntax for the 2nd register of a pair, so I instead split
"=r"(tb) into 2 variables.  We use ppc_mftb() as timecounter.

(gcc and clang emit different machine code for "mftbu" and "mftb", but
both forms work for me.  "objdump -dlr obj/clock.o" ppc_mftb() would
show "mftb" from gcc or "mfspr" from clang.  Power ISA 2.03 says that
mfspr time base can't work on 601 or POWER3.  A few Old World Mac
models had PowerPC 601, but we only run on New World models.)

Is the ppc_mftb() change by itself OK to commit?

I am less sure about the first half of the diff, the noinline part.
I don't like the design of this code.  These functions call ofw_stack()
like

noinline int
OF_peer(int phandle)
{
static struct ... args = ...;

ofw_stack();
args.phandle = phandle;
if (openfirmware() == -1)
return 0;
return args.sibling;
}

but ofw_stack() is assembly code that changes the cpu's msr, moves the
stack pointer %r1 to a different stack, copies the stack frame of
OF_peer() to this new stack, and hijacks the saved return address so
that, when OF_peer() returns, it switches back to the old stack and
restores the msr.  If clang inlines a function like OF_peer(), it no
longer has its own stack frame, so the return-hijack stops working.

We don't have retguard on powerpc, but if OF_peer() would use
retguard to protect its return address, then the return-hijack might
become impossible.

Perhaps the code should look like

msr = do_something_to_msr();
args.phandle = phandle;
if (openfirmware() == -1)
ret = 0;
else
ret = args.sibling;
ppc_mtmsr(msr);
return ret;

and openfirmware() should do the stack-switching, but I have not yet
tried to make such a change.  (I suspect that we need the msr to
disable interrupts and protect the static args.)

After the diff, I put my dmesg; my iMac G3 isn't GENERIC.MP and
doesn't have radeondrm.  --George

Index: arch/macppc/macppc/openfirm.c
===
RCS file: /cvs/src/sys/arch/macppc/macppc/openfirm.c,v
retrieving revision 1.12
diff -u -p -r1.12 openfirm.c
--- arch/macppc/macppc/openfirm.c   3 Sep 2019 17:51:52 -   1.12
+++ arch/macppc/macppc/openfirm.c   11 Mar 2020 22:21:50 -
@@ -38,10 +38,13 @@
 
 #include 
 
+/* Callers of ofw_stack() must not be inline. */
+#define noinline __attribute__((noinline))
+
 extern void ofw_stack(void);
 extern void ofbcopy(const void *, void *, size_t);
 
-int
+noinline int
 OF_peer(int phandle)
 {
static struct {
@@ -63,7 +66,7 @@ OF_peer(int phandle)
return args.sibling;
 }
 
-int
+noinline int
 OF_child(int phandle)
 {
static struct {
@@ -85,7 +88,7 @@ OF_child(int phandle)
return args.child;
 }
 
-int
+noinline int
 OF_parent(int phandle)
 {
static struct {
@@ -107,7 +110,7 @@ OF_parent(int phandle)
return args.parent;
 }
 
-int
+noinline int
 OF_getproplen(int handle, char *prop)
 {
static struct {
@@ -131,7 +134,7 @@ OF_getproplen(int handle, char *prop)
return args.size;
 }
 
-int
+noinline int
 OF_getprop(int handle, char *prop, void *buf, int buflen)
 {
static struct {
@@ -163,7 +166,7 @@ OF_getprop(int handle, char *prop, void 
return args.size;
 }
 
-int
+noinline int
 OF_setprop(int handle, char *prop, const void *buf, int buflen)
 {
static struct {
@@ -194,7 +197,7 @@ OF_setprop(int handle, char *prop, const
return args.size;
 }
 
-int
+noinline int
 OF_nextprop(int handle, char *prop, void *nextprop)
 {
static struct {
@@ -221,7 +224,7 @@ OF_nextprop(int handle, char *prop, void
return args.flag;
 }
 
-int
+noinline int
 OF_interpret(char *cmd, int nreturns, ...)
 {
va_list ap;
@@ -258,7 +261,7 @@ OF_interpret(char *cmd, int nreturns, ..
 }
 
 
-int
+noinline int
 OF_finddevice(char *name)
 {
static struct {
@@ -281,7 +284,7 @@ OF_finddevice(char *name)
 }
 stat