Re: [PATCH] replace if with min

2021-07-19 Thread Segher Boessenkool
On Mon, Jul 19, 2021 at 06:12:05PM +0200, Christophe Leroy wrote:
> Salah Triki  a écrit :
> >Replace if with min in order to make code more clean.

> >--- a/drivers/crypto/nx/nx-842.c
> >+++ b/drivers/crypto/nx/nx-842.c
> >@@ -134,8 +134,7 @@ EXPORT_SYMBOL_GPL(nx842_crypto_exit);
> > static void check_constraints(struct nx842_constraints *c)
> > {
> > /* limit maximum, to always have enough bounce buffer to decompress 
> > */
> >-if (c->maximum > BOUNCE_BUFFER_SIZE)
> >-c->maximum = BOUNCE_BUFFER_SIZE;
> >+c->maximum = min(c->maximum, BOUNCE_BUFFER_SIZE);
> 
> For me the code is less clear with this change, and in addition it  
> slightly changes the behaviour. Before, the write was done only when  
> the value was changing. Now you rewrite the value always, even when it  
> doesn't change.

In both cases the compiler can decide to either write it more often than
strictly needed, depending on what it thinks best (and it usually has
better estimates than the programmer).  The behaviour is identical (and
the generated machine code is as well, in my testing).

The field name "maximum" is not the best choice, which makes the code
read a bit funny ("the min of max"), but the comment makes things pretty
clear.


Segher


Re: [PATCH v2 3/4] powerpc: wii.dts: Expose the OTP on this platform

2021-07-03 Thread Segher Boessenkool
On Fri, Jul 02, 2021 at 08:56:31AM +, Jonathan Neuschäfer wrote:
> On Thu, Jul 01, 2021 at 09:56:55PM +0200, Emmanuel Gil Peyrot wrote:
> > On Sat, Jun 26, 2021 at 11:34:01PM +, Jonathan Neuschäfer wrote:
> > > On Wed, May 19, 2021 at 11:50:43AM +0200, Emmanuel Gil Peyrot wrote:
> [...]
> > > > +   otp@d8001ec {
> > > > +   compatible = "nintendo,hollywood-otp";
> > > > +   reg = <0x0d8001ec 0x8>;
> > > 
> > > The OTP registers overlap with the previous node, control@d800100.
> > > Not sure what's the best way to structure the devicetree in this case,
> > > maybe something roughly like the following (untested, unverified):
> > [snip]
> > 
> > I couldn’t get this to work, but additionally it looks like it should
> > start 0x100 earlier and contain pic1@d800030 and gpio@d8000c0, given
> > https://wiibrew.org/wiki/Hardware/Hollywood_Registers
> > 
> > Would it make sense, for the time being, to reduce the size of this
> > control@d800100 device to the single register currently being used by
> > arch/powerpc/platforms/embedded6xx/wii.c (0xd800194, used to reboot the
> > system) and leave the refactor of restart + OTP + PIC + GPIO for a
> > future series?
> 
> Makes sense to me!

There is no benefit to pretending there is a "control" bus (there is no
such thing), it only gets in the way.


Segher


Re: [PATCH for 4.16 v7 02/11] powerpc: membarrier: Skip memory barrier in switch_mm()

2021-06-21 Thread Segher Boessenkool
Hi!

On Mon, Jun 21, 2021 at 04:11:04PM +0200, Christophe Leroy wrote:
> Le 19/06/2021 à 17:02, Segher Boessenkool a écrit :
> >The point of the twi in the I/O accessors was to make things easier to
> >debug if the accesses fail: for the twi insn to complete the load will
> >have to have completed as well.  On a correctly working system you never
> >should need this (until something fails ;-) )
> >
> >Without the twi you might need to enforce ordering in some cases still.
> >The twi is a very heavy hammer, but some of that that gives us is no
> >doubt actually needed.
> 
> Well, I've always been quite perplex about that. According to the 
> documentation of the 8xx, if a bus error or something happens on an I/O 
> access, the exception will be accounted on the instruction which does the 
> access. But based on the following function, I understand that some version 
> of powerpc do generate the trap on the instruction which was being executed 
> at the time the I/O access failed, not the instruction that does the access 
> itself ?

Trap instructions are never speculated (this may not be architectural,
but it is true on all existing implementations).  So the instructions
after the twi;isync will not execute until the twi itself has finished,
and that cannot happen before the preceding load has (because it uses
the loaded register).

Now, some I/O accesses can cause machine checks.  Machine checks are
asynchronous and can be hard to correlate to specific load insns, and
worse, you may not even have the address loaded from in architected
registers anymore.  Since I/O accesses often take *long*, tens or even
hundreds of cycles is not unusual, this can be a challenge.

To recover from machine checks you typically need special debug hardware
and/or software.  For the Apple machines those are not so easy to come
by.  This "twi after loads" thing made it pretty easy to figure out
where your code was going wrong.

And it isn't as slow as it may sound: typically you really need to have
the result of the load before you can go on do useful work anyway, and
loads from I/O are slow non-posted things.

> /*
>  * I/O accesses can cause machine checks on powermacs.
>  * Check if the NIP corresponds to the address of a sync
>  * instruction for which there is an entry in the exception
>  * table.
>  *  -- paulus.
>  */

I suspect this is from before the twi thing was added?

> It is not only the twi which bother's me in the I/O accessors but also the 
> sync/isync and stuff.
> 
> A write typically is
> 
>   sync
>   stw
> 
> A read is
> 
>   sync
>   lwz
>   twi
>   isync
> 
> Taking into account that HW ordering is garanteed by the fact that __iomem 
> is guarded,

Yes.  But machine checks are asynchronous :-)

> isn't the 'memory' clobber enough as a barrier ?

A "memory" clobber isn't a barrier of any kind.  "Compiler barriers" do
not exist.

The only thing such a clobber does is it tells the compiler that this
inline asm can access some memory, and we do not say at what address.
So the compiler cannot reorder this asm with other memory accesses.  It
has no other effects, no magical effects, and it is not comparable to
actual barrier instructions (that actually tell the hardware that some
certain ordering is required).

"Compiler barrier" is a harmful misnomer: language shapes thoughts,
using misleading names causes misguided thoughts.

Anyway :-)

The isync is simply to make sure the code after it does not start before
the code before it has completed.  The sync before I am not sure.


Segher


Re: [PATCH v2 2/9] powerpc: Add Microwatt device tree

2021-06-21 Thread Segher Boessenkool
On Sun, Jun 20, 2021 at 10:08:58PM +1000, Paul Mackerras wrote:
> On Sat, Jun 19, 2021 at 09:26:16AM -0500, Segher Boessenkool wrote:
> > On Fri, Jun 18, 2021 at 01:44:16PM +1000, Paul Mackerras wrote:
> > > Microwatt currently runs with MSR[HV] = 0,
> > 
> > That isn't compliant though?  If your implementation does not have LPAR
> > it must set MSR[HV]=1 always.
> 
> True - but if I actually do that, Linux starts trying to use hrfid
> (for example in masked_Hinterrupt), which Microwatt doesn't have.
> Something for Nick to fix. :)

That looks like it needs fixing, yes (it is hard to actually read).  But
one thing you can do to make this Just Work is to make hrfid do exactly
the same as rfid, i.e. decode hrfid (01000 10010) as rfid (0 10010).
That probably makes things run already, you don't even need to alias
to SPRs HSRRn (01001 1101n) to SRRn (0 1101n) :-)


Segher


Re: [PATCH for 4.16 v7 02/11] powerpc: membarrier: Skip memory barrier in switch_mm()

2021-06-19 Thread Segher Boessenkool
On Sat, Jun 19, 2021 at 11:35:34AM +0200, Christophe Leroy wrote:
> 
> 
> Le 18/06/2021 à 19:26, Mathieu Desnoyers a écrit :
> >- On Jun 18, 2021, at 1:13 PM, Christophe Leroy 
> >christophe.le...@csgroup.eu wrote:
> >[...]
> >>
> >>I don't understand all that complexity to just replace a simple
> >>'smp_mb__after_unlock_lock()'.
> >>
> >>#define smp_mb__after_unlock_lock() smp_mb()
> >>#define smp_mb()barrier()
> >># define barrier() __asm__ __volatile__("": : :"memory")
> >>
> >>
> >>Am I missing some subtility ?
> >
> >On powerpc CONFIG_SMP, smp_mb() is actually defined as:
> >
> >#define smp_mb()__smp_mb()
> >#define __smp_mb()  mb()
> >#define mb()   __asm__ __volatile__ ("sync" : : : "memory")
> >
> >So the original motivation here was to skip a "sync" instruction whenever
> >switching between threads which are part of the same process. But based on
> >recent discussions, I suspect my implementation may be inaccurately doing
> >so though.
> >
> 
> I see.
> 
> Then, if you think a 'sync' is a concern, shouldn't we try and remove the 
> forest of 'sync' in the I/O accessors ?
> 
> I can't really understand why we need all those 'sync' and 'isync' and 
> 'twi' around the accesses whereas I/O memory is usually mapped as 'Guarded' 
> so memory access ordering is already garantied.
> 
> I'm sure we'll save a lot with that.

The point of the twi in the I/O accessors was to make things easier to
debug if the accesses fail: for the twi insn to complete the load will
have to have completed as well.  On a correctly working system you never
should need this (until something fails ;-) )

Without the twi you might need to enforce ordering in some cases still.
The twi is a very heavy hammer, but some of that that gives us is no
doubt actually needed.


Segher


Re: [PATCH v2 0/9] powerpc: Add support for Microwatt soft-core

2021-06-19 Thread Segher Boessenkool
On Fri, Jun 18, 2021 at 01:42:53PM +1000, Paul Mackerras wrote:
> This series of patches adds support for the Microwatt soft-core.
> Microwatt is an open-source 64-bit Power ISA processor written in VHDL
> which targets medium-sized FPGAs such as the Xilinx Artix-7 or the
> Lattice ECP5.  Microwatt currently implements the scalar fixed plus
> floating-point subset of Power ISA v3.0B plus the radix MMU, but not
> logical partitioning (i.e. it does not have hypervisor mode or nested
> radix translation).

For the whole series:

Reviewed-by: Segher Boessenkool 

I didn't see anything in this revision that would prevent it from
being included upstream (that HV=1 thing should be fixed sooner rather
than later, but that is not a kernel problem).  Looks in great state :-)


Segher


Re: [PATCH v2 6/9] powerpc/microwatt: Add support for hardware random number generator

2021-06-19 Thread Segher Boessenkool
On Sat, Jun 19, 2021 at 01:08:51PM +1000, Nicholas Piggin wrote:
> Excerpts from Paul Mackerras's message of June 18, 2021 1:47 pm:
> > Microwatt's hardware RNG is accessed using the DARN instruction.
> 
> I think we're getting a platforms/book3s soon with the VAS patches, 
> might be a place to add the get_random_darn function.
> 
> Huh, DARN is unprivileged right?

It is, that's the whole point: to make it very very cheap for user
software it has to be an unprivileged instruction.


Segher


Re: [PATCH v2 2/9] powerpc: Add Microwatt device tree

2021-06-19 Thread Segher Boessenkool
On Fri, Jun 18, 2021 at 01:44:16PM +1000, Paul Mackerras wrote:
> Microwatt currently runs with MSR[HV] = 0,

That isn't compliant though?  If your implementation does not have LPAR
it must set MSR[HV]=1 always.


Segher


Re: [PATCH 11/11] powerpc/microwatt: Disable interrupts in boot wrapper main program

2021-06-17 Thread Segher Boessenkool
On Thu, Jun 17, 2021 at 11:40:23AM +1000, Nicholas Piggin wrote:
> Excerpts from Segher Boessenkool's message of June 17, 2021 9:37 am:
> > On Tue, Jun 15, 2021 at 09:05:27AM +1000, Paul Mackerras wrote:
> >> This ensures that we don't get a decrementer interrupt arriving before
> >> we have set up a handler for it.
> > 
> > Maybe add a comment saying this is setting MSR[EE]=0 for that?  Or do
> > other bits here matter as well?
> 
> Hmm, it actually clears MSR[RI] as well.
> 
> __hard_irq_disable() is what we want here, unless the MSR[RI] clearing 
> is required as well, in which case there is __hard_EE_RI_disable().

I don't think it matters if MSR[RI] is set or not here, nothing will try
to recover from an actual reboot I hope :-)


Segher


Re: [PATCH 11/11] powerpc/microwatt: Disable interrupts in boot wrapper main program

2021-06-16 Thread Segher Boessenkool
On Tue, Jun 15, 2021 at 09:05:27AM +1000, Paul Mackerras wrote:
> This ensures that we don't get a decrementer interrupt arriving before
> we have set up a handler for it.

Maybe add a comment saying this is setting MSR[EE]=0 for that?  Or do
other bits here matter as well?


Segher


Re: [PATCH 06/11] powerpc: microwatt: Use standard 16550 UART for console

2021-06-16 Thread Segher Boessenkool
On Tue, Jun 15, 2021 at 09:01:27AM +1000, Paul Mackerras wrote:
> This adds support to the Microwatt platform to use the standard
> 1655-style UART which available in the standalone Microwatt FPGA.

16550... 1655 is a DAC apparently :-)


Segher


Re: [PATCH 01/11] powerpc: Add Microwatt platform

2021-06-16 Thread Segher Boessenkool
Hi Paul,

On Tue, Jun 15, 2021 at 08:57:43AM +1000, Paul Mackerras wrote:
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -422,7 +422,7 @@ config HUGETLB_PAGE_SIZE_VARIABLE
>  
>  config MATH_EMULATION
>   bool "Math emulation"
> - depends on 4xx || PPC_8xx || PPC_MPC832x || BOOKE
> + depends on 4xx || PPC_8xx || PPC_MPC832x || BOOKE || PPC_MICROWATT
>   select PPC_FPU_REGS

Why do you need this / want this, since you have FP hardware?


Segher


Re: [PATCH v4 1/2] module: add elf_check_module_arch for module specific elf arch checks

2021-06-15 Thread Segher Boessenkool
On Tue, Jun 15, 2021 at 03:41:00PM +0200, Jessica Yu wrote:
> +++ Segher Boessenkool [15/06/21 07:50 -0500]:
> >On Tue, Jun 15, 2021 at 02:17:40PM +0200, Jessica Yu wrote:
> >>+int __weak elf_check_module_arch(Elf_Ehdr *hdr)
> >>+{
> >>+   return 1;
> >>+}
> >
> >But is this a good idea?  It isn't useful to be able to attempt to load
> >a module not compiled for your architecture, and it increases the attack
> >surface tremendously.  These checks are one of the few things that can
> >*not* be weak symbols, imo.
> 
> Hm, could you please elaborate a bit more? This patchset is adding
> extra Elf header checks specifically for powerpc, and the module
> loader usually provides arch-specific hooks via weak symbols. We are
> just providing an new hook here, which should act as a no-op if it
> isn't used.
> 
> So if an architecture wants to provide extra header checks, it can do
> so by overriding the new weak symbol. Otherwise, the weak function acts as
> a noop. We also already have the existing elf_check_arch() check for each
> arch and that is *not* a weak symbol.

The way I read your patch the default elf_check_module_arch does not
call elf_check_arch?  Is that clearly called elsewhere and I'm just
dumb again?  Sorry for the distraction in that case :-/


Segher


Re: [PATCH v4 1/2] module: add elf_check_module_arch for module specific elf arch checks

2021-06-15 Thread Segher Boessenkool
On Tue, Jun 15, 2021 at 02:17:40PM +0200, Jessica Yu wrote:
> +int __weak elf_check_module_arch(Elf_Ehdr *hdr)
> +{
> +   return 1;
> +}

But is this a good idea?  It isn't useful to be able to attempt to load
a module not compiled for your architecture, and it increases the attack
surface tremendously.  These checks are one of the few things that can
*not* be weak symbols, imo.


Segher


Re: [PATCH] powerpc/32: Remove __main()

2021-06-08 Thread Segher Boessenkool
On Tue, Jun 08, 2021 at 05:22:51PM +, Christophe Leroy wrote:
> Comment says that __main() is there to make GCC happy.
> 
> It's been there since the implementation of ppc arch in Linux 1.3.45.
> 
> ppc32 is the only architecture having that. Even ppc64 doesn't have it.
> 
> Seems like GCC is still happy without it.
> 
> Drop it for good.

If you used G++ to build the kernel there could be a call to __main
inserted under some circumstances.   It is used in functions called
"main" if there is no other way to do initialisations (this should not
happen if you use -ffreestanding, and there should not be a function
called "main" anyway, but who knows).

Either way, yup, this is ancient history :-)


Segher


Re: [PATCH v5 5/9] powerpc/mm/book3s64: Update tlb flush routines to take a page walk cache flush argument

2021-05-20 Thread Segher Boessenkool
Hi!

On Thu, May 20, 2021 at 05:37:20PM +1000, Michael Ellerman wrote:
> Segher Boessenkool  writes:
> > On Tue, May 18, 2021 at 07:45:14PM -0500, Segher Boessenkool wrote:
> >> On Wed, May 19, 2021 at 10:26:22AM +1000, Michael Ellerman wrote:
> >> > Guenter Roeck  writes:
> >> > > Ah, sorry. I wasn't aware that the following is valid C code
> >> > >
> >> > > void f1()
> >> > > {
> >> > >  return f2();
> >> > >  ^^
> >> > > }
> >> > >
> >> > > as long as f2() is void as well. Confusing, but we live and learn.
> >> > 
> >> > It might be valid, but it's still bad IMHO.
> >> > 
> >> > It's confusing to readers, and serves no useful purpose.
> >> 
> >> And it actually explicitly is undefined behaviour in C90 already
> >> (3.6.6.4 in C90, 6.8.6.4 in C99 and later).
> 
> We use gnu89, which presumably does not make it UB.

Indeed.  That is kind of implied by the "as a GNU extension" below, but
some explicit statement would be better, yup.

> > ... but there is a GCC extension that allows this by default:
> > <https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wreturn-type>
> >   For C only, warn about a 'return' statement with an expression in a
> >   function whose return type is 'void', unless the expression type is
> >   also 'void'.  As a GNU extension, the latter case is accepted
> >   without a warning unless '-Wpedantic' is used.
> 
> There's no chance we'll ever enable -Wpedantic,

Good, because -pedantic adds a lot of much more annoying warnings as
well.  I find this extension questionable (like Guenter says it is
confusing and has no purpose), so the only thing it is "good" for is it
causes long email threads ;-)

Other than those things it is harmless though.

> so I guess it's allowed
> for practical purposes. I guess clang must accept it too or we'd be
> seeing warnings from it.

Yup.


Segher


Re: Linux powerpc new system call instruction and ABI

2021-05-19 Thread Segher Boessenkool
On Wed, May 19, 2021 at 03:06:49PM +, Joakim Tjernlund wrote:
> On Wed, 2021-05-19 at 09:38 -0500, Segher Boessenkool wrote:
> > On Wed, May 19, 2021 at 06:42:40PM +1000, Nicholas Piggin wrote:
> > > Excerpts from Joakim Tjernlund's message of May 19, 2021 6:08 pm:
> > > > I always figured the ppc way was superior. It begs the question if not 
> > > > the other archs should
> > > > change instead?
> > > 
> > > It is superior in some ways, not enough to be worth being different.
> > 
> > The PowerPC syscall ABI *requires* using cr0.3 for indicating errors,
> > you will have to do that whether you conflate the concepts of return
> > code and error indicator or not!
> > 
> > > Other archs are unlikely to change because it would be painful for
> > > not much benefit.
> > 
> > Other archs cannot easily change for much the same reason :-)
> 
> Really? I figured you could just add extra error indication in kernel syscall 
> I/F.
> Eventually user space could migrate to the new indication.

You seem to assume all user space uses glibc, or *any* libc even?  This
is false.  Some programs do system calls directly.  Do not break the
kernel ABI :-)


Segher


Re: Linux powerpc new system call instruction and ABI

2021-05-19 Thread Segher Boessenkool
On Wed, May 19, 2021 at 06:42:40PM +1000, Nicholas Piggin wrote:
> Excerpts from Joakim Tjernlund's message of May 19, 2021 6:08 pm:
> > I always figured the ppc way was superior. It begs the question if not the 
> > other archs should
> > change instead?
> 
> It is superior in some ways, not enough to be worth being different.

The PowerPC syscall ABI *requires* using cr0.3 for indicating errors,
you will have to do that whether you conflate the concepts of return
code and error indicator or not!

> Other archs are unlikely to change because it would be painful for
> not much benefit.

Other archs cannot easily change for much the same reason :-)

> New system calls just should be made to not return
> error numbers.

Which sometimes is a difficult / non-natural / clumsy thing to do.


Segher


Re: [PATCH v5 5/9] powerpc/mm/book3s64: Update tlb flush routines to take a page walk cache flush argument

2021-05-19 Thread Segher Boessenkool
On Wed, May 19, 2021 at 06:37:44AM -0700, Guenter Roeck wrote:
> On 5/19/21 5:03 AM, Segher Boessenkool wrote:
> >On Tue, May 18, 2021 at 07:45:14PM -0500, Segher Boessenkool wrote:
> >>And it actually explicitly is undefined behaviour in C90 already
> >>(3.6.6.4 in C90, 6.8.6.4 in C99 and later).
> >
> >... but there is a GCC extension that allows this by default:
> ><https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wreturn-type>
> >   For C only, warn about a 'return' statement with an expression in a
> >   function whose return type is 'void', unless the expression type is
> >   also 'void'.  As a GNU extension, the latter case is accepted
> >   without a warning unless '-Wpedantic' is used.
> 
> In C99:
> 
> "6.8.6.4 The return statement
> Constraints
> 
> A return statement with an expression shall not appear in a function whose 
> return type
> is void. A return statement without an expression shall only appear in a 
> function
> whose return type is void."
> 
> Sounds like invalid to me, not just undefined behavior.

I don't know what "invalid" would mean here other than UB, it isn't a
specific defined term, unlike the latter, which is precisely defined in
3.4.3/1:
  undefined behavior
  behavior, upon use of a nonportable or erroneous program construct or
  of erroneous data, for which this International Standard imposes no
  requirements

This is the strongest thing the standard can say, it is not Law, it does
not prohibit anyone from doing anything :-)

"Shall" and "shall not" X means it is undefined behaviour if X (or its
inverse)  is violated.  See 4.2:
  If a ''shall'' or ''shall not'' requirement that appears outside of a
  constraint or runtime-constraint is violated, the behavior is
  undefined.  Undefined behavior is otherwise indicated in this
  International Standard by the words ''undefined behavior'' or by the
  omission of any explicit definition of behavior.  There is no
  difference in emphasis among these three; they all describe ''behavior
  that is undefined''.
which also explains that what you call "invalid" has undefined behaviour
just as well, most likely.


Segher


Re: [PATCH v5 5/9] powerpc/mm/book3s64: Update tlb flush routines to take a page walk cache flush argument

2021-05-19 Thread Segher Boessenkool
On Tue, May 18, 2021 at 07:45:14PM -0500, Segher Boessenkool wrote:
> On Wed, May 19, 2021 at 10:26:22AM +1000, Michael Ellerman wrote:
> > Guenter Roeck  writes:
> > > Ah, sorry. I wasn't aware that the following is valid C code
> > >
> > > void f1()
> > > {
> > >  return f2();
> > >  ^^
> > > }
> > >
> > > as long as f2() is void as well. Confusing, but we live and learn.
> > 
> > It might be valid, but it's still bad IMHO.
> > 
> > It's confusing to readers, and serves no useful purpose.
> 
> And it actually explicitly is undefined behaviour in C90 already
> (3.6.6.4 in C90, 6.8.6.4 in C99 and later).

... but there is a GCC extension that allows this by default:
<https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wreturn-type>
  For C only, warn about a 'return' statement with an expression in a
  function whose return type is 'void', unless the expression type is
  also 'void'.  As a GNU extension, the latter case is accepted
  without a warning unless '-Wpedantic' is used.


Segher


Re: [PATCH v5 5/9] powerpc/mm/book3s64: Update tlb flush routines to take a page walk cache flush argument

2021-05-18 Thread Segher Boessenkool
On Wed, May 19, 2021 at 10:26:22AM +1000, Michael Ellerman wrote:
> Guenter Roeck  writes:
> > Ah, sorry. I wasn't aware that the following is valid C code
> >
> > void f1()
> > {
> >  return f2();
> >  ^^
> > }
> >
> > as long as f2() is void as well. Confusing, but we live and learn.
> 
> It might be valid, but it's still bad IMHO.
> 
> It's confusing to readers, and serves no useful purpose.

And it actually explicitly is undefined behaviour in C90 already
(3.6.6.4 in C90, 6.8.6.4 in C99 and later).


Segher


Re: [PATCH kernel v3] powerpc/makefile: Do not redefine $(CPP) for preprocessor

2021-05-17 Thread Segher Boessenkool
Hi!

On Mon, May 17, 2021 at 01:23:11PM +1000, Alexey Kardashevskiy wrote:
> On 5/14/21 18:46, Segher Boessenkool wrote:
> >On Fri, May 14, 2021 at 11:42:32AM +0900, Masahiro Yamada wrote:
> >>In my best guess, the reason why powerpc adding the endian flag to CPP
> >>is this line in arch/powerpc/kernel/vdso64/vdso64.lds.S
> >>
> >>#ifdef __LITTLE_ENDIAN__
> >>OUTPUT_FORMAT("elf64-powerpcle", "elf64-powerpcle", "elf64-powerpcle")
> >>#else
> >>OUTPUT_FORMAT("elf64-powerpc", "elf64-powerpc", "elf64-powerpc")
> >>#endif
> >
> >Which is equivalent to
> >
> >#ifdef __LITTLE_ENDIAN__
> >OUTPUT_FORMAT("elf64-powerpcle")
> >#else
> >OUTPUT_FORMAT("elf64-powerpc")
> >#endif
> >
> >so please change that at the same time if you touch this :-)
> 
> "If you touch this" approach did not work well with this patch so sorry 
> but no ;)
> 
> and for a separate patch, I'll have to dig since when it is equal, do 
> you know?

Since 1994, when the three-arg version was introduced (the one-arg
version is from 1992).

> >>__LITTLE_ENDIAN__  is defined by powerpc gcc and clang.
> >
> >This predefined macro is required by the newer ABIs, but all older
> 
> That's good so I'll stick to it.

Great.

> >You can just write -mbig and -mlittle btw.  Those aren't available on
> >all targets, but neither are the long-winded -m{big,little}-endian
> >option names.  Pet peeve, I know :-)
> 
> I am looking the same guarantees across modern enough gcc and clang and 
> I am not sure all of the above is valid for clang 10.0.something (or 
> whatever we say we support) ;)

-mbig/-mlittle is supported in GCC since times immemorial.  Whether LLVM
supports it as well just depends on how good their emulation is, I have
no idea.


Segher


Re: [PATCH v2 05/13] powerpc: use linux/unaligned/le_struct.h on LE power7

2021-05-14 Thread Segher Boessenkool
Hi Arnd,

On Fri, May 14, 2021 at 12:00:53PM +0200, Arnd Bergmann wrote:
> Little-endian POWER7 kernels disable
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS because that is not supported on
> the hardware, but the kernel still uses direct load/store for explicti
> get_unaligned()/put_unaligned().
> 
> I assume this is a mistake that leads to power7 having to trap and fix
> up all these unaligned accesses at a noticeable performance cost.
> 
> The fix is completely trivial, just remove the file and use the
> generic version that gets it right.

LE p7 isn't supported (it requires special firmware), and no one uses it
anymore, also not for development.  It was used for powerpc64le-linux
development before p8 was widely available.


Segher


Re: [PATCH kernel v3] powerpc/makefile: Do not redefine $(CPP) for preprocessor

2021-05-14 Thread Segher Boessenkool
Hi!

On Fri, May 14, 2021 at 11:42:32AM +0900, Masahiro Yamada wrote:
> In my best guess, the reason why powerpc adding the endian flag to CPP
> is this line in arch/powerpc/kernel/vdso64/vdso64.lds.S
> 
> #ifdef __LITTLE_ENDIAN__
> OUTPUT_FORMAT("elf64-powerpcle", "elf64-powerpcle", "elf64-powerpcle")
> #else
> OUTPUT_FORMAT("elf64-powerpc", "elf64-powerpc", "elf64-powerpc")
> #endif

Which is equivalent to

#ifdef __LITTLE_ENDIAN__
OUTPUT_FORMAT("elf64-powerpcle")
#else
OUTPUT_FORMAT("elf64-powerpc")
#endif

so please change that at the same time if you touch this :-)

> __LITTLE_ENDIAN__  is defined by powerpc gcc and clang.

This predefined macro is required by the newer ABIs, but all older
compilers have it as well.  _LITTLE_ENDIAN is not supported on all
platforms (but it is if your compiler targets Linux, which you cannot
necessarily rely on).  These macros are PowerPC-specific.

For GCC, for all targets, you can say
  #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
You do not need any of the other *ORDER__ macros in most cases.
See "info cpp" for the sordid details.

> [2] powerpc-linux-gnu-gcc + -mlittle-endian-> __LITTLE_ENDIAN__ is defined

You can just write -mbig and -mlittle btw.  Those aren't available on
all targets, but neither are the long-winded -m{big,little}-endian
option names.  Pet peeve, I know :-)

HtH,


Segher


Re: [PATCH] powerpc: Force inlining of csum_add()

2021-05-12 Thread Segher Boessenkool
On Wed, May 12, 2021 at 04:43:33PM +0200, Christophe Leroy wrote:
> Le 12/05/2021 à 16:31, Segher Boessenkool a écrit :
> >On Wed, May 12, 2021 at 02:56:56PM +0200, Christophe Leroy wrote:
> >>Le 11/05/2021 à 12:51, Segher Boessenkool a écrit :
> >>>Something seems to have decided this asm is more expensive than it is.
> >>>That isn't always avoidable -- the compiler cannot look inside asms --
> >>>but it seems it could be improved here.
> >>>
> >>>Do you have (or can make) a self-contained testcase?
> >>
> >>I have not tried, and I fear it might be difficult, because on a kernel
> >>build with dozens of calls to csum_add(), only ip6_tunnel.o exhibits such
> >>an issue.
> >
> >Yeah.  Sometimes you can force some of the decisions, but that usually
> >requires knowing too many GCC internals :-/
> >
> >>>>And there is even one completely unused instance of csum_add().
> >>>
> >>>That is strange, that should never happen.
> >>
> >>It seems that several .o include unused versions of csum_add. After the
> >>final link, one remains (in addition to the used one) in vmlinux.
> >
> >But it is a static function, so it should not end up in any object file
> >where it isn't used.
> 
> Well  did I dream ?
> 
> Now I only find one extra .o with unused csum_add() : That's 
> net/ipv6/exthdrs.o
> It matches the one found in vmlinux.
> 
> Are you interested in -fdump-tree-einline-all for that one as well ?

Sure.  Hopefully it will show more :-)


Segher


Re: [PATCH] powerpc: Force inlining of csum_add()

2021-05-12 Thread Segher Boessenkool
On Wed, May 12, 2021 at 02:56:56PM +0200, Christophe Leroy wrote:
> Le 11/05/2021 à 12:51, Segher Boessenkool a écrit :
> >Something seems to have decided this asm is more expensive than it is.
> >That isn't always avoidable -- the compiler cannot look inside asms --
> >but it seems it could be improved here.
> >
> >Do you have (or can make) a self-contained testcase?
> 
> I have not tried, and I fear it might be difficult, because on a kernel 
> build with dozens of calls to csum_add(), only ip6_tunnel.o exhibits such 
> an issue.

Yeah.  Sometimes you can force some of the decisions, but that usually
requires knowing too many GCC internals :-/

> >>And there is even one completely unused instance of csum_add().
> >
> >That is strange, that should never happen.
> 
> It seems that several .o include unused versions of csum_add. After the 
> final link, one remains (in addition to the used one) in vmlinux.

But it is a static function, so it should not end up in any object file
where it isn't used.

> >>In the non-inlined version, the first sum with 0 was performed.
> >>Here it is skipped.
> >
> >That is because of how __builtin_constant_p works, most likely.  As we
> >discussed elsewhere it is evaluated before all forms of loop unrolling.
> 
> But we are not talking about loop unrolling here, are we ?

Oh, right you are, but that doesn't change much.  The
_builtin_constant_p(len) is evaluated long before the compiler sees len
is a constant here.

> It seems that the reason here is that __builtin_constant_p() is evaluated 
> long after GCC decided to not inline that call to csum_add().

Yes, it seems we do not currently do even trivial inlining except very
early in the compiler.

Thanks,


Segher


Re: [PATCH kernel v2] powerpc/makefile: Do not redefine $(CPP) for preprocessor

2021-05-12 Thread Segher Boessenkool
On Wed, May 12, 2021 at 01:48:53PM +1000, Alexey Kardashevskiy wrote:
> >Oh!  I completely missed those few lines.  Sorry for that :-(
> 
> Well, I probably should have made it a separate patch anyway, I'll 
> repost separately.

Thanks.

> >To compensate a bit:
> >
> >>It still puzzles me why we need -C
> >>(preserve comments in the preprocessor output) flag here.
> >
> >It is so that a human can look at the output and read it.  Comments are
> >very significant to human readers :-)
> 
> I seriously doubt anyone ever read those :)

I am pretty sure whoever wrote it did!

> I suspect this is to pull 
> all the licenses in one place and do some checking but I did not dig deep.

I don't see the point in that, but :-)


Segher


Re: [PATCH kernel v2] powerpc/makefile: Do not redefine $(CPP) for preprocessor

2021-05-11 Thread Segher Boessenkool
On Tue, May 11, 2021 at 11:30:17PM +1000, Alexey Kardashevskiy wrote:
> >In any case, please mention the reasoning (and the fact that you are
> >removing these flags!) in the commit message.  Thanks!
> 
> but i did mention this, the last paragraph... they are duplicated.

Oh!  I completely missed those few lines.  Sorry for that :-(

To compensate a bit:

> It still puzzles me why we need -C
> (preserve comments in the preprocessor output) flag here.

It is so that a human can look at the output and read it.  Comments are
very significant to human readers :-)


Segher


Re: [PATCH v2 1/2] powerpc/sstep: Add emulation support for ‘setb’ instruction

2021-05-11 Thread Segher Boessenkool
On Tue, May 11, 2021 at 07:18:32AM -0500, Sathvika Vasireddy wrote:
> This adds emulation support for the following instruction:
>* Set Boolean (setb)
> 
> Signed-off-by: Sathvika Vasireddy 

This looks fine to me, thanks!

Reviewed-by: Segher Boessenkool 


Segher


Re: [PATCH kernel v2] powerpc/makefile: Do not redefine $(CPP) for preprocessor

2021-05-11 Thread Segher Boessenkool
Hi!

On Tue, May 11, 2021 at 02:48:12PM +1000, Alexey Kardashevskiy wrote:
> --- a/arch/powerpc/kernel/vdso32/Makefile
> +++ b/arch/powerpc/kernel/vdso32/Makefile
> @@ -44,7 +44,7 @@ asflags-y := -D__VDSO32__ -s
>  
>  obj-y += vdso32_wrapper.o
>  targets += vdso32.lds
> -CPPFLAGS_vdso32.lds += -P -C -Upowerpc
> +CPPFLAGS_vdso32.lds += -C
>  
>  # link rule for the .so file, .lds has to be first
>  $(obj)/vdso32.so.dbg: $(src)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday.o 
> FORCE

> --- a/arch/powerpc/kernel/vdso64/Makefile
> +++ b/arch/powerpc/kernel/vdso64/Makefile
> @@ -30,7 +30,7 @@ ccflags-y := -shared -fno-common -fno-builtin -nostdlib \
>  asflags-y := -D__VDSO64__ -s
>  
>  targets += vdso64.lds
> -CPPFLAGS_vdso64.lds += -P -C -U$(ARCH)
> +CPPFLAGS_vdso64.lds += -C
>  
>  # link rule for the .so file, .lds has to be first
>  $(obj)/vdso64.so.dbg: $(src)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday.o 
> FORCE

Why are you removing -P and -Upowerpc here?  "powerpc" is a predefined
macro on powerpc-linux (no underscores or anything, just the bareword).
This is historical, like "unix" and "linux".  If you use the C
preprocessor for things that are not C code (like the kernel does here)
you need to undefine these macros, if anything in the files you run
through the preprocessor contains those words, or funny / strange / bad
things will happen.  Presumably at some time in the past it did contain
"powerpc" somewhere.

-P is to inhibit line number output.  Whatever consumes the
preprocessor output will have to handle line directives if you remove
this flag.  Did you check if this will work for everything that uses
$(CPP)?

In any case, please mention the reasoning (and the fact that you are
removing these flags!) in the commit message.  Thanks!


Segher


Re: [PATCH] powerpc: Force inlining of csum_add()

2021-05-11 Thread Segher Boessenkool
Hi!

On Tue, May 11, 2021 at 06:08:06AM +, Christophe Leroy wrote:
> Commit 328e7e487a46 ("powerpc: force inlining of csum_partial() to
> avoid multiple csum_partial() with GCC10") inlined csum_partial().
> 
> Now that csum_partial() is inlined, GCC outlines csum_add() when
> called by csum_partial().

> c064fb28 :
> c064fb28: 7c 63 20 14 addcr3,r3,r4
> c064fb2c: 7c 63 01 94 addze   r3,r3
> c064fb30: 4e 80 00 20 blr

Could you build this with -fdump-tree-einline-all and send me the
results?  Or open a GCC PR yourself :-)

Something seems to have decided this asm is more expensive than it is.
That isn't always avoidable -- the compiler cannot look inside asms --
but it seems it could be improved here.

Do you have (or can make) a self-contained testcase?

> The sum with 0 is useless, should have been skipped.

That isn't something the compiler can do anything about (not sure if you
were suggesting that); it has to be done in the user code (and it tries
to already, see below).

> And there is even one completely unused instance of csum_add().

That is strange, that should never happen.

> ./arch/powerpc/include/asm/checksum.h: In function '__ip6_tnl_rcv':
> ./arch/powerpc/include/asm/checksum.h:94:22: warning: inlining failed in call 
> to 'csum_add': call is unlikely and code size would grow [-Winline]
>94 | static inline __wsum csum_add(__wsum csum, __wsum addend)
>   |  ^~~~
> ./arch/powerpc/include/asm/checksum.h:172:31: note: called from here
>   172 | sum = csum_add(sum, (__force __wsum)*(const 
> u32 *)buff);
>   |   
> ^

At least we say what happened.  Progress!  :-)

> In the non-inlined version, the first sum with 0 was performed.
> Here it is skipped.

That is because of how __builtin_constant_p works, most likely.  As we
discussed elsewhere it is evaluated before all forms of loop unrolling.

The patch looks perfect of course :-)

Reviewed-by: Segher Boessenkool 


Segher


> --- a/arch/powerpc/include/asm/checksum.h
> +++ b/arch/powerpc/include/asm/checksum.h
> @@ -91,7 +91,7 @@ static inline __sum16 csum_tcpudp_magic(__be32 saddr, 
> __be32 daddr, __u32 len,
>  }
>  
>  #define HAVE_ARCH_CSUM_ADD
> -static inline __wsum csum_add(__wsum csum, __wsum addend)
> +static __always_inline __wsum csum_add(__wsum csum, __wsum addend)
>  {
>  #ifdef __powerpc64__
>   u64 res = (__force u64)csum;


Re: [PATCH] powerpc/legacy_serial: Fix UBSAN: array-index-out-of-bounds

2021-05-10 Thread Segher Boessenkool
On Sat, May 08, 2021 at 06:36:21AM +, Christophe Leroy wrote:
> UBSAN complains when a pointer is calculated with invalid
> 'legacy_serial_console' index, allthough the index is verified
> before dereferencing the pointer.

Addressing like this is UB already.

You could just move this:

> - if (legacy_serial_console < 0)
> - return 0;

to before

> - struct legacy_serial_info *info = 
> _serial_infos[legacy_serial_console];
> - struct plat_serial8250_port *port = 
> _serial_ports[legacy_serial_console];

and no other change is necessary.


Segher


Re: UBSAN: array-index-out-of-bounds in arch/powerpc/kernel/legacy_serial.c:359:56

2021-05-07 Thread Segher Boessenkool
On Fri, May 07, 2021 at 10:31:42AM +0200, Christophe Leroy wrote:
> The function is as follows, so when legacy_serial_console == -1 as in your 
> situation, the pointers are just not used.

And it is still undefined behaviour.  C11 6.5.6/8 has
  If both the pointer operand and the result point to elements of the
  same array object, or one past the last element of the array object,
  the evaluation shall not produce an overflow; otherwise, the behavior
  is undefined.
(this is for adding an integer to a pointer).

> When I look into the generated code (UBSAN not selected), we see the 
> verification and the bail-out is done prior to any calculation based on 
> legacy_serial_console.

Yes, you got lucky.  Generating the code you wanted is one of the things
the compiler is allowed to do for UB.

> So, is it normal that UBSAN reports an error here ?

Yes.  It detected undefined behaviour just fine, it did exactly its
job :-)


Segher


Re: [PATCH v2] powerpc/64: BE option to use ELFv2 ABI for big endian kernels

2021-05-05 Thread Segher Boessenkool
On Tue, May 04, 2021 at 10:15:34PM +1000, Michael Ellerman wrote:
> Segher Boessenkool  writes:
> > On Mon, May 03, 2021 at 10:51:41AM +1000, Nicholas Piggin wrote:
> >> then ELF ABIv2 is more explanatory about it being an abi change
> >> rather than base elf change, even if it's not the "correct" name.
> >
> > I very much disagree.  "ELF ABIv2" is completely meaningless.
> 
> Except:
> 
> $ readelf -h /bin/true
> ELF Header:
>   Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
>   Class: ELF64
>   Data:  2's complement, little endian
>   Version:   1 (current)
>   OS/ABI:UNIX - System V
>   ABI Version:   0
>   Type:  DYN (Shared object file)
>   Machine:   PowerPC64
>   Version:   0x1
>   Entry point address:   0x1990
>   Start of program headers:  64 (bytes into file)
>   Start of section headers:  66176 (bytes into file)
>   Flags: 0x2, abiv2
>   ^

Ha :-)

This can also print "abiv1" or even "abiv0", btw:

case EM_PPC64:
  if (e_flags & EF_PPC64_ABI)
{
  char abi[] = ", abiv0";

  abi[6] += e_flags & EF_PPC64_ABI;
  strcat (buf, abi);
}
  break;

This is only in readelf, and the main audience for readelf is the people
who know all details about what means what already.  But, do you have a
suggestion for better output here?


Segher


Re: [PATCH v2] powerpc/64: BE option to use ELFv2 ABI for big endian kernels

2021-05-03 Thread Segher Boessenkool
Hi!

On Mon, May 03, 2021 at 10:51:41AM +1000, Nicholas Piggin wrote:
> Excerpts from Segher Boessenkool's message of May 3, 2021 3:55 am:
> > On Wed, Apr 29, 2020 at 10:57:16AM +1000, Nicholas Piggin wrote:
> >> Excerpts from Segher Boessenkool's message of April 29, 2020 9:40 am:
> >> I blame toolchain for -mabi=elfv2 ! And also some blame on ABI document 
> >> which is called ELF V2 ABI rather than ELF ABI V2 which would have been 
> >> unambiguous.
> > 
> > At least ELFv2 ABI is correct.  "ELF ABI v2" is not.
> > 
> >> I can go through and change all my stuff and config options to ELF_ABI_v2.
> > 
> > Please don't.  It is wrong.
> 
> Then I'm not sure what the point of your previous mail was, what did I 
> miss?

I asked if you could make it clearer to people who do not know what this
is whether they want to use it.  Or that was my intention, anyhow :-/

> > Both the original PowerPC ELF ABI and the
> > ELFv2 one have versions themselves.  Also, the base ELF standard has a
> > version, and is set up so there can be incompatible versions even!  Of
> > course it still is version 1 to this day, but :-)
> 
> The point was for people who don't know ELFv2 has a specific meaning for 
> powerpc,

It does not have *any* meaning outside of Power.  But people who do not
know what it is can assume the wrong things about it.  It isn't a great
name because of that :-(

(It's not as bad as the MIPS ABIs -- an older one is called "new" :-) )

> then ELF ABIv2 is more explanatory about it being an abi change
> rather than base elf change, even if it's not the "correct" name.

I very much disagree.  "ELF ABIv2" is completely meaningless.

> If you don't want that then good, I also prefer to just use ELFv2. I 

Good :-)

> think people who change this option can easily look up the name in 
> toolchain and other docs.

Yeah.  As long as the defaults are good, whoever blows themselves up has
only themselves to blame :-P


Segher


Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-02 Thread Segher Boessenkool
On Sun, May 02, 2021 at 02:23:01PM -0700, Joe Perches wrote:
> On Sun, 2021-05-02 at 15:32 -0500, Segher Boessenkool wrote:
> > On Sun, May 02, 2021 at 01:00:28PM -0700, Joe Perches wrote:
> []
> > > Perhaps 8 might be best as that has a __diag warning control mechanism.
> > 
> > I have no idea what you mean?
> 
> ? read the last bit of compiler-gcc.h

Ah, you mean
#pragma GCC diagnostic
(which has existed since GCC 4.2).  Does anything in this __diag stuff
require GCC 8?  Other than that this is hardcoded here :-)


Segher


Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-02 Thread Segher Boessenkool
On Sun, May 02, 2021 at 01:00:28PM -0700, Joe Perches wrote:
> On Sun, 2021-05-02 at 13:30 -0500, Segher Boessenkool wrote:
> > On Sat, May 01, 2021 at 07:41:53PM -0700, Joe Perches wrote:
> > > Why not raise the minimum gcc compiler version even higher?
> 
> On Sun, 2021-05-02 at 13:37 -0500, Segher Boessenkool wrote:
> > Everyone should always use an as new release as practical
> 
> []
> 
> > The latest GCC 5 release is only three and a half years old.
> 
> You argue slightly against yourself here.

I don't?

> Yes, it's mostly a question of practicality vs latest.
> 
> clang requires a _very_ recent version.
> gcc _could_ require a later version.
> Perhaps 8 might be best as that has a __diag warning control mechanism.

I have no idea what you mean?

> gcc 8.1 is now 3 years old today.

And there will be a new GCC 8 release very soon now!

The point is, you inconvenience users if you require a compiler version
they do not already have.  Five years might be fine, but three years is
not.


Segher


Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-02 Thread Segher Boessenkool
On Sun, May 02, 2021 at 12:15:38AM +0900, Masahiro Yamada wrote:
> The current minimum GCC version is 4.9 except ARCH=arm64 requiring
> GCC 5.1.
> 
> When we discussed last time, we agreed to raise the minimum GCC version
> to 5.1 globally. [1]
> 
> I'd like to propose GCC 5.2 to clean up arch/powerpc/Kconfig as well.

Both of these are GCC version 5.  GCC 5.1 is the first release of that,
GCC 5.2 the second, etc.  Everyone should always use an as new release
as practical, since many bugs will be fixed, and nothing else changed.

See .

So, this means everyone using GCC 5 should be using the GCC 5.5 release!

If there is something about 5.1 that makes it produce bad kernels on
some arch, make that arch's Makefile complain?  Same with binutils etc.


Segher


Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-02 Thread Segher Boessenkool
On Sat, May 01, 2021 at 07:41:53PM -0700, Joe Perches wrote:
> Why not raise the minimum gcc compiler version even higher?

The latest GCC 5 release is only three and a half years old.  Do you
really want to require bleeding edge tools?


Segher


Re: [PATCH v2] powerpc/64: BE option to use ELFv2 ABI for big endian kernels

2021-05-02 Thread Segher Boessenkool
On Wed, Apr 29, 2020 at 10:57:16AM +1000, Nicholas Piggin wrote:
> Excerpts from Segher Boessenkool's message of April 29, 2020 9:40 am:
> I blame toolchain for -mabi=elfv2 ! And also some blame on ABI document 
> which is called ELF V2 ABI rather than ELF ABI V2 which would have been 
> unambiguous.

At least ELFv2 ABI is correct.  "ELF ABI v2" is not.

> I can go through and change all my stuff and config options to ELF_ABI_v2.

Please don't.  It is wrong.  Both the original PowerPC ELF ABI and the
ELFv2 one have versions themselves.  Also, the base ELF standard has a
version, and is set up so there can be incompatible versions even!  Of
course it still is version 1 to this day, but :-)


Segher


Re: [PATCH 1/2] powerpc/sstep: Add emulation support for ‘setb’ instruction

2021-04-23 Thread Segher Boessenkool
On Fri, Apr 23, 2021 at 12:26:57PM +0200, Gabriel Paubert wrote:
> On Thu, Apr 22, 2021 at 06:26:16PM -0500, Segher Boessenkool wrote:
> > > But this can be made jump free :-):
> > > 
> > >   int tmp = regs->ccr << ra;
> > >   op->val = (tmp >> 31) | ((tmp >> 30) & 1);
> > 
> > The compiler will do so automatically (or think of some better way to
> > get the same result); in source code, what matters most is readability,
> > or clarity in general (also clarity to the compiler).
> 
> I just did a test (trivial code attached) and the original code always
> produces one conditional branch at -O2, at least with the cross-compiler
> I have on Debian (gcc 8.3). I have tested both -m32 and -m64. The 64 bit
> version produces an unnecessary "extsw", so I wrote the second version
> splitting the setting of the return value which gets rid of it.

That is an older compiler, and it will be out-of-service any day now.

It depends on what compiler flags you use, and what version of the ISA
you are targetting.

> The second "if" is fairly simple to optimize and the compiler does it
> properly.

Yeah.

> Of course with my suggestion the compiler does not produce any branch. 
> But it needs a really good comment.

Or you could try and help improve the compiler ;-)  You can do this
without writing compiler code yourself, by writing up some good
enhancement request in bugzilla.

The wider and more OoO the processors become, the more important it
becomes to have branch-free code, in situations where the branches would
not be well-predictable.

> > (Right shifts of negative numbers are implementation-defined in C,
> > fwiw -- but work like you expect in GCC).
> 
> Well, I'm not worried about it, since I'd expect a compiler that does
> logical right shifts on signed valued to break so much code that it
> would be easily noticed (also in the kernel).

Yup.  And it *is* defined for signed values, as long as they are
non-negative (the common case).

> > > > Also, even people who write LE have the bigger end on the left normally
> > > > (they just write some things right-to-left, and other things
> > > > left-to-right).
> > > 
> > > Around 1985, I had a documentation for the the National's 32032
> > > (little-endian) processor family, and all the instruction encodings were
> > > presented with the LSB on the left and MSB on the right.
> > 
> > Ouch!  Did they write "regular" numbers with the least significant digit
> > on the left as well?
> 
> No, they were not that sadistic!

But more inconsistent :-)


Segher


Re: [PATCH 1/2] powerpc/sstep: Add emulation support for ‘setb’ instruction

2021-04-22 Thread Segher Boessenkool
Hi!

On Fri, Apr 23, 2021 at 12:16:18AM +0200, Gabriel Paubert wrote:
> On Thu, Apr 22, 2021 at 02:13:34PM -0500, Segher Boessenkool wrote:
> > On Fri, Apr 16, 2021 at 05:44:52PM +1000, Daniel Axtens wrote:
> > > Sathvika Vasireddy  writes:
> > > > +   if ((regs->ccr) & (1 << (31 - ra)))
> > > > +   op->val = -1;
> > > > +   else if ((regs->ccr) & (1 << (30 - ra)))
> > > > +   op->val = 1;
> > > > +   else
> > > > +   op->val = 0;
> > 
> > It imo is clearer if written
> > 
> > if ((regs->ccr << ra) & 0x8000)
> > op->val = -1;
> > else if ((regs->ccr << ra) & 0x4000)
> > op->val = 1;
> > else
> > op->val = 0;
> > 
> > but I guess not everyone agrees :-)
> 
> But this can be made jump free :-):
> 
>   int tmp = regs->ccr << ra;
>   op->val = (tmp >> 31) | ((tmp >> 30) & 1);

The compiler will do so automatically (or think of some better way to
get the same result); in source code, what matters most is readability,
or clarity in general (also clarity to the compiler).

(Right shifts of negative numbers are implementation-defined in C,
fwiw -- but work like you expect in GCC).

> (IIRC the srawi instruction sign-extends its result to 64 bits).

If you consider it to work on 32-bit inputs, yeah, that is a simple way
to express it.

> > > CR field:  76543210
> > > bit:  0123 0123 0123 0123 0123 0123 0123 0123
> > > normal bit #: 0.31
> > > ibm bit #:   31.0
> > 
> > The bit numbers in CR fields are *always* numbered left-to-right.  I
> > have never seen anyone use LE for it, anyway.
> > 
> > Also, even people who write LE have the bigger end on the left normally
> > (they just write some things right-to-left, and other things
> > left-to-right).
> 
> Around 1985, I had a documentation for the the National's 32032
> (little-endian) processor family, and all the instruction encodings were
> presented with the LSB on the left and MSB on the right.

Ouch!  Did they write "regular" numbers with the least significant digit
on the left as well?

> BTW on these processors, the immediate operands and the offsets (1, 2 or
> 4 bytes) for the addressing modes were encoded in big-endian byte order,
> but I digress. Consistency is overrated ;-)

Inconsistency is the spice of life, yeah :-)


Segher


Re: [PATCH 1/2] powerpc/sstep: Add emulation support for ‘setb’ instruction

2021-04-22 Thread Segher Boessenkool
Hi!

On Fri, Apr 16, 2021 at 05:44:52PM +1000, Daniel Axtens wrote:
> Sathvika Vasireddy  writes:
> Ok, if I've understood correctly...
> 
> > +   ra = ra & ~0x3;
> 
> This masks off the bits of RA that are not part of BTF:
> 
> ra is in [0, 31] which is [0b0, 0b1]
> Then ~0x3 = ~0b00011
> ra = ra & 0b11100
> 
> This gives us then,
> ra = btf << 2; or
> btf = ra >> 2;

Yes.  In effect, you want the offset in bits of the CR field, which is
just fine like this.  But a comment would not hurt.

> Let's then check to see if your calculations read the right fields.
> 
> > +   if ((regs->ccr) & (1 << (31 - ra)))
> > +   op->val = -1;
> > +   else if ((regs->ccr) & (1 << (30 - ra)))
> > +   op->val = 1;
> > +   else
> > +   op->val = 0;

It imo is clearer if written

if ((regs->ccr << ra) & 0x8000)
op->val = -1;
else if ((regs->ccr << ra) & 0x4000)
op->val = 1;
else
op->val = 0;

but I guess not everyone agrees :-)

> CR field:  76543210
> bit:  0123 0123 0123 0123 0123 0123 0123 0123
> normal bit #: 0.31
> ibm bit #:   31.0

The bit numbers in CR fields are *always* numbered left-to-right.  I
have never seen anyone use LE for it, anyway.

Also, even people who write LE have the bigger end on the left normally
(they just write some things right-to-left, and other things
left-to-right).

> Checkpatch does have one complaint:
> 
> CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'regs->ccr'
> #30: FILE: arch/powerpc/lib/sstep.c:1971:
> + if ((regs->ccr) & (1 << (31 - ra)))
> 
> I don't really mind the parenteses: I think you are safe to ignore
> checkpatch here unless someone else complains :)

I find them annoying.  If there are too many parentheses, it is hard to
see at a glance what groups where.  Also, a suspicious reader might
think there is something special going on (with macros for example).

This is simple code of course, but :-)

> If you do end up respinning the patch, I think it would be good to make
> the maths a bit clearer. I think it works because a left shift of 2 is
> the same as multiplying by 4, but it would be easier to follow if you
> used a temporary variable for btf.

It is very simple.  The BFA instruction field is closely related to the
BI instruction field, which is 5 bits, and selects one of the 32 bits in
the CR.  If you have "BFA00 BFA01 BFA10 BFA11", that gives the bit
numbers of all four bits in the selected CR field.  So the "& ~3" does
all you need.  It is quite pretty :-)


Segher


Re: [PATCH] powerpc: Initialize local variable fdt to NULL in elf64_load()

2021-04-22 Thread Segher Boessenkool
On Thu, Apr 22, 2021 at 08:05:27AM +, David Laight wrote:
> > (Does anyone - and by anyone I mean any large distro - compile with
> > local variables inited by the compiler?)
> 
> There are compilers that initialise locals to zero for 'debug' builds
> and leave the 'random' for optimised 'release' builds.
> Lets not test what we are releasing!

Yeah, that's the worst of all possible worlds.

> I also think there is a new option to gcc (or clang?) to initialise
> on-stack structures and arrays to ensure garbage isn't passed.
> That seems to be a horrid performance hit!
> Especially in userspace where large stack allocations are almost free.
> 
> Any auto-initialise ought to be with a semi-random value
> (especially not zero) so that it is never right and doesn't
> lead to lazy coding.

Many compilers did something like this, decades ago -- for debug builds.


Segher


Re: [PATCH 2/2] powerpc: add ALTIVEC support to lib/ when PPC_FPU not set

2021-04-19 Thread Segher Boessenkool
On Mon, Apr 19, 2021 at 03:38:02PM +0200, Christophe Leroy wrote:
> Le 19/04/2021 à 15:32, Segher Boessenkool a écrit :
> >On Sun, Apr 18, 2021 at 01:17:26PM -0700, Randy Dunlap wrote:
> >>Add ldstfp.o to the Makefile for CONFIG_ALTIVEC and add
> >>externs for get_vr() and put_vr() in lib/sstep.c to fix the
> >>build errors.
> >
> >>  obj-$(CONFIG_PPC_FPU) += ldstfp.o
> >>+obj-$(CONFIG_ALTIVEC)  += ldstfp.o
> >
> >It is probably a good idea to split ldstfp.S into two, one for each of
> >the two configuration options?
> >
> 
> Or we can build it all the time and #ifdef the FPU part.
> 
> Because it contains FPU, ALTIVEC and VSX stuff.

So it becomes an empty object file if none of the options are selected?
Good idea :-)


Segher


Re: [PATCH 2/2] powerpc: add ALTIVEC support to lib/ when PPC_FPU not set

2021-04-19 Thread Segher Boessenkool
Hi!

On Sun, Apr 18, 2021 at 01:17:26PM -0700, Randy Dunlap wrote:
> Add ldstfp.o to the Makefile for CONFIG_ALTIVEC and add
> externs for get_vr() and put_vr() in lib/sstep.c to fix the
> build errors.

>  obj-$(CONFIG_PPC_FPU)+= ldstfp.o
> +obj-$(CONFIG_ALTIVEC)+= ldstfp.o

It is probably a good idea to split ldstfp.S into two, one for each of
the two configuration options?


Segher


Re: PPC_FPU, ALTIVEC: enable_kernel_fp, put_vr, get_vr

2021-04-18 Thread Segher Boessenkool
On Sun, Apr 18, 2021 at 06:24:29PM +0200, Christophe Leroy wrote:
> Le 17/04/2021 à 22:17, Randy Dunlap a écrit :
> >Should the code + Kconfigs/Makefiles handle that kind of
> >kernel config or should ALTIVEC always mean PPC_FPU as well?
> 
> As far as I understand, Altivec is completely independant of FPU in Theory. 

And, as far as the hardware is concerned, in practice as well.

> So it should be possible to use Altivec without using FPU.

Yup.

> However, until recently, it was not possible to de-activate FPU support on 
> book3s/32. I made it possible in order to reduce unneccessary processing on 
> processors like the 832x that has no FPU.

The processor has to implement FP to be compliant to any version of
PowerPC, as far as I know?  So that is all done by emulation, including
all the registers?  Wow painful.

> As far as I can see in cputable.h/.c, 832x is the only book3s/32 without 
> FPU, and it doesn't have ALTIVEC either.

602 doesn't have double-precision hardware, also no 64-bit FP registers.
But that CPU was never any widely used :-)

> So we can in the future ensure that Altivec can be used without FPU 
> support, but for the time being I think it is OK to force selection of FPU 
> when selecting ALTIVEC in order to avoid build failures.

It is useful to allow MSR[VEC,FP]=1,0 but yeah there are no CPUs that
have VMX (aka AltiVec) but that do not have FP.  I don't see how making
that artificial dependency buys anything, but maybe it does?

> >I have patches to fix the build errors with the config as
> >reported but I don't know if that's the right thing to do...

Neither do we, we cannot see those patches :-)


Segher


Re: [PATCH v1 1/2] powerpc/bitops: Use immediate operand when possible

2021-04-14 Thread Segher Boessenkool
On Wed, Apr 14, 2021 at 03:32:04PM +, David Laight wrote:
> From: Segher Boessenkool
> > Sent: 14 April 2021 16:19
> ...
> > > Could the kernel use GCC builtin atomic functions instead ?
> > >
> > > https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> > 
> > Certainly that should work fine for the simpler cases that the atomic
> > operations are meant to provide.  But esp. for not-so-simple cases the
> > kernel may require some behaviour provided by the existing assembler
> > implementation, and not by the atomic builtins.
> > 
> > I'm not saying this cannot work, just that some serious testing will be
> > needed.  If it works it should be the best of all worlds, so then it is
> > a really good idea yes :-)
> 
> I suspect they just add an extra layer of abstraction that makes it
> even more difficult to verify and could easily get broken by a compiler
> update (etc).

I would say it uses an existing facility, instead of creating a kernel-
specific one.

> The other issue is that the code needs to be correct with compiled
> with (for example) -O0.
> That could very easily break anything except the asm implementation
> if additional memory accesses and/or increased code size cause grief.

The compiler generates correct code.  New versions of the compiler or
old, -O0 or not, under any phase of the moon.

Of course sometimes the compiler is broken, but there are pre-existing
ways of dealing with that, and there is no reason at all to think this
would break more often than random other code.


Segher


Re: [PATCH v1 1/2] powerpc/bitops: Use immediate operand when possible

2021-04-14 Thread Segher Boessenkool
On Wed, Apr 14, 2021 at 02:42:51PM +0200, Christophe Leroy wrote:
> Le 14/04/2021 à 14:24, Segher Boessenkool a écrit :
> >On Wed, Apr 14, 2021 at 12:01:21PM +1000, Nicholas Piggin wrote:
> >>Would be nice if we could let the compiler deal with it all...
> >>
> >>static inline unsigned long lr(unsigned long *mem)
> >>{
> >> unsigned long val;
> >>
> >> /*
> >>  * This doesn't clobber memory but want to avoid memory 
> >>  operations
> >>  * moving ahead of it
> >>  */
> >> asm volatile("ldarx %0, %y1" : "=r"(val) : "Z"(*mem) : 
> >> "memory");
> >>
> >> return val;
> >>}
> >
> >(etc.)
> >
> >That can not work reliably: the compiler can put random instructions
> >between the larx and stcx. this way, and you then do not have guaranteed
> >forward progress anymore.  It can put the two in different routines
> >(after inlining and other interprocedural optimisations), duplicate
> >them, make a different number of copies of them, etc.
> >
> >Nothing of that is okay if you want to guarantee forward progress on all
> >implementations, and also not if you want to have good performance
> >everywhere (or anywhere even).  Unfortunately you have to write all
> >larx/stcx. loops as one block of assembler, so that you know exactly
> >what instructions will end up in your binary.
> >
> >If you don't, it will fail mysteriously after random recompilations, or
> >have performance degradations, etc.  You don't want to go there :-)
> >
> 
> Could the kernel use GCC builtin atomic functions instead ?
> 
> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

Certainly that should work fine for the simpler cases that the atomic
operations are meant to provide.  But esp. for not-so-simple cases the
kernel may require some behaviour provided by the existing assembler
implementation, and not by the atomic builtins.

I'm not saying this cannot work, just that some serious testing will be
needed.  If it works it should be the best of all worlds, so then it is
a really good idea yes :-)


Segher


Re: [PATCH v1 1/2] powerpc/bitops: Use immediate operand when possible

2021-04-14 Thread Segher Boessenkool
On Wed, Apr 14, 2021 at 12:01:21PM +1000, Nicholas Piggin wrote:
> Would be nice if we could let the compiler deal with it all...
> 
> static inline unsigned long lr(unsigned long *mem)
> {
> unsigned long val;
> 
> /*
>  * This doesn't clobber memory but want to avoid memory operations
>  * moving ahead of it
>  */
> asm volatile("ldarx %0, %y1" : "=r"(val) : "Z"(*mem) : "memory");
> 
> return val;
> }

(etc.)

That can not work reliably: the compiler can put random instructions
between the larx and stcx. this way, and you then do not have guaranteed
forward progress anymore.  It can put the two in different routines
(after inlining and other interprocedural optimisations), duplicate
them, make a different number of copies of them, etc.

Nothing of that is okay if you want to guarantee forward progress on all
implementations, and also not if you want to have good performance
everywhere (or anywhere even).  Unfortunately you have to write all
larx/stcx. loops as one block of assembler, so that you know exactly
what instructions will end up in your binary.

If you don't, it will fail mysteriously after random recompilations, or
have performance degradations, etc.  You don't want to go there :-)


Segher


Re: [PATCH v1 1/2] powerpc/bitops: Use immediate operand when possible

2021-04-13 Thread Segher Boessenkool
On Tue, Apr 13, 2021 at 06:33:19PM +0200, Christophe Leroy wrote:
> Le 12/04/2021 à 23:54, Segher Boessenkool a écrit :
> >On Thu, Apr 08, 2021 at 03:33:44PM +, Christophe Leroy wrote:
> >>For clear bits, on 32 bits 'rlwinm' can be used instead or 'andc' for
> >>when all bits to be cleared are consecutive.
> >
> >Also on 64-bits, as long as both the top and bottom bits are in the low
> >32-bit half (for 32 bit mode, it can wrap as well).
> 
> Yes. But here we are talking about clearing a few bits, all other ones must 
> remain unchanged. An rlwinm on PPC64 will always clear the upper part, 
> which is unlikely what we want.

No, it does not.  It takes the low 32 bits of the source reg, duplicated
to the top half as well, then rotated, then ANDed with the mask (which
can wrap around).  This isn't very often very useful, but :-)

(One useful operation is splatting 32 bits to both halves of a 64-bit
register, which is just rlwinm d,s,0,1,0).

If you only look at the low 32 bits, it does exactly the same as on
32-bit implementations.

> >>For the time being only
> >>handle the single bit case, which we detect by checking whether the
> >>mask is a power of two.
> >
> >You could look at rs6000_is_valid_mask in GCC:
> >   
> > <https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/rs6000/rs6000.c;h=48b8efd732b251c059628096314848305deb0c0b;hb=HEAD#l11148>
> >used by rs6000_is_valid_and_mask immediately after it.  You probably
> >want to allow only rlwinm in your case, and please note this checks if
> >something is a valid mask, not the inverse of a valid mask (as you
> >want here).
> 
> This check looks more complex than what I need. It is used for both rlw... 
> and rld..., and it calculates the operants.  The only thing I need is to 
> validate the mask.

It has to do exactly the same thing for rlwinm as for all 64-bit
variants (rldicl, rldicr, rldic).

One side effect of calculation the bit positions with exact_log2 is that
that returns negative if the argument is not a power of two.

Here is a simpler way, that handles all cases:  input in "u32 val":

if (!val)
return nonono;
if (val & 1)
val = ~val; // make the mask non-wrapping
val += val & -val;  // adding the low set bit should result in
// at most one bit set
if (!(val & (val - 1)))
return okidoki_all_good;

> I found a way: By anding the mask with the complement of itself rotated by 
> left bits to 1, we identify the transitions from 0 to 1. If the result is a 
> power of 2, it means there's only one transition so the mask is as expected.

That does not handle all cases (it misses all bits set at least).  Which
isn't all that interesting of course, but is a valid mask (but won't
clear any bits, so not too interesting for your specific case :-) )


Segher


Re: [PATCH v2 1/1] powerpc/iommu: Enable remaining IOMMU Pagesizes present in LoPAR

2021-04-12 Thread Segher Boessenkool
On Fri, Apr 09, 2021 at 02:36:16PM +1000, Alexey Kardashevskiy wrote:
> On 08/04/2021 19:04, Michael Ellerman wrote:
> +#define QUERY_DDW_PGSIZE_4K  0x01
> +#define QUERY_DDW_PGSIZE_64K 0x02
> +#define QUERY_DDW_PGSIZE_16M 0x04
> +#define QUERY_DDW_PGSIZE_32M 0x08
> +#define QUERY_DDW_PGSIZE_64M 0x10
> +#define QUERY_DDW_PGSIZE_128M0x20
> +#define QUERY_DDW_PGSIZE_256M0x40
> +#define QUERY_DDW_PGSIZE_16G 0x80
> >>>
> >>>I'm not sure the #defines really gain us much vs just putting the
> >>>literal values in the array below?
> >>
> >>Then someone says "u magic values" :) I do not mind either way. 
> >>Thanks,
> >
> >Yeah that's true. But #defining them doesn't make them less magic, if
> >you only use them in one place :)
> 
> Defining them with "QUERY_DDW" in the names kinda tells where they are 
> from. Can also grep QEMU using these to see how the other side handles 
> it. Dunno.

And *not* defining anything reduces the mental load a lot.  You can add
a comment at the single spot you use them, explaining what this is, in a
much better way!

Comments are *good*.


Segher


Re: [PATCH v1 2/2] powerpc/atomics: Use immediate operand when possible

2021-04-12 Thread Segher Boessenkool
Hi!

On Thu, Apr 08, 2021 at 03:33:45PM +, Christophe Leroy wrote:
> +#define ATOMIC_OP(op, asm_op, dot, sign) \
>  static __inline__ void atomic_##op(int a, atomic_t *v)   
> \
>  {\
>   int t;  \
>   \
>   __asm__ __volatile__(   \
>  "1:  lwarx   %0,0,%3 # atomic_" #op "\n" \
> - #asm_op " %0,%2,%0\n"   \
> + #asm_op "%I2" dot " %0,%0,%2\n" \
>  "stwcx.  %0,0,%3 \n" \
>  "bne-1b\n"   \
> - : "=" (t), "+m" (v->counter)  \
> - : "r" (a), "r" (>counter)\
> + : "=" (t), "+m" (v->counter)  \
> + : "r"#sign (a), "r" (>counter)   \
>   : "cc");\
>  }\

You need "b" (instead of "r") only for "addi".  You can use "addic"
instead, which clobbers XER[CA], but *all* inline asm does, so that is
not a downside here (it is also not slower on any CPU that matters).

> @@ -238,14 +238,14 @@ static __inline__ int atomic_fetch_add_unless(atomic_t 
> *v, int a, int u)
>  "1:  lwarx   %0,0,%1 # atomic_fetch_add_unless\n\
>   cmpw0,%0,%3 \n\
>   beq 2f \n\
> - add %0,%2,%0 \n"
> + add%I2  %0,%0,%2 \n"
>  "stwcx.  %0,0,%1 \n\
>   bne-    1b \n"
>   PPC_ATOMIC_EXIT_BARRIER
> -"subf%0,%2,%0 \n\
> +"sub%I2  %0,%0,%2 \n\
>  2:"
> - : "=" (t)
> - : "r" (>counter), "r" (a), "r" (u)
> + : "=" (t)
> + : "r" (>counter), "rI" (a), "r" (u)
>   : "cc", "memory");

Same here.

Nice patches!

Acked-by: Segher Boessenkool 


Segher


Re: [PATCH v1 1/2] powerpc/bitops: Use immediate operand when possible

2021-04-12 Thread Segher Boessenkool
Hi!

On Thu, Apr 08, 2021 at 03:33:44PM +, Christophe Leroy wrote:
> For clear bits, on 32 bits 'rlwinm' can be used instead or 'andc' for
> when all bits to be cleared are consecutive.

Also on 64-bits, as long as both the top and bottom bits are in the low
32-bit half (for 32 bit mode, it can wrap as well).

> For the time being only
> handle the single bit case, which we detect by checking whether the
> mask is a power of two.

You could look at rs6000_is_valid_mask in GCC:
  

used by rs6000_is_valid_and_mask immediately after it.  You probably
want to allow only rlwinm in your case, and please note this checks if
something is a valid mask, not the inverse of a valid mask (as you
want here).

So yes this is pretty involved :-)

Your patch looks good btw.  But please use "n", not "i", as constraint?


Segher


Re: [PATCH v3] powerpc/traps: Enhance readability for trap types

2021-04-10 Thread Segher Boessenkool
On Sat, Apr 10, 2021 at 11:42:41AM +0200, Christophe Leroy wrote:
> Le 10/04/2021 à 02:04, Michael Ellerman a écrit :
> >I think these can all be avoided by defining most of the values
> >regardless of what platform we're building for. Only the values that
> >overlap need to be kept behind an ifdef.
> 
> Even if values overlap we can keep multiple definitions for the same value.

That works, but it can lead to puzzling bugs.  Of course we all *like*
working on more challenging bugs, but :-)


Segher


Re: [PATCH v3] powerpc/traps: Enhance readability for trap types

2021-04-09 Thread Segher Boessenkool
On Fri, Apr 09, 2021 at 06:14:19PM +0200, Christophe Leroy wrote:
> >+#define INTERRUPT_SYSTEM_RESET0x100
> 
> INT_SRESET

SRESET exists on many PowerPC, it means "soft reset".  Not the same
thing at all.

I think "INT" is not a great prefix fwiw, there are many things you can
abbr to "INT".

> >+#define INTERRUPT_DATA_SEGMENT0x380
> 
> INT_DSEG

exceptions-64s.S calls this "DSLB" (I remember "DSSI" though -- but neither
is a very official name).  It probably is a good idea to look at that
existing code, not make up even more new names :-)

> >+#define INTERRUPT_DOORBELL0xa00
> 
> INT_DBELL

That saves three characters and makes it very not understandable.


Segher


Re: [PATCH 2/2] powerpc: make 'boot_text_mapped' static

2021-04-09 Thread Segher Boessenkool
Hi!

On Thu, Apr 08, 2021 at 07:04:35AM +0200, Christophe Leroy wrote:
> Le 08/04/2021 à 03:18, Yu Kuai a écrit :
> >-int boot_text_mapped __force_data = 0;
> >+static int boot_text_mapped __force_data;
> 
> Are you sure the initialisation to 0 can be removed ? Usually 
> initialisation to 0 is not needed because not initialised variables go in 
> the BSS section which is zeroed at startup. But here the variable is 
> flagged with __force_data so it is not going in the BSS section.

Any non-automatic (i.e. function-scope, not static) variable is
initialised to 0.  See e.g. C11 6.7.9/10 (this has been like that since
times immemorial, C90 anyway).


Segher


Re: [PATCH] powerpc/32: Remove powerpc specific definition of 'ptrdiff_t'

2021-04-05 Thread Segher Boessenkool
On Mon, Apr 05, 2021 at 09:57:27AM +, Christophe Leroy wrote:
> For unknown reason, old commit d27dfd388715 ("Import pre2.0.8")
> changed 'ptrdiff_t' from 'int' to 'long'.
> 
> GCC expects it as 'int' really,

It isn't actually defined in the ABI as far as I can see (neither the
old document or the new one), but GCC has defined it as "int" since
forever (which was in 1995), for anything using the SYSV ABI (which
includes powerpc-linux).

>  defines it as 'int', and
> defines 'size_t' and 'ssize_t' exactly as powerpc do, so
> remove the powerpc specific definitions and fallback on
> generic ones.
> 
> Signed-off-by: Christophe Leroy 

Thanks!

Acked-by: Segher Boessenkool 


Segher


Re: [PATCH v2] powerpc/traps: Enhance readability for trap types

2021-04-01 Thread Segher Boessenkool
On Thu, Apr 01, 2021 at 06:01:29PM +1000, Nicholas Piggin wrote:
> Excerpts from Michael Ellerman's message of April 1, 2021 12:39 pm:
> > Segher Boessenkool  writes:
> >> On Wed, Mar 31, 2021 at 08:58:17PM +1100, Michael Ellerman wrote:
> >>> So perhaps:
> >>> 
> >>>   EXC_SYSTEM_RESET
> >>>   EXC_MACHINE_CHECK
> >>>   EXC_DATA_STORAGE
> >>>   EXC_DATA_SEGMENT
> >>>   EXC_INST_STORAGE
> >>>   EXC_INST_SEGMENT
> >>>   EXC_EXTERNAL_INTERRUPT
> >>>   EXC_ALIGNMENT
> >>>   EXC_PROGRAM_CHECK
> >>>   EXC_FP_UNAVAILABLE
> >>>   EXC_DECREMENTER
> >>>   EXC_HV_DECREMENTER
> >>>   EXC_SYSTEM_CALL
> >>>   EXC_HV_DATA_STORAGE
> >>>   EXC_PERF_MONITOR
> >>
> >> These are interrupt (vectors), not exceptions.  It doesn't matter all
> >> that much, but confusing things more isn't useful either!  There can be
> >> multiple exceptions that all can trigger the same interrupt.
> > 
> > Yeah I know, but I think that ship has already sailed as far as the
> > naming we have in the kernel.
> 
> It has, but there are also several other ships also sailing in different 
> directions. It could be worse though, at least they are not sideways in 
> the Suez.

:-)

> > We have over 250 uses of "exc", and several files called "exception"
> > something.
> > 
> > Using "interrupt" can also be confusing because Linux uses that to mean
> > "external interrupt".
> > 
> > But I dunno, maybe INT or VEC is clearer? .. or TRAP :)
> 
> We actually already have defines that follow Segher's suggestion, it's 
> just that they're hidden away in a KVM header.
> 
> #define BOOK3S_INTERRUPT_SYSTEM_RESET   0x100
> #define BOOK3S_INTERRUPT_MACHINE_CHECK  0x200
> #define BOOK3S_INTERRUPT_DATA_STORAGE   0x300
> #define BOOK3S_INTERRUPT_DATA_SEGMENT   0x380
> #define BOOK3S_INTERRUPT_INST_STORAGE   0x400
> #define BOOK3S_INTERRUPT_INST_SEGMENT   0x480
> #define BOOK3S_INTERRUPT_EXTERNAL   0x500
> #define BOOK3S_INTERRUPT_EXTERNAL_HV0x502
> #define BOOK3S_INTERRUPT_ALIGNMENT  0x600
> 
> It would take just a small amount of work to move these to general 
> powerpc header, add #ifdefs for Book E/S where the numbers differ,
> and remove the BOOK3S_ prefix.
> 
> I don't mind INTERRUPT_ but INT_ would be okay too. VEC_ actually
> doesn't match what Book E does (which is some weirdness to map some
> of them to match Book S but not all, arguably we should clean that
> up too and just use vector numbers consistently, but the INTERRUPT_
> prefix would still be valid if we did that).

VEC also is pretty incorrect: there is code at those addresses, not
vectors pointing to code (as similar things on some other architectures
have).  Everyone understands what it means of course, except it is
confusing with a thing we *do* have on Power called VEC (the MSR bit) :-P

(And TRAP is just one cause of 700...)


Segher


Re: [PATCH v2] powerpc/traps: Enhance readability for trap types

2021-04-01 Thread Segher Boessenkool
On Thu, Apr 01, 2021 at 10:55:58AM +0800, Xiongwei Song wrote:
> Segher Boessenkool  于2021年4月1日周四 上午6:15写道:
> 
> > On Wed, Mar 31, 2021 at 08:58:17PM +1100, Michael Ellerman wrote:
> > > So perhaps:
> > >
> > >   EXC_SYSTEM_RESET
> > >   EXC_MACHINE_CHECK
> > >   EXC_DATA_STORAGE
> > >   EXC_DATA_SEGMENT
> > >   EXC_INST_STORAGE
> > >   EXC_INST_SEGMENT
> > >   EXC_EXTERNAL_INTERRUPT
> > >   EXC_ALIGNMENT
> > >   EXC_PROGRAM_CHECK
> > >   EXC_FP_UNAVAILABLE
> > >   EXC_DECREMENTER
> > >   EXC_HV_DECREMENTER
> > >   EXC_SYSTEM_CALL
> > >   EXC_HV_DATA_STORAGE
> > >   EXC_PERF_MONITOR
> >
> > These are interrupt (vectors), not exceptions.  It doesn't matter all
> > that much, but confusing things more isn't useful either!  There can be
> > multiple exceptions that all can trigger the same interrupt.
> >
> >  When looking at the reference manual of e500 and e600 from NXP
>  official, they call them as interrupts.While looking at the "The
> Programming Environments"
>  that is also from NXP, they call them exceptions. Looks like there is
>  no explicit distinction between interrupts and exceptions.

The architecture documents have always called it interrupts.  The PEM
says it calls them exceptions instead, but they are called interrupts in
the architecture (and the PEM says that, too).

> Here is the "The Programming Environments" link:
> https://www.nxp.com.cn/docs/en/user-guide/MPCFPE_AD_R1.pdf

That document is 24 years old.  The architecture is still published,
new versions regularly.

> As far as I know, the values of interrupts or exceptions above are defined
> explicitly in reference manual or the programming environments.

They are defined in the architecture.

> Could
> you please provide more details about multiple exceptions with the same
> interrupts?

The simplest example is 700, program interrupt.  There are many causes
for it, including all the exceptions in FPSCR: VX, ZX, OX, UX, XX, and
VX is actually divided into nine separate cases itself.  There also are
the various causes of privileged instruction type program interrupts,
and  the trap type program interrupt, but the FEX ones are most obvious
here.


Segher


Re: [PATCH v2] powerpc/traps: Enhance readability for trap types

2021-03-31 Thread Segher Boessenkool
On Wed, Mar 31, 2021 at 08:58:17PM +1100, Michael Ellerman wrote:
> So perhaps:
> 
>   EXC_SYSTEM_RESET
>   EXC_MACHINE_CHECK
>   EXC_DATA_STORAGE
>   EXC_DATA_SEGMENT
>   EXC_INST_STORAGE
>   EXC_INST_SEGMENT
>   EXC_EXTERNAL_INTERRUPT
>   EXC_ALIGNMENT
>   EXC_PROGRAM_CHECK
>   EXC_FP_UNAVAILABLE
>   EXC_DECREMENTER
>   EXC_HV_DECREMENTER
>   EXC_SYSTEM_CALL
>   EXC_HV_DATA_STORAGE
>   EXC_PERF_MONITOR

These are interrupt (vectors), not exceptions.  It doesn't matter all
that much, but confusing things more isn't useful either!  There can be
multiple exceptions that all can trigger the same interrupt.


Segher


Re: [PATCH v7] powerpc/irq: Inline call_do_irq() and call_do_softirq()

2021-03-25 Thread Segher Boessenkool
On Wed, Mar 24, 2021 at 11:26:01PM +1100, Michael Ellerman wrote:
> Christophe Leroy  writes:
> > Hmm. It is the first time we use named parameters in powerpc assembly, 
> > isn't it ?

> Yeah I'd like us to use it more, I think it helps readability a lot.

..in some cases.  Not in most cases :-(

> > Wondering, how would the below look like with named parameters (from 
> > __put_user_asm2_goto) ?
> >
> > stw%X1 %L0, %L1
> 
> Not sure, possibly that's too complicated for it :)

asm("stw%X[name1] %L[name0],%L[name1]" :: [name0]"r"(x), [name1]"m"(p));

Yes, it is not more readable *at all*.


Segher


Re: [RFC PATCH 6/8] powerpc/mm/book3s64/hash: drop pre 2.06 tlbiel for clang

2021-03-24 Thread Segher Boessenkool
On Wed, Mar 24, 2021 at 10:51:05AM -0500, Segher Boessenkool wrote:
> The variants with fewer operands have those coded as 0 in the
> instruction.  All of this is backwards compatible.

I was reminded that this is broken in ISA 2.03.  Ouch.


Segher


Re: [RFC PATCH 6/8] powerpc/mm/book3s64/hash: drop pre 2.06 tlbiel for clang

2021-03-24 Thread Segher Boessenkool
Hi!

On Tue, Mar 23, 2021 at 04:11:10AM +1000, Nicholas Piggin wrote:
> The problem in this file is we generate 3 different tlbie and tlbiel
> instructions with the same mnemonic corresponding to different ISA
> versions.
> 
> This might actually be the one good place to use .machine to make sure 
> the assembler generates the right thing.

Yes, but then hide that in some macro.

(And "the one good place"?  I protest!)

> I'm not entirely sure it is
> foolproof because some of the times the instruction variant is inferred
> by the number of arguments it has yet arguments can be implicit. PPC_
> define would be exactly explicit.

The variants with fewer operands have those coded as 0 in the
instruction.  All of this is backwards compatible.

> But if it can be made reasonably robust with .machine then I'd be okay
> with that too.

Since you should do a macro (or inline) for it anyway, you could just
do .long, all the nastiness is in one place anyway then, it won't make
much difference what you do.  It should be documented there as well :-)


Segher


Re: [PATCH] powerpc/kexec: Don't use .machine ppc64 in trampoline_64.S

2021-03-15 Thread Segher Boessenkool
Hi!

On Mon, Mar 15, 2021 at 02:41:59PM +1100, Michael Ellerman wrote:
> The ".machine" directive allows changing the machine for which code is
> being generated. It's equivalent to passing an -mcpu option on the
> command line.
> 
> Although it can be useful, it's generally a bad idea because it adds
> another way to influence code generation separate from the flags
> passed via the build system. ie. if we need to build different pieces
> of code with different flags we should do that via our Makefiles, not
> using ".machine".

It does not influence code generation.  It says which instructions are
valid, instead.  There are a few cases where the same mnemonic will
generate a different binary encoding depending on machine selected,
maybe you mean that?

It is *normal* to use .machine push/pop and a specific .machine around
instructions that require a machine other than what you are building
for.  The compiler does this itself, and it is the recommended way to
use "foreign" instructions in inline assembler.

That said...

> However as best as I can tell the ".machine" directive in
> trampoline_64.S is not necessary at all.
> 
> It was added in commit 0d97631392c2 ("powerpc: Add purgatory for
> kexec_file_load() implementation."), which created the file based on
> the kexec-tools purgatory. It may be/have-been necessary in the
> kexec-tools version, but we have a completely different build system,
> and we already pass the desired CPU flags, eg:
> 
>   gcc ... -m64 -Wl,-a64 -mabi=elfv2 -Wa,-maltivec -Wa,-mpower4 -Wa,-many
>   ... arch/powerpc/purgatory/trampoline_64.S
> 
> So drop the ".machine" directive and rely on the assembler flags.

> - .machine ppc64

Please make sure to test this on a big endian config.

A ppc64le-linux assembler defaults to power8.  A ppc64-linux assembler
defaults to power3 (that is the same as .machine ppc64).  Or maybe it
makes it power4?  I get lost :-)

It certainly *should* work, but, test please :-)

(And with a *default* powerpc64-linux config, not one that defaults to
power7 or power8 or similar!  Arnd's toolchains at
<https://mirrors.edge.kernel.org/pub/tools/crosstool/>
are fine for this.)


Reviewed-by: Segher Boessenkool 


Segher


Re: [PATCH] powerpc/vdso32: Add missing _restgpr_31_x to fix build failure

2021-03-15 Thread Segher Boessenkool
On Mon, Mar 15, 2021 at 04:38:52PM +, David Laight wrote:
> From: Rasmus Villemoes
> > Sent: 15 March 2021 16:24
> > On 12/03/2021 03.29, Segher Boessenkool wrote:
> > > On Tue, Mar 09, 2021 at 06:19:30AM +, Christophe Leroy wrote:
> > >> With some defconfig including CONFIG_CC_OPTIMIZE_FOR_SIZE,
> > >> (for instance mvme5100_defconfig and ps3_defconfig), gcc 5
> > >> generates a call to _restgpr_31_x.
> > >
> > >> I don't know if there is a way to tell GCC not to emit that call, 
> > >> because at the end we get more
> > instructions than needed.
> > >
> > > The function is required by the ABI, you need to have it.
> > >
> > > You get *fewer* insns statically, and that is what -Os is about: reduce
> > > the size of the binaries.
> > 
> > Is there any reason to not just always build the vdso with -O2? It's one
> > page/one VMA either way, and the vdso is about making certain system
> > calls cheaper, so if unconditional -O2 could save a few cycles compared
> > to -Os, why not? (And if, as it seems, there's only one user within the
> > DSO of _restgpr_31_x, yes, the overall size of the .text segment
> > probably increases slightly).
> 
> Sometimes -Os generates such horrid code you really never want to use it.
> A classic is on x86 where it replaces 'load register with byte constant'
> with 'push byte' 'pop register'.
> The code is actually smaller but the execution time is horrid.
> 
> There are also cases where -O2 actually generates smaller code.

Yes, as with all heuristics it doesn't always work out.  But usually -Os
is smaller.

> Although you may need to disable loop unrolling (often dubious at best)
> and either force or disable some function inlining.

The cases where GCC does loop unrolling at -O2 always help quite a lot.
Or, do you have a counter-example?  We'd love to see one.

And yup, inlining is hard.  GCC's heuristics there are very good
nowadays, but any single decision has big effects.  Doing the important
spots manually (always_inline or noinline) has good payoff.


Segher


Re: [PATCH] powerpc/vdso32: Add missing _restgpr_31_x to fix build failure

2021-03-15 Thread Segher Boessenkool
On Mon, Mar 15, 2021 at 05:23:44PM +0100, Rasmus Villemoes wrote:
> On 12/03/2021 03.29, Segher Boessenkool wrote:
> > On Tue, Mar 09, 2021 at 06:19:30AM +, Christophe Leroy wrote:
> >> With some defconfig including CONFIG_CC_OPTIMIZE_FOR_SIZE,
> >> (for instance mvme5100_defconfig and ps3_defconfig), gcc 5
> >> generates a call to _restgpr_31_x.
> > 
> >> I don't know if there is a way to tell GCC not to emit that call, because 
> >> at the end we get more instructions than needed.
> > 
> > The function is required by the ABI, you need to have it.
> > 
> > You get *fewer* insns statically, and that is what -Os is about: reduce
> > the size of the binaries.
> 
> Is there any reason to not just always build the vdso with -O2? It's one
> page/one VMA either way, and the vdso is about making certain system
> calls cheaper, so if unconditional -O2 could save a few cycles compared
> to -Os, why not? (And if, as it seems, there's only one user within the
> DSO of _restgpr_31_x, yes, the overall size of the .text segment
> probably increases slightly).

You can use exactly the same reasoning for using -O2 instead of -Os
anywhere else.

-Os doesn't mean "smaller code, but only where that is reasonable".  It
means "smaller code".


Segher


Re: [PATCH] powerpc/vdso32: Add missing _restgpr_31_x to fix build failure

2021-03-11 Thread Segher Boessenkool
Hi!

On Tue, Mar 09, 2021 at 06:19:30AM +, Christophe Leroy wrote:
> With some defconfig including CONFIG_CC_OPTIMIZE_FOR_SIZE,
> (for instance mvme5100_defconfig and ps3_defconfig), gcc 5
> generates a call to _restgpr_31_x.

> I don't know if there is a way to tell GCC not to emit that call, because at 
> the end we get more instructions than needed.

The function is required by the ABI, you need to have it.

You get *fewer* insns statically, and that is what -Os is about: reduce
the size of the binaries.

(The only reason you get such problems is because Linux does not have
all of libgcc.  You can have that *and* have some symbols cause link
errors, it isn't rocket science).


Segher


Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-10 Thread Segher Boessenkool
Hi!

On Wed, Mar 10, 2021 at 11:32:20AM +, Mark Rutland wrote:
> On Tue, Mar 09, 2021 at 04:05:32PM -0600, Segher Boessenkool wrote:
> > On Tue, Mar 09, 2021 at 04:05:23PM +, Mark Rutland wrote:
> > > On Thu, Mar 04, 2021 at 03:54:48PM -0600, Segher Boessenkool wrote:
> > > > On Thu, Mar 04, 2021 at 02:57:30PM +, Mark Rutland wrote:
> > > > > It looks like GCC is happy to give us the function-entry-time FP if 
> > > > > we use
> > > > > __builtin_frame_address(1),
> > > > 
> > > > From the GCC manual:
> > > >  Calling this function with a nonzero argument can have
> > > >  unpredictable effects, including crashing the calling program.  As
> > > >  a result, calls that are considered unsafe are diagnosed when the
> > > >  '-Wframe-address' option is in effect.  Such calls should only be
> > > >  made in debugging situations.
> > > > 
> > > > It *does* warn (the warning is in -Wall btw), on both powerpc and
> > > > aarch64.  Furthermore, using this builtin causes lousy code (it forces
> > > > the use of a frame pointer, which we normally try very hard to optimise
> > > > away, for good reason).
> > > > 
> > > > And, that warning is not an idle warning.  Non-zero arguments to
> > > > __builtin_frame_address can crash the program.  It won't on simpler
> > > > functions, but there is no real definition of what a simpler function
> > > > *is*.  It is meant for debugging, not for production use (this is also
> > > > why no one has bothered to make it faster).
> > > >
> > > > On Power it should work, but on pretty much any other arch it won't.
> > > 
> > > I understand this is true generally, and cannot be relied upon in
> > > portable code. However as you hint here for Power, I believe that on
> > > arm64 __builtin_frame_address(1) shouldn't crash the program due to the
> > > way frame records work on arm64, but I'll go check with some local
> > > compiler folk. I agree that __builtin_frame_address(2) and beyond
> > > certainly can, e.g.  by NULL dereference and similar.
> > 
> > I still do not know the aarch64 ABI well enough.  If only I had time!
> > 
> > > For context, why do you think this would work on power specifically? I
> > > wonder if our rationale is similar.
> > 
> > On most 64-bit Power ABIs all stack frames are connected together as a
> > linked list (which is updated atomically, importantly).  This makes it
> > possible to always find all previous stack frames.
> 
> We have something similar on arm64, where the kernel depends on being
> built with a frame pointer following the AAPCS frame pointer rules.

The huge difference is on Power this is about the stack itself: you do
not need a frame pointer at all for it (there is no specific register
named as frame pointer, even).

> Every stack frame contains a "frame record" *somewhere* within that
> stack frame, and the frame records are chained together as a linked
> list. The frame pointer points at the most recent frame record (and this
> is what __builtin_frame_address(0) returns).

> > See gcc.gnu.org/PR60109 for example.
> 
> Sure; I see that being true generally (and Ramana noted that on 32-bit
> arm a frame pointer wasn't mandated), but I think in this case we have a
> stronger target (and configuration) specific guarantee.

It sounds like it, yes.  You need to have a frame pointer in the ABI,
with pretty strong rules, and have everything follow those rules.

> > Is the frame pointer required?!
> 
> The arm64 Linux port mandates frame pointers for kernel code. It is
> generally possible to build code without frame pointers (e.g. userspace),
> but doing that for kernel code would be a bug.

I see.  And it even is less expensive to do this than on most machines,
because of register pair load/store instructions :-)

> > > > The real way forward is to bite the bullet and to no longer pretend you
> > > > can do a full backtrace from just the stack contents.  You cannot.
> > > 
> > > I think what you mean here is that there's no reliable way to handle the
> > > current/leaf function, right? If so I do agree.
> > 
> > No, I meant what I said.
> > 
> > There is the separate issue that you do not know where the return
> > address (etc.) is stored in a function that has not yet done a call
> > itself, sure.  You cannot assume anything the ABI does not tell you you
> > can depend on.
> 
> This is in the frame record per the AAPCS.

But you do not know where in the function it will store that.  It often
can be optimised by the compiler to only store the LR and FP on paths
where a call will happen later, and there is no way (without DWARF info
or similar) to know whether that has happened yet or not.

This is a well-known problem of course.  For the current function you
cannot know in general if there is an activation frame yet or not.


Segher


Re: PowerPC64 future proof kernel toc, revised for lld

2021-03-10 Thread Segher Boessenkool
On Wed, Mar 10, 2021 at 01:44:57PM +0100, Christophe Leroy wrote:
> 
> 
> Le 10/03/2021 à 13:25, Alan Modra a écrit :
> >On Wed, Mar 10, 2021 at 08:33:37PM +1100, Alexey Kardashevskiy wrote:
> >>One more question - the older version had a construct "DEFINED (.TOC.) ?
> >>.TOC. : ..." in case .TOC. is not defined (too old ld? too old gcc?) but 
> >>the
> >>newer patch seems assuming it is always defined, when was it added? I have
> >>the same check in SLOF, for example, do I still need it?
> >
> >.TOC. symbol support was first added 2012-11-06, so you need
> >binutils-2.24 or later to use .TOC. as a symbol.
> >
> 
> As of today, minimum requirement to build kernel is binutils 2.23, see 
> https://www.kernel.org/doc/html/latest/process/changes.html#current-minimal-requirements

The minimum GCC version required is 4.9, released April 2014, so it
would make sense to require binutils 2.24 at least as well: that was the
last binutils release before the GCC 4.9 release (it was end of 2013).

Generally you should make sure to always have a binutils at least as new
as your GCC (and newer almost always works just fine).


Segher


Re: [PATCH V2 2/2] powerpc/perf: Add platform specific check_attr_config

2021-03-10 Thread Segher Boessenkool
On Thu, Mar 11, 2021 at 12:16:25AM +1100, Alexey Kardashevskiy wrote:
> On 26/02/2021 17:50, Madhavan Srinivasan wrote:
> >+int isa3_X_check_attr_config(struct perf_event *ev)
> 
> "isa300" is used everywhere else to refer to ISA 3.00.

And that itself is a misspelling, there is nothing called "ISA 3.00" (it
is called "ISA 3.0").


Segher


Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-09 Thread Segher Boessenkool
Hi!

On Tue, Mar 09, 2021 at 04:05:23PM +, Mark Rutland wrote:
> On Thu, Mar 04, 2021 at 03:54:48PM -0600, Segher Boessenkool wrote:
> > On Thu, Mar 04, 2021 at 02:57:30PM +, Mark Rutland wrote:
> > > It looks like GCC is happy to give us the function-entry-time FP if we use
> > > __builtin_frame_address(1),
> > 
> > From the GCC manual:
> >  Calling this function with a nonzero argument can have
> >  unpredictable effects, including crashing the calling program.  As
> >  a result, calls that are considered unsafe are diagnosed when the
> >  '-Wframe-address' option is in effect.  Such calls should only be
> >  made in debugging situations.
> > 
> > It *does* warn (the warning is in -Wall btw), on both powerpc and
> > aarch64.  Furthermore, using this builtin causes lousy code (it forces
> > the use of a frame pointer, which we normally try very hard to optimise
> > away, for good reason).
> > 
> > And, that warning is not an idle warning.  Non-zero arguments to
> > __builtin_frame_address can crash the program.  It won't on simpler
> > functions, but there is no real definition of what a simpler function
> > *is*.  It is meant for debugging, not for production use (this is also
> > why no one has bothered to make it faster).
> >
> > On Power it should work, but on pretty much any other arch it won't.
> 
> I understand this is true generally, and cannot be relied upon in
> portable code. However as you hint here for Power, I believe that on
> arm64 __builtin_frame_address(1) shouldn't crash the program due to the
> way frame records work on arm64, but I'll go check with some local
> compiler folk. I agree that __builtin_frame_address(2) and beyond
> certainly can, e.g.  by NULL dereference and similar.

I still do not know the aarch64 ABI well enough.  If only I had time!

> For context, why do you think this would work on power specifically? I
> wonder if our rationale is similar.

On most 64-bit Power ABIs all stack frames are connected together as a
linked list (which is updated atomically, importantly).  This makes it
possible to always find all previous stack frames.

> Are you aware of anything in particular that breaks using
> __builtin_frame_address(1) in non-portable code, or is this just a
> general sentiment of this not being a supported use-case?

It is not supported, and trying to do it anyway can crash: it can use
random stack contents as pointer!  Not really "random" of course, but
where it thinks to find a pointer into the previous frame, which is not
something it can rely on (unless the ABI guarantees it somehow).

See gcc.gnu.org/PR60109 for example.

> > > Unless we can get some strong guarantees from compiler folk such that we
> > > can guarantee a specific function acts boundary for unwinding (and
> > > doesn't itself get split, etc), the only reliable way I can think to
> > > solve this requires an assembly trampoline. Whatever we do is liable to
> > > need some invasive rework.
> > 
> > You cannot get such a guarantee, other than not letting the compiler
> > see into the routine at all, like with assembler code (not inline asm,
> > real assembler code).
> 
> If we cannot reliably ensure this then I'm happy to go write an assembly
> trampoline to snapshot the state at a function call boundary (where our
> procedure call standard mandates the state of the LR, FP, and frame
> records pointed to by the FP).

Is the frame pointer required?!

> This'll require reworking a reasonable
> amount of code cross-architecture, so I'll need to get some more
> concrete justification (e.g. examples of things that can go wrong in
> practice).

Say you have a function that does dynamic stack allocation, then there
is usually no way to find the previous stack frame (without function-
specific knowledge).  So __builtin_frame_address cannot work (it knows
nothing about frames further up).

Dynamic stack allocation (alloca, or variable length automatic arrays)
is just the most common and most convenient example; it is not the only
case you have problems here.

> > The real way forward is to bite the bullet and to no longer pretend you
> > can do a full backtrace from just the stack contents.  You cannot.
> 
> I think what you mean here is that there's no reliable way to handle the
> current/leaf function, right? If so I do agree.

No, I meant what I said.

There is the separate issue that you do not know where the return
address (etc.) is stored in a function that has not yet done a call
itself, sure.  You cannot assume anything the ABI does not tell you you
can depend on.

> Beyond that I believe that arm64's frame records should be sufficient.

Do you have a simple linked list connecting all frames?  The aarch64 GCC
port does not define anything special here (DYNAMIC_CHAIN_ADDRESS), so
the default will be used: every frame pointer has to point to the
previous one, no exceptions whatsoever.


Segher


Re: [PATCH] Replace __toc_start + 0x8000 with .TOC.

2021-03-06 Thread Segher Boessenkool
Hi!

On Sat, Mar 06, 2021 at 09:14:33PM -0800, Fangrui Song wrote:
> TOC relocations are like GOT relocations on other architectures.
> However, unlike other architectures, GNU ld's ppc64 port defines .TOC.
> relative to the .got output section instead of the linker synthesized
> .got input section. LLD defines .TOC. as the .got input section plus
> 0x8000. When CONFIG_PPC_OF_BOOT_TRAMPOLINE=y,
> arch/powerpc/kernel/prom_init.o is built, and LLD computed .TOC. can be
> different from __toc_start defined by the linker script.
> 
> Simplify kernel_toc_addr with asm label .TOC. so that we can get rid of
> __toc_start.
> 
> With this change, powernv_defconfig with CONFIG_PPC_OF_BOOT_TRAMPOLINE=y
> is bootable with LLD. There is still an untriaged issue with Alexey's
> configuration.

Do you have any explanation why this *does* work, while the original
doesn't?  Some explanation that says *what* is wrong.  To me it doesn't
look like the kernel script is.


Segher


Re: [PATCH] powerpc: Fix instruction encoding for lis in ppc_function_entry()

2021-03-05 Thread Segher Boessenkool
On Thu, Mar 04, 2021 at 07:34:11AM +0530, Naveen N. Rao wrote:
> 'lis r2,N' is 'addis r2,0,N' and the instruction encoding in the macro
> LIS_R2 is incorrect (it currently maps to 'addis 0,r2,N'). Fix the same.

That is written "addis r0,r2,N" even.  Your patch looks fine otherwise,
so with that fix:

Acked-by: Segher Boessenkool 


Segher


Re: [PATCH v2 1/7] cmdline: Add generic function to build command line.

2021-03-05 Thread Segher Boessenkool
On Fri, Mar 05, 2021 at 01:49:03PM +0100, Christophe Leroy wrote:
> Le 05/03/2021 à 12:58, Michael Ellerman a écrit :
> >prom_init runs as an OF client, with the MMU off (except on some Apple
> >machines), and we don't own the MMU. So there's really nothing we can do :)
> >
> >Though now that I look at it, I don't think we should be doing this
> >level of commandline handling in prom_init. It should just grab the
> >value from firmware and pass it to the kernel proper, and then all the
> >prepend/append/force etc. logic should happen there.
> 
> But then, how do you handle the command line parameters that are needed by 
> prom_init ?
> 
> For instance, prom_init_mem() use 'prom_memory_limit', which comes from the 
> "mem=" option in the command line.

*Reading* it is easy, much easier than modifying it.


Segher


Re: [PATCH v2 1/7] cmdline: Add generic function to build command line.

2021-03-05 Thread Segher Boessenkool
On Fri, Mar 05, 2021 at 10:58:02PM +1100, Michael Ellerman wrote:
> Will Deacon  writes:
> > That's very similar to us; we're not relocated, although we are at least
> > in control of the MMU (which is using a temporary set of page-tables).
> 
> prom_init runs as an OF client, with the MMU off (except on some Apple
> machines), and we don't own the MMU. So there's really nothing we can do :)

You *could* take over all memory mapping.  This is complex, and I
estimate the change you get this to work correctly on all supported
systems to be between -400% and 0%.

And not very long later Linux jettisons OF completely anyway.

> Though now that I look at it, I don't think we should be doing this
> level of commandline handling in prom_init. It should just grab the
> value from firmware and pass it to the kernel proper, and then all the
> prepend/append/force etc. logic should happen there.

That sounds much simpler, yes :-)


Segher


Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-05 Thread Segher Boessenkool
On Fri, Mar 05, 2021 at 07:38:25AM +0100, Christophe Leroy wrote:
> Le 04/03/2021 à 20:24, Segher Boessenkool a écrit :
> https://github.com/linuxppc/linux/commit/a9a3ed1eff36
> 
> >
> >That is much heavier than needed (an mb()).  You can just put an empty
> >inline asm after a call before a return, and that call cannot be
> >optimised to a sibling call: (the end of a function is an implicit
> >return:)

> In the commit mentionned at the top, it is said:
> 
> The next attempt to prevent compilers from tail-call optimizing
> the last function call cpu_startup_entry(), ... , was to add an empty 
> asm("").
> 
> This current solution was short and sweet, and reportedly, is supported
> by both compilers but we didn't get very far this time: future (LTO?)
> optimization passes could potentially eliminate this,

This is simply not true.  A volatile inline asm (like this is, all
asm without outputs are) is always run on the reel machine exactly like
on the abstract machine.  LTO can not eliminate it, not more than any
other optimisation can.  The compiler makes no assumption about the
constents of the template of an asm, empty or not.

If you are really scared the compiler violates the rules of GCC inline
asm and thinks it knows what "" means, you can write
  asm(";#");
(that is a comment on all supported archs).

> which leads us
> to the third attempt: having an actual memory barrier there which the
> compiler cannot ignore or move around etc.

Why would it not be allowed to delete this, and delete some other asm?

And the compiler *can* move around asm like this.  But the point is,
it has to stay in order with other side effects, so there cannot be a
sibling call here, the call has to remain: any call contains a sequence
point, so side effects cannot be reordered over it, so the call (being
before the asm) cannot be transformed to a tail call.


Segher


Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-04 Thread Segher Boessenkool
Hi!

On Thu, Mar 04, 2021 at 02:57:30PM +, Mark Rutland wrote:
> It looks like GCC is happy to give us the function-entry-time FP if we use
> __builtin_frame_address(1),

>From the GCC manual:
 Calling this function with a nonzero argument can have
 unpredictable effects, including crashing the calling program.  As
 a result, calls that are considered unsafe are diagnosed when the
 '-Wframe-address' option is in effect.  Such calls should only be
 made in debugging situations.

It *does* warn (the warning is in -Wall btw), on both powerpc and
aarch64.  Furthermore, using this builtin causes lousy code (it forces
the use of a frame pointer, which we normally try very hard to optimise
away, for good reason).

And, that warning is not an idle warning.  Non-zero arguments to
__builtin_frame_address can crash the program.  It won't on simpler
functions, but there is no real definition of what a simpler function
*is*.  It is meant for debugging, not for production use (this is also
why no one has bothered to make it faster).

On Power it should work, but on pretty much any other arch it won't.

> Unless we can get some strong guarantees from compiler folk such that we
> can guarantee a specific function acts boundary for unwinding (and
> doesn't itself get split, etc), the only reliable way I can think to
> solve this requires an assembly trampoline. Whatever we do is liable to
> need some invasive rework.

You cannot get such a guarantee, other than not letting the compiler
see into the routine at all, like with assembler code (not inline asm,
real assembler code).

The real way forward is to bite the bullet and to no longer pretend you
can do a full backtrace from just the stack contents.  You cannot.


Segher


Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-04 Thread Segher Boessenkool
On Thu, Mar 04, 2021 at 09:54:44AM -0800, Nick Desaulniers wrote:
> On Thu, Mar 4, 2021 at 9:42 AM Marco Elver  wrote:
> include/linux/compiler.h:246:
> prevent_tail_call_optimization
> 
> commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try")

That is much heavier than needed (an mb()).  You can just put an empty
inline asm after a call before a return, and that call cannot be
optimised to a sibling call: (the end of a function is an implicit
return:)

Instead of:

void g(void);
void f(int x)
if (x)
g();
}

Do:

void g(void);
void f(int x)
if (x)
g();
asm("");
}

This costs no extra instructions, and certainly not something as heavy
as an mb()!  It works without the "if" as well, of course, but with it
it is a more interesting example of a tail call.


Segher


Re: [PATCH v2 1/3] powerpc: sstep: Fix load and update emulation

2021-03-04 Thread Segher Boessenkool
On Wed, Mar 03, 2021 at 10:01:27PM +0530, Naveen N. Rao wrote:
> On 2021/03/01 08:37PM, Segher Boessenkool wrote:
> > > And, r6 always ends up with 0xaea. It changes with the value I put into 
> > > r6 though.
> > 
> > That is exactly the behaviour specified for p8.  0aaa+0040=0aea.
> > 
> > > Granted, this is all up in the air, but it does look like there is more 
> > > going on and the value isn't the EA or the value at the address.
> > 
> > That *is* the EA.  The EA is the address the insn does the access at.
> 
> I'm probably missing something here. 0xaaa is the value I stored at an 
> offset of 64 bytes from the stack pointer (r1 is copied into r6). In the 
> ldu instruction above, the EA is 64(r6), which should translate to 
> r1+64.  The data returned by the load would be 0xaaa, which should be 
> discarded per the description you provided above. So, I would expect to 
> see a 0xc0.. address in r6.

Yes, I misread your code it seems.

> In fact, this looks to be the behavior documented for P9:
> 
> > > Power9 does:
> > >
> > >   Load with Update Instructions (RA = 0)
> > > EA is placed into R0.
> > >   Load with Update Instructions (RA = RT)
> > > The storage operand addressed by EA is accessed. The 
> > > displacement
> > > field is added to the data returned by the load and placed into 
> > > RT.

Yup.  So on what cpu did you test?

Either way, the kernel should not emulate any particular cpu here, I'd
say, esp. since recent cpus do different things for this invalid form.


Segher


Re: [RFC PATCH v1] powerpc: Enable KFENCE for PPC32

2021-03-02 Thread Segher Boessenkool
On Tue, Mar 02, 2021 at 10:40:03PM +1100, Michael Ellerman wrote:
> >> -- Change the unwinder, if it's possible for ppc32.
> >
> > I don't think it is possible.
> 
> I think this actually is the solution.
> 
> It seems the good architectures have all added support for
> arch_stack_walk(), and we have not.

I have no idea what arch_stack_walk does, but some background info:

PowerPC functions that do save the LR (== the return address), and/or
that set up a new stack frame, do not do this at the start of the
function necessarily (it is a lot faster to postpone this, even if you
always have to do it).  So, in a leaf function it isn't always known if
this has been done (in all callers further up it is always done, of
course).  If you have DWARF unwind info all is fine of course, but you
do not have that in the kernel.

> So I think it's probably on us to update to that new API. Or at least
> update our save_stack_trace() to fabricate an entry using the NIP, as it
> seems that's what callers expect.

This sounds very expensive?  If it is only a debug feature that won't
be used in production that does not matter, but it worries me.


Segher



Re: [PATCH v2 1/3] powerpc: sstep: Fix load and update emulation

2021-03-01 Thread Segher Boessenkool
Hi!

I didn't see this until now, almost a month later, sorry about that :-)

On Thu, Feb 04, 2021 at 01:57:53PM +0530, Naveen N. Rao wrote:
> On 2021/02/03 03:17PM, Segher Boessenkool wrote:
> > Power8 does:
> > 
> >   Load with Update Instructions (RA = 0)
> > EA is placed into R0.
> >   Load with Update Instructions (RA = RT)
> > EA is placed into RT. The storage operand addressed by EA is
> > accessed, but the data returned by the load is discarded.
> 
> I'm actually not seeing that. This is what I am testing with:
>   li  8,0xaaa
>   mr  6,1
>   std 8,64(6)
>   #ldu6,64(6)
>   .long   0xe8c60041
> 
> And, r6 always ends up with 0xaea. It changes with the value I put into 
> r6 though.

That is exactly the behaviour specified for p8.  0aaa+0040=0aea.

> Granted, this is all up in the air, but it does look like there is more 
> going on and the value isn't the EA or the value at the address.

That *is* the EA.  The EA is the address the insn does the access at.


Segher


Re: [PATCH v1 01/15] powerpc/uaccess: Remove __get_user_allowed() and unsafe_op_wrap()

2021-03-01 Thread Segher Boessenkool
On Tue, Mar 02, 2021 at 09:02:54AM +1100, Daniel Axtens wrote:
> Checkpatch does have one check that is relevant:
> 
> CHECK: Macro argument reuse 'p' - possible side-effects?
> #36: FILE: arch/powerpc/include/asm/uaccess.h:482:
> +#define unsafe_get_user(x, p, e) do {
> \
> + if (unlikely(__get_user_nocheck((x), (p), sizeof(*(p)), false)))\
> + goto e; \
> +} while (0)

sizeof (of something other than a VLA) does not evaluate its operand.
The checkpatch warning is incorrect (well, it does say "possible" --
it just didn't find a possible problem here).

You can write
  bla = sizeof *p++;
and p is *not* incremented.


Segher


Re: [RFC PATCH 8/8] powerpc/64/asm: don't reassign labels

2021-02-25 Thread Segher Boessenkool
On Thu, Feb 25, 2021 at 02:10:06PM +1100, Daniel Axtens wrote:
> The assembler really does not like us reassigning things to the same
> label:
> 
> :7:9: error: invalid reassignment of non-absolute variable 
> 'fs_label'
> 
> This happens across a bunch of platforms:
> https://github.com/ClangBuiltLinux/linux/issues/1043
> https://github.com/ClangBuiltLinux/linux/issues/1008
> https://github.com/ClangBuiltLinux/linux/issues/920
> https://github.com/ClangBuiltLinux/linux/issues/1050
> 
> There is no hope of getting this fixed in LLVM, so if we want to build
> with LLVM_IAS, we need to hack around it ourselves.
> 
> For us the big problem comes from this:
> 
> \#define USE_FIXED_SECTION(sname) \
>   fs_label = start_##sname;   \
>   fs_start = sname##_start;   \
>   use_ftsec sname;
> 
> \#define USE_TEXT_SECTION()
>   fs_label = start_text;  \
>   fs_start = text_start;  \
>   .text
> 
> and in particular fs_label.

The "Setting Symbols" super short chapter reads:

"A symbol can be given an arbitrary value by writing a symbol, followed
by an equals sign '=', followed by an expression.  This is equivalent
to using the '.set' directive."

And ".set" has

"Set the value of SYMBOL to EXPRESSION.  This changes SYMBOL's value and
type to conform to EXPRESSION.  If SYMBOL was flagged as external, it
remains flagged.

You may '.set' a symbol many times in the same assembly provided that
the values given to the symbol are constants.  Values that are based on
expressions involving other symbols are allowed, but some targets may
restrict this to only being done once per assembly.  This is because
those targets do not set the addresses of symbols at assembly time, but
rather delay the assignment until a final link is performed.  This
allows the linker a chance to change the code in the files, changing the
location of, and the relative distance between, various different
symbols.

If you '.set' a global symbol, the value stored in the object file is
the last value stored into it."

So this really should be fixed in clang: it is basic assembler syntax.


Segher


Re: [RFC PATCH 7/8] powerpc/purgatory: drop .machine specifier

2021-02-25 Thread Segher Boessenkool
On Thu, Feb 25, 2021 at 02:10:05PM +1100, Daniel Axtens wrote:
> It's ignored by future versions of llvm's integrated assembler (by not -11).
> I'm not sure what it does for us in gas.

It enables all insns that exist on 620 (the first 64-bit PowerPC CPU).

> --- a/arch/powerpc/purgatory/trampoline_64.S
> +++ b/arch/powerpc/purgatory/trampoline_64.S
> @@ -12,7 +12,7 @@
>  #include 
>  #include 
>  
> - .machine ppc64
> +//upgrade clang, gets ignored.machine ppc64

Why delete it if it is ignored?  Why add a cryptic comment?


Segher


Re: [RFC PATCH 5/8] poweprc/lib/quad: Provide macros for lq/stq

2021-02-25 Thread Segher Boessenkool
On Thu, Feb 25, 2021 at 02:10:03PM +1100, Daniel Axtens wrote:
> +#define PPC_RAW_LQ(t, a, dq) (0xe000 | ___PPC_RT(t) | 
> ___PPC_RA(a) | (((dq) & 0xfff) << 3))

Please keep the operand order the same as for the assembler insns?  So
t,dq,a here.

It should be  ((dq) & 0x0fff) << 4)  .

> +#define PPC_RAW_STQ(t, a, ds)(0xf802 | ___PPC_RT(t) | 
> ___PPC_RA(a) | (((ds) & 0xfff) << 3))

And t,ds,a here.  (But it should use "s" instead of "t" preferably, and
use ___PPC_RS, because it is a source field, not a target).

It should be  ((ds) & 0x3fff) << 2)  as well.


Segher


Re: [RFC PATCH 4/8] powerpc/ppc_asm: use plain numbers for registers

2021-02-25 Thread Segher Boessenkool
On Thu, Feb 25, 2021 at 02:10:02PM +1100, Daniel Axtens wrote:
> This is dumb but makes the llvm integrated assembler happy.
> https://github.com/ClangBuiltLinux/linux/issues/764

> -#define  r0  %r0

> +#define  r0  0

This is a big step back (compare 9a13a524ba37).

If you use a new enough GAS, you can use the -mregnames option and just
say "r0" directly (so not define it at all, or define it to itself).

===
addi 3,3,3
addi r3,r3,3
addi %r3,%r3,3

addi 3,3,3
addi r3,r3,r3
addi %r3,%r3,%r3
===

$ as t.s -o t.o -mregnames
t.s: Assembler messages:
t.s:6: Warning: invalid register expression
t.s:7: Warning: invalid register expression


Many people do not like bare numbers.  It is a bit like not wearing
seatbelts (but so is all assembler code really: you just have to pay
attention).  A better argument is that it is harder to read for people
not used to assembler code like this.

We used to have "#define r0 0" etc., and that was quite problematic.
Like that "addi r3,r3,r3" example, but also, people wrote "r0" where
only a plain 0 is allowed (like in "lwzx r3,0,r3": "r0" would be
misleading there!)


Segher


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Segher Boessenkool
On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote:
> Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :
> >On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote:
> >>powerpc BUG_ON() is based on using twnei or tdnei instruction,
> >>which obliges gcc to format the condition into a 0 or 1 value
> >>in a register.
> >
> >Huh?  Why is that?
> >
> >Will it work better if this used __builtin_trap?  Or does the kernel only
> >detect very specific forms of trap instructions?
> 
> We already made a try with __builtin_trap() 1,5 year ago, see 
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.le...@c-s.fr/
> 
> The main problems encountered are:
> - It is only possible to use it for BUG_ON, not for WARN_ON because GCC 
> considers it as noreturn. Is there any workaround ?

A trap is noreturn by definition:

 -- Built-in Function: void __builtin_trap (void)
 This function causes the program to exit abnormally.

> - The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify 
> the source file and line corresponding to the trap. How can that be done 
> with __builtin_trap() ?

The DWARF debug info should be sufficient.  Perhaps you can post-process
some way?

You can create a trap that falls through yourself (by having a trap-on
condition with a condition that is always true, but make the compiler
not see that).  This isn't efficient though.

Could you file a feature request (in bugzilla)?  It is probably useful
for generic code as well, but we could implement this for powerpc only
if needed.


Segher


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Segher Boessenkool
On Thu, Feb 11, 2021 at 01:26:12PM +0100, Christophe Leroy wrote:
> >What PowerPC cpus implement branch folding?  I know none.
> 
> Extract from powerpc mpc8323 reference manual:

> — Zero-cycle branch capability (branch folding)

Yeah, this is not what is traditionally called branch folding (which
stores the instruction being branched to in some cache, originally the
instruction cache itself; somewhat similar (but different) to the BTIC
on 750).  Overloaded terminology :-)

6xx/7xx CPUs had the branch execution unit in the frontend, and it would
not issue a branch at all if it could be resolved then already.  Power4
and later predict all branches, and most are not issued at all (those
that do need to be executed, like bdnz, are).  At completion time it is
checked if the prediction was correct (and corrective action is taken if
not).


Segher


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Segher Boessenkool
On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote:
> Excerpts from Christophe Leroy's message of February 11, 2021 5:41 pm:
> > As modern powerpc implement branch folding, that's even more efficient.

Ah, it seems you mean what Arm calls branch folding.  Sure, power4
already did that, and this has not changed.

> I think POWER will speculate conditional traps as non faulting always
> so it should be just as good if not better than the branch.

Right, these are not branch instructions, so are not branch predicted;
all trap instructions are assumed to fall through, like all other
non-branch instructions.


Segher


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Segher Boessenkool
On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote:
> It would be nice if we could have a __builtin_trap_if that gcc would use 
> conditional traps with, (and which never assumes following code is 
> unreachable even for constant true, so we can use it with WARN and put 
> explicit unreachable for BUG).

It automatically does that with just __builtin_trap, see my other mail :-)


Segher


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Segher Boessenkool
On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote:
> powerpc BUG_ON() is based on using twnei or tdnei instruction,
> which obliges gcc to format the condition into a 0 or 1 value
> in a register.

Huh?  Why is that?

Will it work better if this used __builtin_trap?  Or does the kernel only
detect very specific forms of trap instructions?

> By using a generic implementation, gcc will generate a branch
> to the unconditional trap generated by BUG().

That is many more instructions than ideal.

> As modern powerpc implement branch folding, that's even more efficient.

What PowerPC cpus implement branch folding?  I know none.

Some example code generated via __builtin_trap:

void trap(void) { __builtin_trap(); }
void trap_if_0(int x) { if (x == 0) __builtin_trap(); }
void trap_if_not_0(int x) { if (x != 0) __builtin_trap(); }

-m64:

trap:
trap
trap_if_0:
tdeqi 3,0
blr
trap_if_not_0:
tdnei 3,0
blr

-m32:

trap:
trap
trap_if_0:
tweqi 3,0
blr
trap_if_not_0:
twnei 3,0
blr


Segher


Re: [PATCH v5 20/22] powerpc/syscall: Avoid storing 'current' in another pointer

2021-02-09 Thread Segher Boessenkool
On Tue, Feb 09, 2021 at 12:36:20PM +1000, Nicholas Piggin wrote:
> What if you did this?

> +static inline struct task_struct *get_current(void)
> +{
> + register struct task_struct *task asm ("r2");
> +
> + return task;
> +}

Local register asm variables are *only* guaranteed to live in that
register as operands to an asm.  See
  
https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables
("The only supported use" etc.)

You can do something like

static inline struct task_struct *get_current(void)
{
register struct task_struct *task asm ("r2");

asm("" : "+r"(task));

return task;
}

which makes sure that "task" actually is in r2 at the point of that asm.


Segher


Re: [PATCH v2 1/3] powerpc: sstep: Fix load and update emulation

2021-02-03 Thread Segher Boessenkool
On Wed, Feb 03, 2021 at 03:19:09PM +0530, Naveen N. Rao wrote:
> On 2021/02/03 12:08PM, Sandipan Das wrote:
> > The Power ISA says that the fixed-point load and update
> > instructions must neither use R0 for the base address (RA)
> > nor have the destination (RT) and the base address (RA) as
> > the same register. In these cases, the instruction is
> > invalid.

> > However, the following behaviour is observed using some
> > invalid opcodes where RA = RT.
> > 
> > An userspace program using an invalid instruction word like
> > 0xe9ce0001, i.e. "ldu r14, 0(r14)", runs and exits without
> > getting terminated abruptly. The instruction performs the
> > load operation but does not write the effective address to
> > the base address register. 
> 
> While the processor (p8 in my test) doesn't seem to be throwing an 
> exception, I don't think it is necessarily loading the value. Qemu 
> throws an exception though. It's probably best to term the behavior as 
> being undefined.

Power8 does:

  Load with Update Instructions (RA = 0)
EA is placed into R0.
  Load with Update Instructions (RA = RT)
EA is placed into RT. The storage operand addressed by EA is
accessed, but the data returned by the load is discarded.

Power9 does:

  Load with Update Instructions (RA = 0)
EA is placed into R0.
  Load with Update Instructions (RA = RT)
The storage operand addressed by EA is accessed. The displacement
field is added to the data returned by the load and placed into RT.

Both UMs also say

  Invalid Forms
In general, the POWER9 core handles invalid forms of instructions in
the manner that is most convenient for the particular case (within
the scope of meeting the boundedly-undefined definition described in
the Power ISA). This document specifies the behavior for these
cases.  However, it is not recommended that software or other system
facilities make use of the POWER9 behavior in these cases because
such behavior might be different in another processor that
implements the Power ISA.

(or POWER8 instead of POWER9 of course).  Always complaining about most
invalid forms seems wise, certainly if not all recent CPUs behave the
same :-)


Segher


Re: [PATCH v4 11/23] powerpc/syscall: Rename syscall_64.c into syscall.c

2021-02-02 Thread Segher Boessenkool
On Tue, Feb 02, 2021 at 04:38:31PM +1000, Nicholas Piggin wrote:
> > So to avoid confusion, I'll rename it. But I think "interrupt" is maybe not 
> > the right name. An 
> > interrupt most of the time refers to IRQ.
> 
> That depends what you mean by interrupt and IRQ.

In the PowerPC architecture, an exception is an abnormal condition, and
that can often cause an interrupt.  What Christophe colloquially calls
an "IRQ" here is called an external exception c.q. external interrupt.

> But if you say irq it's more likely to mean
> a device interrupt, and "interrupt" usually refres to the asynch
> ones.

Power talks about "instruction-caused interrupts", for one aspect of the
difference here; and "precise" / "imprecise" interrupts for another.

> So it's really fine to use the proper arch-specific names for things
> in arch code. I'm trying to slowly change names from exception to
> interrupt.

Thanks :-)

> > For me system call is not an interrupt in the way it 
> > doesn't unexpectedly interrupt a program flow. In powerpc manuals it is 
> > generally called exceptions, 
> > no I'm more inclined to call it exception.c
> 
> Actually that's backwards. Powerpc manuals (at least the one I look at) 
> calls them all interrupts including system calls, and also the system
> call interrupt is actually the only one that doesn't appear to be
> associated with an exception.

Yeah.  You could easily make such an exception, which is set when you
execute a system call instruction, and cleared when the interrupt is
taken, of course; but the architecture doesn't.

> And on the other hand you can deal with exceptions in some cases without
> taking an interrupt at all. For example if you had MSR[EE]=0 you could
> change the decrementer or execute msgclr or change HMER SPR etc to clear
> various exceptions without ever taking the interrupt.

A well-known example is the exception bits in the FPSCR, which do not
cause an interrupt unless the corresponding enable bits are also set.


Segher


Re: [PATCH] selftests/powerpc: Only test lwm/stmw on big endian

2021-01-19 Thread Segher Boessenkool
Hi!

On Tue, Jan 19, 2021 at 03:18:00PM +1100, Michael Ellerman wrote:
> Newer binutils (>= 2.36) refuse to assemble lmw/stmw when building in
> little endian mode. That breaks compilation of our alignment handler
> test:
> 
>   /tmp/cco4l14N.s: Assembler messages:
>   /tmp/cco4l14N.s:1440: Error: `lmw' invalid when little-endian
>   /tmp/cco4l14N.s:1814: Error: `stmw' invalid when little-endian
>   make[2]: *** [../../lib.mk:139: 
> /output/kselftest/powerpc/alignment/alignment_handler] Error 1
> 
> These tests do pass on little endian machines, as the kernel will
> still emulate those instructions even when running little
> endian (which is arguably a kernel bug).

The opposite: in older ISAs it is *required* to.  On all very old ISA
versions, and when not on the Server Environment on everything before
ISA 2.07.

Many older implementations did an alignment interrupt, but that was an
implementation detail (they could still be compliant with proper system
software support, e.g. kernel emulation handlers).  Nowadays that
interrupt is required, so you can still support it like that.

(The patch is fine of course.)


Segher


Re: [PATCH v2] powerpc: Handle .text.{hot, unlikely}.* in linker script

2021-01-16 Thread Segher Boessenkool
Hi!

Very late of course, and the patch is fine, but:

On Mon, Jan 04, 2021 at 01:59:53PM -0700, Nathan Chancellor wrote:
> Commit eff8728fe698 ("vmlinux.lds.h: Add PGO and AutoFDO input
> sections") added ".text.unlikely.*" and ".text.hot.*" due to an LLVM
> change [1].
> 
> After another LLVM change [2], these sections are seen in some PowerPC
> builds, where there is a orphan section warning then build failure:
> 
> $ make -skj"$(nproc)" \
>ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- LLVM=1 O=out \
>distclean powernv_defconfig zImage.epapr
> ld.lld: warning: kernel/built-in.a(panic.o):(.text.unlikely.) is being placed 
> in '.text.unlikely.'

Is the section really called ".text.unlikely.", i.e. the name ending in
a dot?  How very unusual, is there some bug elsewhere?


Segher


Re: [PATCH] powerpc:Don't print raw EIP/LR hex values in dump_stack() and show_regs()

2020-12-22 Thread Segher Boessenkool
On Tue, Dec 22, 2020 at 09:45:03PM +0800, Xiaoming Ni wrote:
> On 2020/12/22 1:12, Segher Boessenkool wrote:
> >On Mon, Dec 21, 2020 at 04:42:23PM +, David Laight wrote:
> >>From: Segher Boessenkool
> >>>Sent: 21 December 2020 16:32
> >>>
> >>>On Mon, Dec 21, 2020 at 04:17:21PM +0100, Christophe Leroy wrote:
> >>>>Le 21/12/2020 à 04:27, Xiaoming Ni a écrit :
> >>>>>Since the commit 2b0e86cc5de6 ("powerpc/fsl_booke/32: implement KASLR
> >>>>>infrastructure"), the powerpc system is ready to support KASLR.
> >>>>>To reduces the risk of invalidating address randomization, don't print 
> >>>>>the
> >>>>>EIP/LR hex values in dump_stack() and show_regs().
> >>>
> >>>>I think your change is not enough to hide EIP address, see below a dump
> >>>>with you patch, you get "Faulting instruction address: 0xc03a0c14"
> >>>
> >>>As far as I can see the patch does nothing to the GPR printout.  Often
> >>>GPRs contain code addresses.  As one example, the LR is moved via a GPR
> >>>(often GPR0, but not always) for storing on the stack.
> >>>
> >>>So this needs more work.
> >>
> >>If the dump_stack() is from an oops you need the real EIP value
> >>on order to stand any chance of making headway.
> >
> >Or at least the function name + offset, yes.
> >
> When the system is healthy, only symbols and offsets are printed,
> Output address and symbol + offset when the system is dying
> Does this meet both debugging and security requirements?

If you have the vmlinux, sym+off is enough to find what instruction
caused the crash.

It does of course not give all the information you get in a crash dump
with all the registers, so it does hinder debugging a bit.  This is a
tradeoff.

Most debugging will need xmon or similar (or printf-style debugging)
anyway; and otoh the register dump will render KASLR largely
ineffective.

> For example:
> 
> +static void __show_regs_ip_lr(const char *flag, unsigned long addr)
> +{
> + if (system_going_down()) { /* panic oops reboot */
> + pr_cont("%s["REG"] %pS", flag, addr, (void *)addr);
> + } else {
> + pr_cont("%s%pS", flag, (void *)addr);
> + }
> +}

*If* you are certain the system goes down immediately, and you are also
certain this information will not help defeat ASLR after a reboot, you
could just print whatever, sure.

Otherwise, you only want to show some very few registers.  Or, make sure
no attackers can ever see these dumps (which is hard, many systems trust
all (local) users with it!)  Which means we first will need some very
different patches, before any of this can be much useful :-(


Segher


Re: [PATCH] powerpc:Don't print raw EIP/LR hex values in dump_stack() and show_regs()

2020-12-21 Thread Segher Boessenkool
On Mon, Dec 21, 2020 at 04:42:23PM +, David Laight wrote:
> From: Segher Boessenkool
> > Sent: 21 December 2020 16:32
> > 
> > On Mon, Dec 21, 2020 at 04:17:21PM +0100, Christophe Leroy wrote:
> > > Le 21/12/2020 à 04:27, Xiaoming Ni a écrit :
> > > >Since the commit 2b0e86cc5de6 ("powerpc/fsl_booke/32: implement KASLR
> > > >infrastructure"), the powerpc system is ready to support KASLR.
> > > >To reduces the risk of invalidating address randomization, don't print 
> > > >the
> > > >EIP/LR hex values in dump_stack() and show_regs().
> > 
> > > I think your change is not enough to hide EIP address, see below a dump
> > > with you patch, you get "Faulting instruction address: 0xc03a0c14"
> > 
> > As far as I can see the patch does nothing to the GPR printout.  Often
> > GPRs contain code addresses.  As one example, the LR is moved via a GPR
> > (often GPR0, but not always) for storing on the stack.
> > 
> > So this needs more work.
> 
> If the dump_stack() is from an oops you need the real EIP value
> on order to stand any chance of making headway.

Or at least the function name + offset, yes.

> Otherwise you might just as well just print 'borked - tough luck'.

Yes.  ASLR is a house of cards.  But that isn't constructive wrt this
patch :-)


Segher


Re: [PATCH] powerpc:Don't print raw EIP/LR hex values in dump_stack() and show_regs()

2020-12-21 Thread Segher Boessenkool
On Mon, Dec 21, 2020 at 04:17:21PM +0100, Christophe Leroy wrote:
> Le 21/12/2020 à 04:27, Xiaoming Ni a écrit :
> >Since the commit 2b0e86cc5de6 ("powerpc/fsl_booke/32: implement KASLR
> >infrastructure"), the powerpc system is ready to support KASLR.
> >To reduces the risk of invalidating address randomization, don't print the
> >EIP/LR hex values in dump_stack() and show_regs().

> I think your change is not enough to hide EIP address, see below a dump 
> with you patch, you get "Faulting instruction address: 0xc03a0c14"

As far as I can see the patch does nothing to the GPR printout.  Often
GPRs contain code addresses.  As one example, the LR is moved via a GPR
(often GPR0, but not always) for storing on the stack.

So this needs more work.


Segher


  1   2   3   4   5   6   7   8   9   10   >