Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Dave Martin
On Fri, Apr 13, 2018 at 07:50:17PM +0100, Russell King - ARM Linux wrote:
> On Fri, Apr 13, 2018 at 07:35:38PM +0100, Dave Martin wrote:
> > If that's the case though, I don't see how a userspace testsuite is
> > hitting this code path.  Maybe I've misunderstood the context of this
> > thread.
> 
> It isn't hitting this exact case.
> 
> The userspace testsuite is hitting an entirely different case:
> 
>   kill(getpid(), SIGFPE);
> 
> As one expects, this generates a SIGFPE to the current process, which
> then inspects the siginfo structure.  Being a userspace generated
> signal, si_code is set to SI_USER, which has the value 0.
> 
> With FPE_FIXME defined to zero, as Eric has done:
> 
> enum siginfo_layout siginfo_layout(int sig, int si_code)
> {
> enum siginfo_layout layout = SIL_KILL;
> if ((si_code > SI_USER) && (si_code < SI_KERNEL)) {
> ...
> } else {
> ...
> #ifdef FPE_FIXME
> if ((sig == SIGFPE) && (si_code == FPE_FIXME))
> layout = SIL_FAULT;
> #endif
> }
> return layout;
> }
> 
> This causes siginfo_layout() to return SIL_FAULT for this userspace
> generated signal, rather than the correct SIL_KILL.
> 
> This affects which fields we copy to userspace.
> 
> SI_USER is defined to pass si_pid and si_uid to the userspace process,
> which on ARM are the first two consecutive 32-bit quantities in the
> union, which is done when siginfo_layout() returns SIL_KILL.  However,
> when SIL_FAULT is returned, we only copy si_addr in the union, which
> on ARM is just one 32-bit quantity.
> 
> Consequently, userspace sees a correct value for si_pid, and si_uid
> remains set to whatever was there in userspace.  In the case of the
> strace program, that's zero.  This means if you run the strace
> testsuite as root, the problem doesn't appear, but if you run it as
> a non-root user, it will.
> 
> So, the testsuite case has little to do with the behaviour of the VFP
> handling - it's to do with the behaviour of the kernel's signal handling.

Oh, right.  So, going back to the unhandled VFP bounce question,
is it reasonable for that to be a SIGKILL?  That avoids the question
of what si_code userspace should see, because userspace doesn't get
to see siginfo at all in that case: it's dead.

Or do we hit this in real situations that we want userspace to bail out
of more gracefully?

Cheers
---Dave


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Dave Martin
On Fri, Apr 13, 2018 at 11:23:36AM -0700, Linus Torvalds wrote:
> On Fri, Apr 13, 2018 at 10:54 AM, Russell King - ARM Linux
>  wrote:
> >
> > FPE_FLTINV means "floating point invalid operation".  Does it really
> > cover the case where hardware has failed, or is it intended to cover
> > the case where userspace did something wrong and asked for an invalid
> > operation from the FP hardware?
> 
> Note that the number of people who actually look at the si_code is
> approximately zero.
> 
> But the ones that _do_ check the si_code are certainly not going to
> check it against a new code that they don't know about.
> 
> I suspect that if you start searching for FLT_xyz occurrences in code,
> approximately 100% of them are from the kernel code that generates
> them, not from any actual users.
> 
> So I'd be very surprised if you can find *anybody* who cares about
> that exact value (with the possible exceptions of test-suites).
> 
> Sadly, google code-search is no more. It was useful for things like that.

I've found https://codesearch.debian.net/ useful for digging into this
kind of question, though it tends to throw up a lot of false positives.

Most uses I've seen do nothing more than use the FPE_xyz value to
format diagnostic messages while dying.  I struggled to find code that
made a meaningful functional decision based on the value, though that's
not proof...

Cheers
---Dave


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Dave Martin
On Fri, Apr 13, 2018 at 06:54:08PM +0100, Russell King - ARM Linux wrote:
> On Fri, Apr 13, 2018 at 06:08:28PM +0100, Dave Martin wrote:
> > On Fri, Apr 13, 2018 at 09:33:17AM -0700, Linus Torvalds wrote:
> > > On Fri, Apr 13, 2018 at 2:42 AM, Russell King - ARM Linux
> > >  wrote:
> > > >
> > > > Yes, it does solve the problem at hand with strace - the exact patch I
> > > > tested against 4.16 is below.
> > > 
> > > Ok, good.
> > > 
> > > > However, FPE_FLTUNK is not defined in older kernels, so while we can
> > > > fix it this way for the current merge window, that doesn't help 4.16.
> > > 
> > > I wonder if we should even bother with FPE_FLTUNK.
> > > 
> > > I suspect we might as well use FPE_FLTINV, I suspect, and not have
> > > this complexity at all. That case is not worth worrying about, since
> > > it's a "this shouldn't happen anyway" and the *real* reason will be in
> > > the kernel logs due to vfs_panic().
> > > 
> > > So it's not like this is something that the user should ever care
> > > about the si_code about.
> > 
> > Ack, my intended meaning for FPE_FLTUNK is that the fp exception is
> > either spurious or we can't tell easily (or possibly at all) which
> > FPE_XXX should be returned.  It's up to userspace to figure it out
> > if it really cares.  Previously we were accidentally returning SI_USER
> > in si_code for arm64.
> > 
> > This case on arm looks like a more serious error for which FPE_FLTINV
> > may be more appropriate anyway.
> 
> No.  The cases where we get to this point are:
> 
> 1. A trap concerning a coprocessor register transfer instruction (iow, move
>between a VFP register and ARM register.)
> 2. A trap concerning a coprocessor register load or save instruction.
> 
> (In both of these, "concerning" means that the VFP hardware provides
> such an instruction as the reason for the fault, *not* that it is the
> faulting instruction.)
> 
> 3. A combination of the exception bits (EX and DEX) on certain VFP
>implementations.
> 
> All of these can be summarised as "the hardware went wrong in some way"
> rather than "the user program did something wrong."

Although my understanding of VFP bounces is a bit hazy, I think this is
broadly in line with my assumptions.

> FPE_FLTINV means "floating point invalid operation".  Does it really
> cover the case where hardware has failed, or is it intended to cover
> the case where userspace did something wrong and asked for an invalid
> operation from the FP hardware?

So, there's an argument that FPE_FLTINV is not really correct.  My
rationale was that there is nothing correct that we can return, and
FPE_FLTINV may be no worse than the alternatives.

If we can only hit this case as the result of a hardware failure
or kernel bug though, should this be delivered as SIGKILL instead?

That's the approach I eventually followed for various exceptions
on arm64 that were theoretically delivered to userspace with si_code==0,
but really should be impossible unless and kernel and/or hardware
is buggy.

If that's the case though, I don't see how a userspace testsuite is
hitting this code path.  Maybe I've misunderstood the context of this
thread.

Cheers
---Dave


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Dave Martin
On Fri, Apr 13, 2018 at 09:33:17AM -0700, Linus Torvalds wrote:
> On Fri, Apr 13, 2018 at 2:42 AM, Russell King - ARM Linux
>  wrote:
> >
> > Yes, it does solve the problem at hand with strace - the exact patch I
> > tested against 4.16 is below.
> 
> Ok, good.
> 
> > However, FPE_FLTUNK is not defined in older kernels, so while we can
> > fix it this way for the current merge window, that doesn't help 4.16.
> 
> I wonder if we should even bother with FPE_FLTUNK.
> 
> I suspect we might as well use FPE_FLTINV, I suspect, and not have
> this complexity at all. That case is not worth worrying about, since
> it's a "this shouldn't happen anyway" and the *real* reason will be in
> the kernel logs due to vfs_panic().
> 
> So it's not like this is something that the user should ever care
> about the si_code about.

Ack, my intended meaning for FPE_FLTUNK is that the fp exception is
either spurious or we can't tell easily (or possibly at all) which
FPE_XXX should be returned.  It's up to userspace to figure it out
if it really cares.  Previously we were accidentally returning SI_USER
in si_code for arm64.

This case on arm looks like a more serious error for which FPE_FLTINV
may be more appropriate anyway.

[...]

Cheers
---Dave


Re: [PATCH 00/32] docs/vm: convert to ReST format

2018-04-13 Thread Matthew Wilcox
On Fri, Apr 13, 2018 at 01:55:51PM -0600, Jonathan Corbet wrote:
> > I believe that keeping the mm docs together will give better visibility of
> > what (little) mm documentation we have and will make the updates easier.
> > The documents that fit well into a certain topic could be linked there. For
> > instance:
> 
> ...but this sounds like just the opposite...?  
> 
> I've had this conversation with folks in a number of subsystems.
> Everybody wants to keep their documentation together in one place - it's
> easier for the developers after all.  But for the readers I think it's
> objectively worse.  It perpetuates the mess that Documentation/ is, and
> forces readers to go digging through all kinds of inappropriate material
> in the hope of finding something that tells them what they need to know.
> 
> So I would *really* like to split the documentation by audience, as has
> been done for a number of other kernel subsystems (and eventually all, I
> hope).
> 
> I can go ahead and apply the RST conversion, that seems like a step in
> the right direction regardless.  But I sure hope we don't really have to
> keep it as an unorganized jumble of stuff...

I've started on Documentation/core-api/memory.rst which covers just
memory allocation.  So far it has the Overview and GFP flags sections
written and an outline for 'The slab allocator', 'The page allocator',
'The vmalloc allocator' and 'The page_frag allocator'.  And typing this
up, I realise we need a 'The percpu allocator'.  I'm thinking that this
is *not* the right document for the DMA memory allocators (although it
should link to that documentation).

I suspect the existing Documentation/vm/ should probably stay as an
unorganised jumble of stuff.  Developers mostly talking to other MM
developers.  Stuff that people outside the MM fraternity should know
about needs to be centrally documented.  By all means convert it to
ReST ... I don't much care, and it may make it easier to steal bits
or link to it from the organised documentation.


Re: [PATCH 00/32] docs/vm: convert to ReST format

2018-04-13 Thread Jonathan Corbet
Sorry for the silence, I'm pedaling as fast as I can, honest...

On Sun, 1 Apr 2018 09:38:58 +0300
Mike Rapoport  wrote:

> My thinking was to start with mechanical RST conversion and then to start
> working on the contents and ordering of the documentation. Some of the
> existing files, e.g. ksm.txt, can be moved as is into the appropriate
> places, others, like transhuge.txt should be at least split into admin/user
> and developer guides.
> 
> Another problem with many of the existing mm docs is that they are rather
> developer notes and it wouldn't be really straight forward to assign them
> to a particular topic.

All this sounds good.

> I believe that keeping the mm docs together will give better visibility of
> what (little) mm documentation we have and will make the updates easier.
> The documents that fit well into a certain topic could be linked there. For
> instance:

...but this sounds like just the opposite...?  

I've had this conversation with folks in a number of subsystems.
Everybody wants to keep their documentation together in one place - it's
easier for the developers after all.  But for the readers I think it's
objectively worse.  It perpetuates the mess that Documentation/ is, and
forces readers to go digging through all kinds of inappropriate material
in the hope of finding something that tells them what they need to know.

So I would *really* like to split the documentation by audience, as has
been done for a number of other kernel subsystems (and eventually all, I
hope).

I can go ahead and apply the RST conversion, that seems like a step in
the right direction regardless.  But I sure hope we don't really have to
keep it as an unorganized jumble of stuff...

Thanks,

jon


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Linus Torvalds
On Fri, Apr 13, 2018 at 11:45 AM, Dave Martin  wrote:
>
> Most uses I've seen do nothing more than use the FPE_xyz value to
> format diagnostic messages while dying.  I struggled to find code that
> made a meaningful functional decision based on the value, though that's
> not proof...

Yeah. I've seen code that cares about SIGFPE deeply, but it's almost
invariably about some emulated environment (eg Java VM, or CPU
emulation).

And the siginfo data is basically never good enough for those
environments anyway on its own, so they will go and look at the actual
instruction that caused the fault and the register state instead,
because they need *all* the information.

The cases that use si_code are the ones that just trapped signals in
order to give a more helpful abort message.

So I could certainly imagine that si_code is actually used by somebody
who then decides to actuall act differently on it, but aside from
perhaps printing out a different message, it sounds far-fetched.

Linus


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Russell King - ARM Linux
On Fri, Apr 13, 2018 at 07:35:38PM +0100, Dave Martin wrote:
> If that's the case though, I don't see how a userspace testsuite is
> hitting this code path.  Maybe I've misunderstood the context of this
> thread.

It isn't hitting this exact case.

The userspace testsuite is hitting an entirely different case:

kill(getpid(), SIGFPE);

As one expects, this generates a SIGFPE to the current process, which
then inspects the siginfo structure.  Being a userspace generated
signal, si_code is set to SI_USER, which has the value 0.

With FPE_FIXME defined to zero, as Eric has done:

enum siginfo_layout siginfo_layout(int sig, int si_code)
{
enum siginfo_layout layout = SIL_KILL;
if ((si_code > SI_USER) && (si_code < SI_KERNEL)) {
...
} else {
...
#ifdef FPE_FIXME
if ((sig == SIGFPE) && (si_code == FPE_FIXME))
layout = SIL_FAULT;
#endif
}
return layout;
}

This causes siginfo_layout() to return SIL_FAULT for this userspace
generated signal, rather than the correct SIL_KILL.

This affects which fields we copy to userspace.

SI_USER is defined to pass si_pid and si_uid to the userspace process,
which on ARM are the first two consecutive 32-bit quantities in the
union, which is done when siginfo_layout() returns SIL_KILL.  However,
when SIL_FAULT is returned, we only copy si_addr in the union, which
on ARM is just one 32-bit quantity.

Consequently, userspace sees a correct value for si_pid, and si_uid
remains set to whatever was there in userspace.  In the case of the
strace program, that's zero.  This means if you run the strace
testsuite as root, the problem doesn't appear, but if you run it as
a non-root user, it will.

So, the testsuite case has little to do with the behaviour of the VFP
handling - it's to do with the behaviour of the kernel's signal handling.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


[PATCH v2] powerpc/sparse: fix plain integer as NULL pointer warning

2018-04-13 Thread Mathieu Malaterre
Trivial fix to remove the following sparse warnings:

  arch/powerpc/kernel/module_32.c:112:74: warning: Using plain integer as NULL 
pointer
  arch/powerpc/kernel/module_32.c:117:74: warning: Using plain integer as NULL 
pointer
  drivers/macintosh/via-pmu.c:1155:28: warning: Using plain integer as NULL 
pointer
  drivers/macintosh/via-pmu.c:1230:20: warning: Using plain integer as NULL 
pointer
  drivers/macintosh/via-pmu.c:1385:36: warning: Using plain integer as NULL 
pointer
  drivers/macintosh/via-pmu.c:1752:23: warning: Using plain integer as NULL 
pointer
  drivers/macintosh/via-pmu.c:2084:19: warning: Using plain integer as NULL 
pointer
  drivers/macintosh/via-pmu.c:2110:32: warning: Using plain integer as NULL 
pointer
  drivers/macintosh/via-pmu.c:2167:19: warning: Using plain integer as NULL 
pointer
  drivers/macintosh/via-pmu.c:2183:19: warning: Using plain integer as NULL 
pointer
  drivers/macintosh/via-pmu.c:277:20: warning: Using plain integer as NULL 
pointer
  arch/powerpc/platforms/powermac/setup.c:155:67: warning: Using plain integer 
as NULL pointer
  arch/powerpc/platforms/powermac/setup.c:247:27: warning: Using plain integer 
as NULL pointer
  arch/powerpc/platforms/powermac/setup.c:249:27: warning: Using plain integer 
as NULL pointer
  arch/powerpc/platforms/powermac/setup.c:252:37: warning: Using plain integer 
as NULL pointer
  arch/powerpc/mm/tlb_hash32.c:127:21: warning: Using plain integer as NULL 
pointer
  arch/powerpc/mm/tlb_hash32.c:148:21: warning: Using plain integer as NULL 
pointer
  arch/powerpc/mm/tlb_hash32.c:44:21: warning: Using plain integer as NULL 
pointer
  arch/powerpc/mm/tlb_hash32.c:57:21: warning: Using plain integer as NULL 
pointer
  arch/powerpc/mm/tlb_hash32.c:87:21: warning: Using plain integer as NULL 
pointer
  arch/powerpc/kernel/btext.c:160:31: warning: Using plain integer as NULL 
pointer
  arch/powerpc/kernel/btext.c:167:22: warning: Using plain integer as NULL 
pointer
  arch/powerpc/kernel/btext.c:274:21: warning: Using plain integer as NULL 
pointer
  arch/powerpc/kernel/btext.c:285:31: warning: Using plain integer as NULL 
pointer
  arch/powerpc/include/asm/hugetlb.h:204:16: warning: Using plain integer as 
NULL pointer
  arch/powerpc/mm/ppc_mmu_32.c:170:21: warning: Using plain integer as NULL 
pointer
  arch/powerpc/platforms/powermac/pci.c:1227:23: warning: Using plain integer 
as NULL pointer
  arch/powerpc/platforms/powermac/pci.c:65:24: warning: Using plain integer as 
NULL pointer

Also use `--fix` command line option from `script/checkpatch --strict` to
remove the following:

  CHECK: Comparison to NULL could be written "!dispDeviceBase"
  #72: FILE: arch/powerpc/kernel/btext.c:160:
  + if (dispDeviceBase == NULL)

  CHECK: Comparison to NULL could be written "!vbase"
  #80: FILE: arch/powerpc/kernel/btext.c:167:
  + if (vbase == NULL)

  CHECK: Comparison to NULL could be written "!base"
  #89: FILE: arch/powerpc/kernel/btext.c:274:
  + if (base == NULL)

  CHECK: Comparison to NULL could be written "!dispDeviceBase"
  #98: FILE: arch/powerpc/kernel/btext.c:285:
  + if (dispDeviceBase == NULL)

  CHECK: Comparison to NULL could be written "strstr"
  #117: FILE: arch/powerpc/kernel/module_32.c:117:
  + if (strstr(secstrings + sechdrs[i].sh_name, ".debug") != NULL)

  CHECK: Comparison to NULL could be written "!Hash"
  #130: FILE: arch/powerpc/mm/ppc_mmu_32.c:170:
  + if (Hash == NULL)

  CHECK: Comparison to NULL could be written "Hash"
  #143: FILE: arch/powerpc/mm/tlb_hash32.c:44:
  + if (Hash != NULL) {

  CHECK: Comparison to NULL could be written "!Hash"
  #152: FILE: arch/powerpc/mm/tlb_hash32.c:57:
  + if (Hash == NULL) {

  CHECK: Comparison to NULL could be written "!Hash"
  #161: FILE: arch/powerpc/mm/tlb_hash32.c:87:
  + if (Hash == NULL) {

  CHECK: Comparison to NULL could be written "!Hash"
  #170: FILE: arch/powerpc/mm/tlb_hash32.c:127:
  + if (Hash == NULL) {

  CHECK: Comparison to NULL could be written "!Hash"
  #179: FILE: arch/powerpc/mm/tlb_hash32.c:148:
  + if (Hash == NULL) {

  ERROR: space required after that ';' (ctx:VxV)
  #192: FILE: arch/powerpc/platforms/powermac/pci.c:65:
  + for (; node != NULL;node = node->sibling) {

  CHECK: Comparison to NULL could be written "node"
  #192: FILE: arch/powerpc/platforms/powermac/pci.c:65:
  + for (; node != NULL;node = node->sibling) {

  CHECK: Comparison to NULL could be written "!region"
  #201: FILE: arch/powerpc/platforms/powermac/pci.c:1227:
  + if (region == NULL)

  CHECK: Comparison to NULL could be written "of_get_property"
  #214: FILE: arch/powerpc/platforms/powermac/setup.c:155:
  + if (of_get_property(np, "cache-unified", NULL) != NULL && dc) {

  CHECK: Comparison to NULL could be written "!np"
  #223: FILE: arch/powerpc/platforms/powermac/setup.c:247:
  + if (np == NULL)

  CHECK: Comparison to NULL could be written "np"
  #226: FILE: 

Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Linus Torvalds
On Fri, Apr 13, 2018 at 10:54 AM, Russell King - ARM Linux
 wrote:
>
> FPE_FLTINV means "floating point invalid operation".  Does it really
> cover the case where hardware has failed, or is it intended to cover
> the case where userspace did something wrong and asked for an invalid
> operation from the FP hardware?

Note that the number of people who actually look at the si_code is
approximately zero.

But the ones that _do_ check the si_code are certainly not going to
check it against a new code that they don't know about.

I suspect that if you start searching for FLT_xyz occurrences in code,
approximately 100% of them are from the kernel code that generates
them, not from any actual users.

So I'd be very surprised if you can find *anybody* who cares about
that exact value (with the possible exceptions of test-suites).

Sadly, google code-search is no more. It was useful for things like that.

Linus


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Russell King - ARM Linux
On Fri, Apr 13, 2018 at 06:08:28PM +0100, Dave Martin wrote:
> On Fri, Apr 13, 2018 at 09:33:17AM -0700, Linus Torvalds wrote:
> > On Fri, Apr 13, 2018 at 2:42 AM, Russell King - ARM Linux
> >  wrote:
> > >
> > > Yes, it does solve the problem at hand with strace - the exact patch I
> > > tested against 4.16 is below.
> > 
> > Ok, good.
> > 
> > > However, FPE_FLTUNK is not defined in older kernels, so while we can
> > > fix it this way for the current merge window, that doesn't help 4.16.
> > 
> > I wonder if we should even bother with FPE_FLTUNK.
> > 
> > I suspect we might as well use FPE_FLTINV, I suspect, and not have
> > this complexity at all. That case is not worth worrying about, since
> > it's a "this shouldn't happen anyway" and the *real* reason will be in
> > the kernel logs due to vfs_panic().
> > 
> > So it's not like this is something that the user should ever care
> > about the si_code about.
> 
> Ack, my intended meaning for FPE_FLTUNK is that the fp exception is
> either spurious or we can't tell easily (or possibly at all) which
> FPE_XXX should be returned.  It's up to userspace to figure it out
> if it really cares.  Previously we were accidentally returning SI_USER
> in si_code for arm64.
> 
> This case on arm looks like a more serious error for which FPE_FLTINV
> may be more appropriate anyway.

No.  The cases where we get to this point are:

1. A trap concerning a coprocessor register transfer instruction (iow, move
   between a VFP register and ARM register.)
2. A trap concerning a coprocessor register load or save instruction.

(In both of these, "concerning" means that the VFP hardware provides
such an instruction as the reason for the fault, *not* that it is the
faulting instruction.)

3. A combination of the exception bits (EX and DEX) on certain VFP
   implementations.

All of these can be summarised as "the hardware went wrong in some way"
rather than "the user program did something wrong."

FPE_FLTINV means "floating point invalid operation".  Does it really
cover the case where hardware has failed, or is it intended to cover
the case where userspace did something wrong and asked for an invalid
operation from the FP hardware?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Linus Torvalds
On Fri, Apr 13, 2018 at 2:42 AM, Russell King - ARM Linux
 wrote:
>
> Yes, it does solve the problem at hand with strace - the exact patch I
> tested against 4.16 is below.

Ok, good.

> However, FPE_FLTUNK is not defined in older kernels, so while we can
> fix it this way for the current merge window, that doesn't help 4.16.

I wonder if we should even bother with FPE_FLTUNK.

I suspect we might as well use FPE_FLTINV, I suspect, and not have
this complexity at all. That case is not worth worrying about, since
it's a "this shouldn't happen anyway" and the *real* reason will be in
the kernel logs due to vfs_panic().

So it's not like this is something that the user should ever care
about the si_code about.

> Given that the path we're talking about is unlikely to happen (as
> mentioned in my second paragraph) I still think reverting Eric's patch
> is the right way forward for older kernels.

I'd much rather get rid of that FPE_FIXME, and leave that whole mess behind.

So the attached patch seems simple and should work with 4.16 too.

Let's not leave this as some kind of nasty maintenance issue, and just
go for simple and stupid.

Hmm?

Linus
 arch/arm/include/uapi/asm/siginfo.h | 13 -
 arch/arm/vfp/vfpmodule.c|  2 +-
 2 files changed, 1 insertion(+), 14 deletions(-)

diff --git a/arch/arm/include/uapi/asm/siginfo.h b/arch/arm/include/uapi/asm/siginfo.h
deleted file mode 100644
index d0513880be21..
--- a/arch/arm/include/uapi/asm/siginfo.h
+++ /dev/null
@@ -1,13 +0,0 @@
-#ifndef __ASM_SIGINFO_H
-#define __ASM_SIGINFO_H
-
-#include 
-
-/*
- * SIGFPE si_codes
- */
-#ifdef __KERNEL__
-#define FPE_FIXME	0	/* Broken dup of SI_USER */
-#endif /* __KERNEL__ */
-
-#endif
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 4c375e11ae95..af4ee2cef2f9 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -257,7 +257,7 @@ static void vfp_raise_exceptions(u32 exceptions, u32 inst, u32 fpscr, struct pt_
 
 	if (exceptions == VFP_EXCEPTION_ERROR) {
 		vfp_panic("unhandled bounce", inst);
-		vfp_raise_sigfpe(FPE_FIXME, regs);
+		vfp_raise_sigfpe(FPE_FLTINV, regs);
 		return;
 	}
 


Re: [PATCH v9 00/24] Speculative page faults

2018-04-13 Thread Laurent Dufour
On 14/03/2018 14:11, Michal Hocko wrote:
> On Tue 13-03-18 18:59:30, Laurent Dufour wrote:
>> Changes since v8:
>>  - Don't check PMD when locking the pte when THP is disabled
>>Thanks to Daniel Jordan for reporting this.
>>  - Rebase on 4.16
> 
> Is this really worth reposting the whole pile? I mean this is at v9,
> each doing little changes. It is quite tiresome to barely get to a
> bookmarked version just to find out that there are 2 new versions out.
> 
> I am sorry to be grumpy and I can understand some frustration it doesn't
> move forward that easilly but this is a _big_ change. We should start
> with a real high level review rather than doing small changes here and
> there and reach v20 quickly.

I know this would mean v10, but there has been a bunch of reviews from David
Rientjes and Jerome Glisse, and I had to make many changes to address them.
So I think this is time to push a v10.

If you have already started a review of this v9 series, please send me your
remarks so that I can compile them in this v10 asap.

Thanks,
Laurent.



[PATCH v2] cxl: Set the PBCQ Tunnel BAR register when enabling capi mode

2018-04-13 Thread Philippe Bergheaud
Skiboot used to set the default Tunnel BAR register value when capi mode
was enabled. This approach was ok for the cxl driver, but prevented other
drivers from choosing different values.

Skiboot versions > 5.11 will not set the default value any longer. This
patch modifies the cxl driver to set/reset the Tunnel BAR register when
entering/exiting the cxl mode, with pnv_pci_set_tunnel_bar().

Signed-off-by: Philippe Bergheaud 
---
Changelog:

v2: Restrict tunnel bar setting to power9.
Do not fail cxl_configure_adapter() on tunnel bar setting error.
Log an info message instead, and continue configuring capi mode.
---
 drivers/misc/cxl/pci.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 83f1d08058fc..355c789406f7 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1742,6 +1742,10 @@ static int cxl_configure_adapter(struct cxl *adapter, 
struct pci_dev *dev)
/* Required for devices using CAPP DMA mode, harmless for others */
pci_set_master(dev);
 
+   if (cxl_is_power9())
+   if (pnv_pci_set_tunnel_bar(dev, 0x0002E000ull, 1))
+   dev_info(>dev, "Tunneled operations 
unsupported\n");
+
if ((rc = pnv_phb_to_cxl_mode(dev, adapter->native->sl_ops->capi_mode)))
goto err;
 
@@ -1768,6 +1772,8 @@ static void cxl_deconfigure_adapter(struct cxl *adapter)
 {
struct pci_dev *pdev = to_pci_dev(adapter->dev.parent);
 
+   if (cxl_is_power9())
+   pnv_pci_set_tunnel_bar(pdev, 0x0002E000ull, 0);
cxl_native_release_psl_err_irq(adapter);
cxl_unmap_adapter_regs(adapter);
 
-- 
2.16.2



Re: [PATCH 2/2] powernv/npu: Prevent overwriting of pnv_npu2_init_contex() callback parameters

2018-04-13 Thread Balbir Singh
On Fri, Apr 13, 2018 at 12:02 PM, Mark Hairgrove  wrote:
>
>
> On Wed, 11 Apr 2018, Alistair Popple wrote:
>
>> There is a single npu context per set of callback parameters. Callers
>> should be prevented from overwriting existing callback values so instead
>> return an error if different parameters are passed.
>>
>> Signed-off-by: Alistair Popple 
>> ---
>>  arch/powerpc/include/asm/powernv.h   |  2 +-
>>  arch/powerpc/platforms/powernv/npu-dma.c | 16 +---
>>  2 files changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/powernv.h 
>> b/arch/powerpc/include/asm/powernv.h
>> index dc5f6a5d4575..362ea12a4501 100644
>> --- a/arch/powerpc/include/asm/powernv.h
>> +++ b/arch/powerpc/include/asm/powernv.h
>> @@ -15,7 +15,7 @@
>>  extern void powernv_set_nmmu_ptcr(unsigned long ptcr);
>>  extern struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
>>   unsigned long flags,
>> - struct npu_context *(*cb)(struct npu_context *, void 
>> *),
>> + void (*cb)(struct npu_context *, void *),
>>   void *priv);
>>  extern void pnv_npu2_destroy_context(struct npu_context *context,
>>   struct pci_dev *gpdev);
>> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c 
>> b/arch/powerpc/platforms/powernv/npu-dma.c
>> index cb77162f4e7a..193f43ea3fbc 100644
>> --- a/arch/powerpc/platforms/powernv/npu-dma.c
>> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
>> @@ -407,7 +407,7 @@ struct npu_context {
>>   bool nmmu_flush;
>>
>>   /* Callback to stop translation requests on a given GPU */
>> - struct npu_context *(*release_cb)(struct npu_context *, void *);
>> + void (*release_cb)(struct npu_context *context, void *priv);
>>
>>   /*
>>* Private pointer passed to the above callback for usage by
>> @@ -705,7 +705,7 @@ static const struct mmu_notifier_ops 
>> nv_nmmu_notifier_ops = {
>>   */
>>  struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
>>   unsigned long flags,
>> - struct npu_context *(*cb)(struct npu_context *, void 
>> *),
>> + void (*cb)(struct npu_context *, void *),
>>   void *priv)
>>  {
>>   int rc;
>> @@ -763,8 +763,18 @@ struct npu_context *pnv_npu2_init_context(struct 
>> pci_dev *gpdev,
>>*/
>>   spin_lock(_context_lock);
>>   npu_context = mm->context.npu_context;
>> - if (npu_context)
>> + if (npu_context) {
>> + if (npu_context->release_cb != cb ||
>> + npu_context->priv != priv) {
>> + spin_unlock(_context_lock);
>> + opal_npu_destroy_context(nphb->opal_id, mm->context.id,
>> + PCI_DEVID(gpdev->bus->number,
>> + gpdev->devfn));
>> + return ERR_PTR(-EINVAL);
>> + }
>> +
>>   WARN_ON(!kref_get_unless_zero(_context->kref));
>> + }
>>   spin_unlock(_context_lock);
>>
>>   if (!npu_context) {
>> --
>> 2.11.0
>>
>>
>
> Reviewed-by: Mark Hairgrove 
> Tested-by: Mark Hairgrove 
>

Reviewed-by: Balbir Singh 


Re: [PATCH 1/2] powernv/npu: Add lock to prevent race in concurrent context init/destroy

2018-04-13 Thread Balbir Singh
On Wed, Apr 11, 2018 at 4:38 PM, Alistair Popple  wrote:
> The pnv_npu2_init_context() and pnv_npu2_destroy_context() functions are
> used to allocate/free contexts to allow address translation and shootdown
> by the NPU on a particular GPU. Context initialisation is implicitly safe
> as it is protected by the requirement mmap_sem be held in write mode,
> however pnv_npu2_destroy_context() does not require mmap_sem to be held and
> it is not safe to call with a concurrent initialisation for a different
> GPU.
>
> It was assumed the driver would ensure destruction was not called
> concurrently with initialisation. However the driver may be simplified by
> allowing concurrent initialisation and destruction for different GPUs. As
> npu context creation/destruction is not a performance critical path and the
> critical section is not large a single spinlock is used for simplicity.
>
> Signed-off-by: Alistair Popple 
> ---
>  arch/powerpc/platforms/powernv/npu-dma.c | 51 
> ++--
>  1 file changed, 42 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c 
> b/arch/powerpc/platforms/powernv/npu-dma.c
> index 1cbef1f9cd37..cb77162f4e7a 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -34,6 +34,12 @@
>  #define npu_to_phb(x) container_of(x, struct pnv_phb, npu)
>
>  /*
> + * spinlock to protect initialisation of an npu_context for a particular
> + * mm_struct.
> + */
> +DEFINE_SPINLOCK(npu_context_lock);
> +
> +/*
>   * Other types of TCE cache invalidation are not functional in the
>   * hardware.
>   */
> @@ -694,7 +700,8 @@ static const struct mmu_notifier_ops nv_nmmu_notifier_ops 
> = {
>   * Returns an error if there no contexts are currently available or a
>   * npu_context which should be passed to pnv_npu2_handle_fault().
>   *
> - * mmap_sem must be held in write mode.
> + * mmap_sem must be held in write mode and must not be called from interrupt
> + * context.
>   */
>  struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
> unsigned long flags,
> @@ -741,7 +748,9 @@ struct npu_context *pnv_npu2_init_context(struct pci_dev 
> *gpdev,
> /*
>  * Setup the NPU context table for a particular GPU. These need to be
>  * per-GPU as we need the tables to filter ATSDs when there are no
> -* active contexts on a particular GPU.
> +* active contexts on a particular GPU. It is safe for these to be
> +* called concurrently with destroy as the OPAL call takes appropriate
> +* locks and refcounts on init/destroy.
>  */
> rc = opal_npu_init_context(nphb->opal_id, mm->context.id, flags,
> PCI_DEVID(gpdev->bus->number, gpdev->devfn));
> @@ -752,8 +761,19 @@ struct npu_context *pnv_npu2_init_context(struct pci_dev 
> *gpdev,
>  * We store the npu pci device so we can more easily get at the
>  * associated npus.
>  */
> +   spin_lock(_context_lock);
> npu_context = mm->context.npu_context;
> +   if (npu_context)
> +   WARN_ON(!kref_get_unless_zero(_context->kref));
> +   spin_unlock(_context_lock);
> +
> if (!npu_context) {
> +   /*
> +* We can set up these fields without holding the
> +* npu_context_lock as the npu_context hasn't been returned to
> +* the caller meaning it can't be destroyed. Parallel 
> allocation
> +* is protected against by mmap_sem.
> +*/
> rc = -ENOMEM;
> npu_context = kzalloc(sizeof(struct npu_context), GFP_KERNEL);
> if (npu_context) {
> @@ -772,8 +792,6 @@ struct npu_context *pnv_npu2_init_context(struct pci_dev 
> *gpdev,
> }
>
> mm->context.npu_context = npu_context;
> -   } else {
> -   WARN_ON(!kref_get_unless_zero(_context->kref));
> }
>
> npu_context->release_cb = cb;
> @@ -811,15 +829,16 @@ static void pnv_npu2_release_context(struct kref *kref)
> mm_context_remove_copro(npu_context->mm);
>
> npu_context->mm->context.npu_context = NULL;
> -   mmu_notifier_unregister(_context->mn,
> -   npu_context->mm);
> -
> -   kfree(npu_context);
>  }
>
> +/*
> + * Destroy a context on the given GPU. May free the npu_context if it is no
> + * longer active on any GPUs. Must not be called from interrupt context.
> + */
>  void pnv_npu2_destroy_context(struct npu_context *npu_context,
> struct pci_dev *gpdev)
>  {
> +   int removed;
> struct pnv_phb *nphb;
> struct npu *npu;
> struct pci_dev *npdev = pnv_pci_get_npu_dev(gpdev, 0);
> @@ -841,7 +860,21 @@ void pnv_npu2_destroy_context(struct npu_context 
> *npu_context,
> 

Re: [PATCH] powerpc/sparse: fix plain integer as NULL pointer warning

2018-04-13 Thread kbuild test robot
Hi Mathieu,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on powerpc/next]
[also build test WARNING on v4.16 next-20180413]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Mathieu-Malaterre/powerpc-sparse-fix-plain-integer-as-NULL-pointer-warning/20180413-144954
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc 

All warnings (new ones prefixed by >>):

   drivers/macintosh/via-pmu.c: In function 'pmu_present':
>> drivers/macintosh/via-pmu.c:1752:9: warning: return makes integer from 
>> pointer without a cast [-Wint-conversion]
 return via;
^~~

vim +1752 drivers/macintosh/via-pmu.c

  1748  
  1749  int
  1750  pmu_present(void)
  1751  {
> 1752  return via;
  1753  }
  1754  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH] cxl: Set the PBCQ Tunnel BAR register when enabling capi mode

2018-04-13 Thread Frederic Barrat



Le 12/04/2018 à 13:06, Philippe Bergheaud a écrit :

Skiboot used to set the default Tunnel BAR register value when capi mode
was enabled. This approach was ok for the cxl driver, but prevented other
drivers from choosing different values.

Skiboot versions > 5.11 will not set the default value any longer. This
patch modifies the cxl driver to set/reset the Tunnel BAR register when
entering/exiting the cxl mode, with pnv_pci_set_tunnel_bar().

Signed-off-by: Philippe Bergheaud 
---
  drivers/misc/cxl/pci.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 83f1d08058fc..3beff9188446 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1742,6 +1742,9 @@ static int cxl_configure_adapter(struct cxl *adapter, 
struct pci_dev *dev)
/* Required for devices using CAPP DMA mode, harmless for others */
pci_set_master(dev);
  
+	if ((rc = pnv_pci_set_tunnel_bar(dev, 0x0002E000ull, 1)))

+   goto err;
+



Isn't that call going to fail on older skiboot which don't support 
OPAL_PCI_SET_PBCQ_TUNNEL_BAR, i.e. on p8?


  Fred



if ((rc = pnv_phb_to_cxl_mode(dev, adapter->native->sl_ops->capi_mode)))
goto err;
  
@@ -1768,6 +1771,7 @@ static void cxl_deconfigure_adapter(struct cxl *adapter)

  {
struct pci_dev *pdev = to_pci_dev(adapter->dev.parent);
  
+	pnv_pci_set_tunnel_bar(pdev, 0x0002E000ull, 0);

cxl_native_release_psl_err_irq(adapter);
cxl_unmap_adapter_regs(adapter);
  





Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Russell King - ARM Linux
On Thu, Apr 12, 2018 at 10:22:15AM -0700, Linus Torvalds wrote:
> On Thu, Apr 12, 2018 at 10:20 AM, Russell King - ARM Linux
>  wrote:
> >
> > This file was created to contain FPE_FIXME, by the "signal/arm: Document
> > conflicts with SI_USER and SIGFPE" commit so if we're removing it, it
> > would be better to remove the whole file.
> 
> Fair enough. I'm not going to commit that anyway since I can't test
> it, but yes, if it tests ok that sounds like the right thing to do.

Yes, it does solve the problem at hand with strace - the exact patch I
tested against 4.16 is below.

Testing this exact code path (exceptions set to VFP_EXCEPTION_ERROR)
is something that can only happen if the hardware does something stupid,
and I don't have a way of making it do that, so the code path can't be
tested.

However, FPE_FLTUNK is not defined in older kernels, so while we can
fix it this way for the current merge window, that doesn't help 4.16.
How we solve that depends what happens with Eric's patch (266da65e9156
("signal: Add FPE_FLTUNK si_code for undiagnosable fp exceptions"))
that introduces FPE_FLTUNK - and there's also the problem of NSIGFPE,
which kernel/signal.c uses in the path that selects the siginfo layout.

Given that the path we're talking about is unlikely to happen (as
mentioned in my second paragraph) I still think reverting Eric's patch
is the right way forward for older kernels.

(Note, my previous comment about the si_code initialiser was incorrect.)

diff --git a/arch/arm/include/uapi/asm/siginfo.h 
b/arch/arm/include/uapi/asm/siginfo.h
deleted file mode 100644
index d0513880be21..
--- a/arch/arm/include/uapi/asm/siginfo.h
+++ /dev/null
@@ -1,13 +0,0 @@
-#ifndef __ASM_SIGINFO_H
-#define __ASM_SIGINFO_H
-
-#include 
-
-/*
- * SIGFPE si_codes
- */
-#ifdef __KERNEL__
-#define FPE_FIXME  0   /* Broken dup of SI_USER */
-#endif /* __KERNEL__ */
-
-#endif
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 4c375e11ae95..8a1a5e6048d2 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -28,6 +28,10 @@
 #include 
 #include 
 
+#ifndef FPE_FLTUNK
+#define FPE_FLTUNK 14
+#endif
+
 #include "vfpinstr.h"
 #include "vfp.h"
 
@@ -257,7 +261,7 @@ static void vfp_raise_exceptions(u32 exceptions, u32 inst, 
u32 fpscr, struct pt_
 
if (exceptions == VFP_EXCEPTION_ERROR) {
vfp_panic("unhandled bounce", inst);
-   vfp_raise_sigfpe(FPE_FIXME, regs);
+   vfp_raise_sigfpe(FPE_FLTUNK, regs);
return;
}
 


-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up