Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-08-02 Thread Lynn Boger
Fixes will be in Go 1.16.7 and 1.15.15. Backports are no longer being 
done for Go 1.14.


On 8/2/21 1:18 AM, Michael Ellerman wrote:

"Paul Murphy"  writes:
  
(My apologies for however IBM's email client munges this)



I heard it is going to be in Go 1.16.7, but I do not know much about Go.
Maybe the folks in Cc can chime in.
  
  
We have backports primed and ready for the next point release. They

are waiting on the release manager to cherrypick them.

OK good, that is still the correct fix in the long run.


I think we were aware that our VDSO usage may have exploited some
peculiarities in how the ppc64 version was constructed (i.e hand
written assembly which just didn't happen to clobber R30).

Yeah I was "somewhat surprised" that Go thought it could use r30 like
that across a VDSO call :D

But to be fair the ABI of the VDSO has always been a little fishy,
because the entry points pretend to be a transparent wrapper around
system calls, but then in a case like this are not.


Go up to this point has only used the vdso function __kernel_clock_gettime; it
is the only entry point which would need to explicitly avoid R30 for
Go's sake.

I thought about limiting the workaround to just that code, but it seemed
silly and likely to come back to bite us. Once the compilers decides to
spill a non-volatile there are plenty of other ones to choose from.

cheers


Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-08-02 Thread Segher Boessenkool
On Mon, Aug 02, 2021 at 04:18:58PM +1000, Michael Ellerman wrote:
> > Go up to this point has only used the vdso function __kernel_clock_gettime; 
> > it
> > is the only entry point which would need to explicitly avoid R30 for
> > Go's sake.
> 
> I thought about limiting the workaround to just that code, but it seemed
> silly and likely to come back to bite us. Once the compilers decides to
> spill a non-volatile there are plenty of other ones to choose from.

It can be cheaper to spill N..31 consecutively, using stmw for example.
For 64-bit Power implementations it doesn't currently make any
difference.  Since none of this will be inlined it doesn't have any real
impact.

(This also happens with -m32 -fpic, which always sets GPR30 as fixed, it
is the offset table register.  With those flags -ffixed-r30 doesn't do
anything btw (r30 already *is* a fixed function register), and this will
not work with a Go that clobbers r30.  But this is academic :-) )


Segher


Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-08-02 Thread Benjamin Herrenschmidt
On Mon, 2021-08-02 at 16:18 +1000, Michael Ellerman wrote:
> 
> But to be fair the ABI of the VDSO has always been a little fishy,
> 
> because the entry points pretend to be a transparent wrapper around
> 
> system calls, but then in a case like this are not.

This is somewhat debatable :-) If your perspective is from an
application, whether your wrapper is glibc, the vdso or *** knows what
else, you can't make assumptions about the state of the registers on a
signal hitting somewhere in your call chain other than what's
guanranteed by the ABI overall (ie, TLS etc...).

Nowhere was it written that a VDSO call behaved strictly like an sc
instruction :-)
> 
> > Go up to this point has only used the vdso function __kernel_clock_gettime; 
> > it
> > is the only entry point which would need to explicitly avoid R30 for
> > Go's sake.
> 
> 
> I thought about limiting the workaround to just that code, but it seemed
> silly and likely to come back to bite us. Once the compilers decides to
> spill a non-volatile there are plenty of other ones to choose from.

Yeah fine graining this is a waste of time, I agree. Just stick with
fixing r30 for the whole vdso, it won't actually hurt, just make sure
this is somewhat documented as to why we do it (I don't remember what
you patch does off hand, I assume at least your git commit has all the
data, a comment near the offending line in the Makefile might be good
too).

Cheers,
Ben.




RE: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-08-02 Thread Michael Ellerman
"Paul Murphy"  writes:
>  
> (My apologies for however IBM's email client munges this)
>
>> I heard it is going to be in Go 1.16.7, but I do not know much about Go.
>> Maybe the folks in Cc can chime in.
>  
>  
> We have backports primed and ready for the next point release. They
> are waiting on the release manager to cherrypick them.

OK good, that is still the correct fix in the long run.

> I think we were aware that our VDSO usage may have exploited some
> peculiarities in how the ppc64 version was constructed (i.e hand
> written assembly which just didn't happen to clobber R30).

Yeah I was "somewhat surprised" that Go thought it could use r30 like
that across a VDSO call :D

But to be fair the ABI of the VDSO has always been a little fishy,
because the entry points pretend to be a transparent wrapper around
system calls, but then in a case like this are not.

> Go up to this point has only used the vdso function __kernel_clock_gettime; it
> is the only entry point which would need to explicitly avoid R30 for
> Go's sake.

I thought about limiting the workaround to just that code, but it seemed
silly and likely to come back to bite us. Once the compilers decides to
spill a non-volatile there are plenty of other ones to choose from.

cheers


Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-08-02 Thread Michael Ellerman
Andreas Schwab  writes:
> On Jul 29 2021, Michael Ellerman wrote:
>
>> I haven't been able to reproduce the crash by following the instructions
>> in your bug report, I have go1.13.8, I guess the crash is only in newer
>> versions?
>
> Yes, only go1.14 and later are affected.
>
> https://build.opensuse.org/package/live_build_log/openSUSE:Factory:PowerPC/go1.13/standard/ppc64
> https://build.opensuse.org/package/live_build_log/openSUSE:Factory:PowerPC/go1.14/standard/ppc64

Thanks. That helps explain why I didn't see it, my test boxes have 1.13 
installed.

cheers


Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-07-29 Thread Michael Ellerman
Paul Menzel  writes:
> Am 29.07.21 um 09:41 schrieb Michael Ellerman:
>> Paul Menzel  writes:
>
>>> Am 28.07.21 um 14:43 schrieb Michael Ellerman:
 Paul Menzel  writes:
> Am 28.07.21 um 01:14 schrieb Benjamin Herrenschmidt:
>> On Tue, 2021-07-27 at 10:45 +0200, Paul Menzel wrote:
>
>>> On ppc64le Go 1.16.2 from Ubuntu 21.04 terminates with a segmentation
>>> fault [1], and it might be related to *[release-branch.go1.16] runtime:
>>> fix crash during VDSO calls on PowerPC* [2], conjecturing that commit
>>> ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)
>>> added in Linux 5.11 causes this.
>>>
>>> If this is indeed the case, this would be a regression in userspace. Is
>>> there a generic fix or should the change be reverted?
>>
>>   From the look at the links you posted, this appears to be completely
>> broken assumptions by Go that some registers don't change while calling
>> what essentially are external library functions *while inside those
>> functions* (ie in this case from a signal handler).
>>
>> I suppose it would be possible to build the VDSO with gcc arguments to
>> make it not use r30, but that's just gross...
>
> Thank you for looking into this. No idea, if it falls under Linux’ no
> regression policy or not.

 Reluctantly yes, I think it does. Though it would have been good if it
 had been reported to us sooner.

 It looks like that Go fix is only committed to master, and neither of
 the latest Go 1.16 or 1.15 releases contain the fix? ie. there's no way
 for a user to get a working version of Go other than building master?
>>>
>>> I heard it is going to be in Go 1.16.7, but I do not know much about Go.
>>> Maybe the folks in Cc can chime in.
>>>
 I'll see if we can work around it in the kernel. Are you able to test a
 kernel patch if I send you one?
>>>
>>> Yes, I could test a Linux kernel patch on ppc64le (POWER 8) running
>>> Ubuntu 21.04.
>> 
>> Thanks, would be great if you can test on your setup. Patch below.
>> 
>> I haven't been able to reproduce the crash by following the instructions
>> in your bug report, I have go1.13.8, I guess the crash is only in newer
>> versions?
>
> I only used go version 1.16.2 packaged in Ubuntu 21.04 (1.16~0ubuntu1).

OK thanks. I can reproduce with that.

>> diff --git a/arch/powerpc/kernel/vdso64/Makefile 
>> b/arch/powerpc/kernel/vdso64/Makefile
>> index 2813e3f98db6..3c5baaa6f1e7 100644
>> --- a/arch/powerpc/kernel/vdso64/Makefile
>> +++ b/arch/powerpc/kernel/vdso64/Makefile
>> @@ -27,6 +27,13 @@ KASAN_SANITIZE := n
>>   
>>   ccflags-y := -shared -fno-common -fno-builtin -nostdlib \
>>  -Wl,-soname=linux-vdso64.so.1 -Wl,--hash-style=both
>> +
>> +# Go prior to 1.16.x assumes r30 is not clobbered by any VDSO code. That 
>> used to be true
>
> Probably that needs to be 1.16.7.

I made it 1.16.x because it wasn't 100% clear on whether the fix will
land in 1.16.7 or not.

For our purposes 1.16.x is good enough, we'll need to carry the
workaround until 1.16 is well and truly EOL, so it doesn't really matter
which point release it's in.

>> +# by accident when the VDSO was hand-written asm code, but may not be now 
>> that the VDSO is
>> +# compiler generated. To avoid breaking Go tell GCC not to use r30. Impact 
>> on code
>> +# generation is minimal, it will just use r29 instead.
>> +ccflags-y += $(call cc-option, -ffixed-r30)
>> +
>>   asflags-y := -D__VDSO64__ -s
>>   
>>   targets += vdso64.lds
>
> With this applied to Linux, go does not crash with a segmentation fault 
> anymore.
>
> Tested-by: Paul Menzel 

Thanks.

> (Probably the commit should be tagged for the stable series too.)

Yep.

cheers


Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-07-29 Thread Andreas Schwab
On Jul 29 2021, Michael Ellerman wrote:

> I haven't been able to reproduce the crash by following the instructions
> in your bug report, I have go1.13.8, I guess the crash is only in newer
> versions?

Yes, only go1.14 and later are affected.

https://build.opensuse.org/package/live_build_log/openSUSE:Factory:PowerPC/go1.13/standard/ppc64
https://build.opensuse.org/package/live_build_log/openSUSE:Factory:PowerPC/go1.14/standard/ppc64

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-07-29 Thread Paul Menzel

Dear Michael,


Am 29.07.21 um 09:41 schrieb Michael Ellerman:

Paul Menzel  writes:



Am 28.07.21 um 14:43 schrieb Michael Ellerman:

Paul Menzel  writes:

Am 28.07.21 um 01:14 schrieb Benjamin Herrenschmidt:

On Tue, 2021-07-27 at 10:45 +0200, Paul Menzel wrote:



On ppc64le Go 1.16.2 from Ubuntu 21.04 terminates with a segmentation
fault [1], and it might be related to *[release-branch.go1.16] runtime:
fix crash during VDSO calls on PowerPC* [2], conjecturing that commit
ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)
added in Linux 5.11 causes this.

If this is indeed the case, this would be a regression in userspace. Is
there a generic fix or should the change be reverted?


  From the look at the links you posted, this appears to be completely
broken assumptions by Go that some registers don't change while calling
what essentially are external library functions *while inside those
functions* (ie in this case from a signal handler).

I suppose it would be possible to build the VDSO with gcc arguments to
make it not use r30, but that's just gross...


Thank you for looking into this. No idea, if it falls under Linux’ no
regression policy or not.


Reluctantly yes, I think it does. Though it would have been good if it
had been reported to us sooner.

It looks like that Go fix is only committed to master, and neither of
the latest Go 1.16 or 1.15 releases contain the fix? ie. there's no way
for a user to get a working version of Go other than building master?


I heard it is going to be in Go 1.16.7, but I do not know much about Go.
Maybe the folks in Cc can chime in.


I'll see if we can work around it in the kernel. Are you able to test a
kernel patch if I send you one?


Yes, I could test a Linux kernel patch on ppc64le (POWER 8) running
Ubuntu 21.04.


Thanks, would be great if you can test on your setup. Patch below.

I haven't been able to reproduce the crash by following the instructions
in your bug report, I have go1.13.8, I guess the crash is only in newer
versions?


I only used go version 1.16.2 packaged in Ubuntu 21.04 (1.16~0ubuntu1).


diff --git a/arch/powerpc/kernel/vdso64/Makefile 
b/arch/powerpc/kernel/vdso64/Makefile
index 2813e3f98db6..3c5baaa6f1e7 100644
--- a/arch/powerpc/kernel/vdso64/Makefile
+++ b/arch/powerpc/kernel/vdso64/Makefile
@@ -27,6 +27,13 @@ KASAN_SANITIZE := n
  
  ccflags-y := -shared -fno-common -fno-builtin -nostdlib \

-Wl,-soname=linux-vdso64.so.1 -Wl,--hash-style=both
+
+# Go prior to 1.16.x assumes r30 is not clobbered by any VDSO code. That used 
to be true


Probably that needs to be 1.16.7.


+# by accident when the VDSO was hand-written asm code, but may not be now that 
the VDSO is
+# compiler generated. To avoid breaking Go tell GCC not to use r30. Impact on 
code
+# generation is minimal, it will just use r29 instead.
+ccflags-y += $(call cc-option, -ffixed-r30)
+
  asflags-y := -D__VDSO64__ -s
  
  targets += vdso64.lds


With this applied to Linux, go does not crash with a segmentation fault 
anymore.


Tested-by: Paul Menzel 

(Probably the commit should be tagged for the stable series too.)


Kind regards,

Paul


Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-07-29 Thread Michael Ellerman
Paul Menzel  writes:

> Dear Michael,
>
>
> Am 28.07.21 um 14:43 schrieb Michael Ellerman:
>> Paul Menzel  writes:
>>> Am 28.07.21 um 01:14 schrieb Benjamin Herrenschmidt:
 On Tue, 2021-07-27 at 10:45 +0200, Paul Menzel wrote:
>>>
> On ppc64le Go 1.16.2 from Ubuntu 21.04 terminates with a segmentation
> fault [1], and it might be related to *[release-branch.go1.16] runtime:
> fix crash during VDSO calls on PowerPC* [2], conjecturing that commit
> ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)
> added in Linux 5.11 causes this.
>
> If this is indeed the case, this would be a regression in userspace. Is
> there a generic fix or should the change be reverted?

  From the look at the links you posted, this appears to be completely
 broken assumptions by Go that some registers don't change while calling
 what essentially are external library functions *while inside those
 functions* (ie in this case from a signal handler).

 I suppose it would be possible to build the VDSO with gcc arguments to
 make it not use r30, but that's just gross...
>>>
>>> Thank you for looking into this. No idea, if it falls under Linux’ no
>>> regression policy or not.
>> 
>> Reluctantly yes, I think it does. Though it would have been good if it
>> had been reported to us sooner.
>> 
>> It looks like that Go fix is only committed to master, and neither of
>> the latest Go 1.16 or 1.15 releases contain the fix? ie. there's no way
>> for a user to get a working version of Go other than building master?
>
> I heard it is going to be in Go 1.16.7, but I do not know much about Go. 
> Maybe the folks in Cc can chime in.
>
>> I'll see if we can work around it in the kernel. Are you able to test a
>> kernel patch if I send you one?
>
> Yes, I could test a Linux kernel patch on ppc64le (POWER 8) running 
> Ubuntu 21.04.

Thanks, would be great if you can test on your setup. Patch below.

I haven't been able to reproduce the crash by following the instructions
in your bug report, I have go1.13.8, I guess the crash is only in newer
versions?

cheers


diff --git a/arch/powerpc/kernel/vdso64/Makefile 
b/arch/powerpc/kernel/vdso64/Makefile
index 2813e3f98db6..3c5baaa6f1e7 100644
--- a/arch/powerpc/kernel/vdso64/Makefile
+++ b/arch/powerpc/kernel/vdso64/Makefile
@@ -27,6 +27,13 @@ KASAN_SANITIZE := n
 
 ccflags-y := -shared -fno-common -fno-builtin -nostdlib \
-Wl,-soname=linux-vdso64.so.1 -Wl,--hash-style=both
+
+# Go prior to 1.16.x assumes r30 is not clobbered by any VDSO code. That used 
to be true
+# by accident when the VDSO was hand-written asm code, but may not be now that 
the VDSO is
+# compiler generated. To avoid breaking Go tell GCC not to use r30. Impact on 
code
+# generation is minimal, it will just use r29 instead.
+ccflags-y += $(call cc-option, -ffixed-r30)
+
 asflags-y := -D__VDSO64__ -s
 
 targets += vdso64.lds



RE: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-07-28 Thread Paul Murphy
 
(My apologies for however IBM's email client munges this)
> I heard it is going to be in Go 1.16.7, but I do not know much about Go.> Maybe the folks in Cc can chime in.
 
 
We have backports primed and ready for the next point release. They are waiting on the release manager to cherrypick them.
 
I think we were aware that our VDSO usage may have exploited some peculiarities in how the ppc64 version was constructed (i.e hand written assembly which just didn't happen to clobber R30). Go up to this point has only used the vdso function __kernel_clock_gettime; it is the only entry point which would need to explicitly avoid R30 for Go's sake.
 
Paul M.
 



Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-07-28 Thread Paul Menzel

Dear Michael,


Am 28.07.21 um 14:43 schrieb Michael Ellerman:

Paul Menzel  writes:

Am 28.07.21 um 01:14 schrieb Benjamin Herrenschmidt:

On Tue, 2021-07-27 at 10:45 +0200, Paul Menzel wrote:



On ppc64le Go 1.16.2 from Ubuntu 21.04 terminates with a segmentation
fault [1], and it might be related to *[release-branch.go1.16] runtime:
fix crash during VDSO calls on PowerPC* [2], conjecturing that commit
ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)
added in Linux 5.11 causes this.

If this is indeed the case, this would be a regression in userspace. Is
there a generic fix or should the change be reverted?


 From the look at the links you posted, this appears to be completely
broken assumptions by Go that some registers don't change while calling
what essentially are external library functions *while inside those
functions* (ie in this case from a signal handler).

I suppose it would be possible to build the VDSO with gcc arguments to
make it not use r30, but that's just gross...


Thank you for looking into this. No idea, if it falls under Linux’ no
regression policy or not.


Reluctantly yes, I think it does. Though it would have been good if it
had been reported to us sooner.

It looks like that Go fix is only committed to master, and neither of
the latest Go 1.16 or 1.15 releases contain the fix? ie. there's no way
for a user to get a working version of Go other than building master?


I heard it is going to be in Go 1.16.7, but I do not know much about Go. 
Maybe the folks in Cc can chime in.



I'll see if we can work around it in the kernel. Are you able to test a
kernel patch if I send you one?


Yes, I could test a Linux kernel patch on ppc64le (POWER 8) running 
Ubuntu 21.04.



Kind regards,

Paul


Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-07-28 Thread Michael Ellerman
Paul Menzel  writes:
> Am 28.07.21 um 01:14 schrieb Benjamin Herrenschmidt:
>> On Tue, 2021-07-27 at 10:45 +0200, Paul Menzel wrote:
>
>>> On ppc64le Go 1.16.2 from Ubuntu 21.04 terminates with a segmentation
>>> fault [1], and it might be related to *[release-branch.go1.16] runtime:
>>> fix crash during VDSO calls on PowerPC* [2], conjecturing that commit
>>> ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)
>>> added in Linux 5.11 causes this.
>>>
>>> If this is indeed the case, this would be a regression in userspace. Is
>>> there a generic fix or should the change be reverted?
>> 
>> From the look at the links you posted, this appears to be completely
>> broken assumptions by Go that some registers don't change while calling
>> what essentially are external library functions *while inside those
>> functions* (ie in this case from a signal handler).
>> 
>> I suppose it would be possible to build the VDSO with gcc arguments to
>> make it not use r30, but that's just gross...
>
> Thank you for looking into this. No idea, if it falls under Linux’ no 
> regression policy or not.

Reluctantly yes, I think it does. Though it would have been good if it
had been reported to us sooner.

It looks like that Go fix is only committed to master, and neither of
the latest Go 1.16 or 1.15 releases contain the fix? ie. there's no way
for a user to get a working version of Go other than building master?

I'll see if we can work around it in the kernel. Are you able to test a
kernel patch if I send you one?

cheers


Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-07-28 Thread Paul Menzel

Dear Benjamin,


Am 28.07.21 um 01:14 schrieb Benjamin Herrenschmidt:

On Tue, 2021-07-27 at 10:45 +0200, Paul Menzel wrote:



On ppc64le Go 1.16.2 from Ubuntu 21.04 terminates with a segmentation
fault [1], and it might be related to *[release-branch.go1.16] runtime:
fix crash during VDSO calls on PowerPC* [2], conjecturing that commit
ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)
added in Linux 5.11 causes this.

If this is indeed the case, this would be a regression in userspace. Is
there a generic fix or should the change be reverted?


From the look at the links you posted, this appears to be completely
broken assumptions by Go that some registers don't change while calling
what essentially are external library functions *while inside those
functions* (ie in this case from a signal handler).

I suppose it would be possible to build the VDSO with gcc arguments to
make it not use r30, but that's just gross...


Thank you for looking into this. No idea, if it falls under Linux’ no 
regression policy or not.



Kind regards,

Paul


Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)

2021-07-27 Thread Benjamin Herrenschmidt
On Tue, 2021-07-27 at 10:45 +0200, Paul Menzel wrote:
> Dear Christophe,
> 
> 
> On ppc64le Go 1.16.2 from Ubuntu 21.04 terminates with a segmentation 
> fault [1], and it might be related to *[release-branch.go1.16] runtime: 
> fix crash during VDSO calls on PowerPC* [2], conjecturing that commit 
> ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.) 
> added in Linux 5.11 causes this.
> 
> If this is indeed the case, this would be a regression in userspace. Is 
> there a generic fix or should the change be reverted?

>From the look at the links you posted, this appears to be completely
broken assumptions by Go that some registers don't change while calling
what essentially are external library functions *while inside those
functions* (ie in this case from a signal handler).

I suppose it would be possible to build the VDSO with gcc arguments to
make it not use r30, but that's just gross...

Cheers,
Ben.