Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-10-23 Thread Dmitry Safonov
Hi Christophe, Will,

On 10/23/20 12:57 PM, Christophe Leroy wrote:
> 
> 
> On 23/10/2020 at 13:25, Will Deacon wrote:
>> On Fri, Oct 23, 2020 at 01:22:04PM +0200, Christophe Leroy wrote:
>>> Hi Dmitry,
[..]
>>> I haven't seen the patches, did you finally send them out?

I was working on the .close() hook, but while cooking it, I thought it
might be better to make tracking of the user landing address generic.
Note that the kernel mostly needs the vdso base address as the address
to land on in userspace after processing a signal.

I have some raw patches that add
+#ifdef CONFIG_ARCH_HAS_USER_LANDING
+   struct vm_area_struct *user_landing;
+#endif
inside mm_struct and I plan to finish them after rc1 gets released.

While working on that, I noticed that arm32 and some other architectures
track the vdso position in mm.context for the sole purpose of adding
AT_SYSINFO_EHDR to the auxiliary vector of the ELF binary being loaded.
It's quite overkill to keep a pointer in mm.context when it could be a
local variable in the ELF binfmt loader. Also, I found some issues with
the mremap code. The patch series mentioned is at the base of the branch
with the generic user landing. I have sent only those patches, not the
full branch, as I remember there was a policy that during the merge
window one should send only fixes rather than refactoring/new code.
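For reference, this is how userspace consumes that auxv entry. It is only
a minimal sketch using glibc's getauxval(); the helper names
(vdso_ehdr_from_auxv, looks_like_elf, vdso_auxv_selfcheck) are made up for
this mail:

```c
#include <assert.h>
#include <elf.h>        /* ELFMAG, SELFMAG */
#include <string.h>     /* memcmp */
#include <sys/auxv.h>   /* getauxval(), AT_SYSINFO_EHDR */

/*
 * Address of the vDSO ELF header as advertised in the auxiliary vector,
 * or 0 if the kernel did not provide one at exec time.
 */
static unsigned long vdso_ehdr_from_auxv(void)
{
	return getauxval(AT_SYSINFO_EHDR);
}

/* Return 1 if addr is non-zero and starts with the ELF magic bytes. */
static int looks_like_elf(unsigned long addr)
{
	return addr != 0 && memcmp((const void *)addr, ELFMAG, SELFMAG) == 0;
}

/*
 * Sanity check. Only meaningful while the vDSO is still where exec put
 * it: the auxv entry is not updated if the mapping is later moved or
 * unmapped, so dereferencing it after that would be a bug.
 */
static void vdso_auxv_selfcheck(void)
{
	unsigned long base = vdso_ehdr_from_auxv();

	assert(base == 0 || looks_like_elf(base));
}
```

The value is handed over once at exec time, which is why a local variable
in the loader would be enough on the kernel side for this particular use.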

>> I think it's this series:
>>
>> https://lore.kernel.org/r/20201013013416.390574-1-d...@arista.com
>>
>> but they look really invasive to me, so I may cook a small hack for arm64
>> in the meantime / for stable.

I don't mind small hacks, but I'm concerned that the suggested fix, which
sets `mm->context.vdso_base = 0` on munmap(), may have its own issue: it
won't work if a user, for whatever broken reason, mremap()s the vdso to
address 0. As the fix is supposed to address an issue that hasn't fired
for anyone yet, it probably shouldn't introduce another. That's why I've
used vm_area_struct to track the vdso position in the patch set.
As a temporary measure, you could probably use something like:
#define BAD_VDSO_ADDRESS    (-1UL)
or a non-page-aligned address.
But the signal code that checks whether it can land on the vdso/sigpage
should also be aligned with the new definition.
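To illustrate why 0 is a poor "unmapped" marker, here is a minimal
userspace sketch. It is only a toy model: struct toy_context and the toy_*
helpers are made up for this mail and are not kernel code; only the
BAD_VDSO_ADDRESS name comes from the suggestion above.

```c
#include <assert.h>
#include <stdbool.h>

/* No page-aligned mapping can start at the last byte of the address space. */
#define BAD_VDSO_ADDRESS	(-1UL)

/* Toy stand-in for the per-mm context; not the kernel's mm_context_t. */
struct toy_context {
	unsigned long vdso_base;
};

/* What an unmap hook would record. */
static void toy_vdso_unmapped(struct toy_context *ctx)
{
	ctx->vdso_base = BAD_VDSO_ADDRESS;
}

/* What an mremap hook would record; new_base may legitimately be 0. */
static void toy_vdso_moved(struct toy_context *ctx, unsigned long new_base)
{
	ctx->vdso_base = new_base;
}

/* The check the signal code would have to use instead of "!= 0". */
static bool toy_can_land_on_vdso(const struct toy_context *ctx)
{
	return ctx->vdso_base != BAD_VDSO_ADDRESS;
}

static void toy_sentinel_selfcheck(void)
{
	struct toy_context ctx = { .vdso_base = 0x7000 };

	toy_vdso_moved(&ctx, 0);		/* odd, but possible */
	assert(toy_can_land_on_vdso(&ctx));	/* a "!= 0" test would say no */

	toy_vdso_unmapped(&ctx);
	assert(!toy_can_land_on_vdso(&ctx));
}
```

With a 0 sentinel the first assertion could not hold, which is the case
described above.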

> Not sure we are talking about the same thing.
> 
> I can't see any new .close function added to vm_special_mapping in order
> to replace arch_unmap() hook.
Thanks,
  Dmitry


Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-10-23 Thread Christophe Leroy




On 23/10/2020 at 13:25, Will Deacon wrote:
> On Fri, Oct 23, 2020 at 01:22:04PM +0200, Christophe Leroy wrote:
>> Hi Dmitry,
>>
>> On 28/09/2020 at 17:08, Dmitry Safonov wrote:
>>> On 9/27/20 8:43 AM, Christophe Leroy wrote:
>>>>
>>>>
>>>> On 21/09/2020 at 13:26, Will Deacon wrote:
>>>>> On Fri, Aug 28, 2020 at 12:14:28PM +1000, Michael Ellerman wrote:
>>>>>> Dmitry Safonov <0x7f454...@gmail.com> writes:
>>> [..]
>>>>>>> I'll cook a patch for vm_special_mapping if you don't mind :-)
>>>>>>
>>>>>> That would be great, thanks!
>>>>>
>>>>> I lost track of this one. Is there a patch kicking around to resolve this,
>>>>> or is the segfault expected behaviour?
>>>>>
>>>>
>>>> IIUC dmitry said he will cook a patch. I have not seen any patch yet.
>>>
>>> Yes, sorry about the delay - I was a bit busy with xfrm patches.
>>>
>>> I'll send patches for .close() this week, working on them now.
>>
>> I haven't seen the patches, did you finally send them out?
>
> I think it's this series:
>
> https://lore.kernel.org/r/20201013013416.390574-1-d...@arista.com
>
> but they look really invasive to me, so I may cook a small hack for arm64
> in the meantime / for stable.
>

Not sure we are talking about the same thing.

I can't see any new .close function added to vm_special_mapping in order to
replace the arch_unmap() hook.

Christophe


Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-10-23 Thread Will Deacon
On Fri, Oct 23, 2020 at 01:22:04PM +0200, Christophe Leroy wrote:
> Hi Dmitry,
> 
> On 28/09/2020 at 17:08, Dmitry Safonov wrote:
> > On 9/27/20 8:43 AM, Christophe Leroy wrote:
> > >
> > >
> > > On 21/09/2020 at 13:26, Will Deacon wrote:
> > > > On Fri, Aug 28, 2020 at 12:14:28PM +1000, Michael Ellerman wrote:
> > > > > Dmitry Safonov <0x7f454...@gmail.com> writes:
> > [..]
> > > > > > I'll cook a patch for vm_special_mapping if you don't mind :-)
> > > > >
> > > > > That would be great, thanks!
> > > >
> > > > I lost track of this one. Is there a patch kicking around to resolve this,
> > > > or is the segfault expected behaviour?
> > > >
> > >
> > > IIUC dmitry said he will cook a patch. I have not seen any patch yet.
> >
> > Yes, sorry about the delay - I was a bit busy with xfrm patches.
> >
> > I'll send patches for .close() this week, working on them now.
>
> I haven't seen the patches, did you finally send them out?

I think it's this series:

https://lore.kernel.org/r/20201013013416.390574-1-d...@arista.com

but they look really invasive to me, so I may cook a small hack for arm64
in the meantime / for stable.

Will


Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-10-23 Thread Christophe Leroy

Hi Dmitry,

On 28/09/2020 at 17:08, Dmitry Safonov wrote:
> On 9/27/20 8:43 AM, Christophe Leroy wrote:
>>
>>
>> On 21/09/2020 at 13:26, Will Deacon wrote:
>>> On Fri, Aug 28, 2020 at 12:14:28PM +1000, Michael Ellerman wrote:
>>>> Dmitry Safonov <0x7f454...@gmail.com> writes:
> [..]
>>>>> I'll cook a patch for vm_special_mapping if you don't mind :-)
>>>>
>>>> That would be great, thanks!
>>>
>>> I lost track of this one. Is there a patch kicking around to resolve this,
>>> or is the segfault expected behaviour?
>>>
>>
>> IIUC dmitry said he will cook a patch. I have not seen any patch yet.
>
> Yes, sorry about the delay - I was a bit busy with xfrm patches.
>
> I'll send patches for .close() this week, working on them now.

I haven't seen the patches, did you finally send them out?

Christophe


Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-09-28 Thread Dmitry Safonov
On 9/27/20 8:43 AM, Christophe Leroy wrote:
> 
> 
> On 21/09/2020 at 13:26, Will Deacon wrote:
>> On Fri, Aug 28, 2020 at 12:14:28PM +1000, Michael Ellerman wrote:
>>> Dmitry Safonov <0x7f454...@gmail.com> writes:
[..]
>>>> I'll cook a patch for vm_special_mapping if you don't mind :-)
>>>
>>> That would be great, thanks!
>>
>> I lost track of this one. Is there a patch kicking around to resolve this,
>> or is the segfault expected behaviour?
>>
> 
> IIUC dmitry said he will cook a patch. I have not seen any patch yet.

Yes, sorry about the delay - I was a bit busy with xfrm patches.

I'll send patches for .close() this week, working on them now.

> AFAIKS, among the architectures having VDSO sigreturn trampolines, only
> SH, X86 and POWERPC provide alternative trampoline on stack when VDSO is
> not there.
> 
> All other architectures just having a VDSO don't expect VDSO to not be
> mapped.
> 
> Since stacks are mapped non-executable nowadays, getting a segfault
> is the expected behaviour. However, I think we should really make it
> cleaner. Today it segfaults because it is still pointing to the VDSO
> trampoline that has been unmapped. But should the user map some other
> code at the same address, we'll run into the weeds on signal return
> instead of segfaulting.

+1.

> So VDSO unmapping should really be properly managed, the reference
> should be properly cleared in order to segfault in a controllable manner.
> 
> Only powerpc has a hook to properly clear the VDSO pointer when VDSO is
> unmapped.

Thanks,
 Dmitry


Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-09-27 Thread Christophe Leroy




On 21/09/2020 at 13:26, Will Deacon wrote:
> On Fri, Aug 28, 2020 at 12:14:28PM +1000, Michael Ellerman wrote:
>> Dmitry Safonov <0x7f454...@gmail.com> writes:
>>> On Wed, 26 Aug 2020 at 15:39, Michael Ellerman wrote:
>>>> Christophe Leroy writes:
>>>> We added a test for vdso unmap recently because it happened to trigger a
>>>> KUAP failure, and someone actually hit it & reported it.
>>>
>>> You're right, CRIU cares much more about moving vDSO.
>>> It's done for each restoree and as on most setups vDSO is premapped and
>>> used by the application - it's actively tested.
>>> Speaking about vDSO unmap - that's concerning only for heterogeneous C/R,
>>> i.e when an application is migrated from a system that uses vDSO to the one
>>> which doesn't - it's a much rarer scenario.
>>> (for arm it's !CONFIG_VDSO, for x86 it's `vdso=0` boot parameter)
>>
>> Ah OK that explains it.
>>
>> The case we hit of VDSO unmapping was some strange "library OS" thing
>> which had explicitly unmapped the VDSO, so also very rare.
>>
>>> Looking at the code, it seems quite easy to provide/maintain .close() for
>>> vm_special_mapping. A bit harder to add a test from CRIU side
>>> (as glibc won't know on restore that it can't use vdso anymore),
>>> but totally not impossible.
>>>
>>>> Running that test on arm64 segfaults:
>>>>
>>>>   # ./sigreturn_vdso
>>>>   VDSO is at 0x8191f000-0x8191 (4096 bytes)
>>>>   Signal delivered OK with VDSO mapped
>>>>   VDSO moved to 0x8191a000-0x8191afff (4096 bytes)
>>>>   Signal delivered OK with VDSO moved
>>>>   Unmapped VDSO
>>>>   Remapped the stack executable
>>>>   [   48.556191] potentially unexpected fatal signal 11.
>>>>   [   48.556752] CPU: 0 PID: 140 Comm: sigreturn_vdso Not tainted 5.9.0-rc2-00057-g2ac69819ba9e #190
>>>>   [   48.556990] Hardware name: linux,dummy-virt (DT)
>>>>   [   48.557336] pstate: 60001000 (nZCv daif -PAN -UAO BTYPE=--)
>>>>   [   48.557475] pc : 8191a7bc
>>>>   [   48.557603] lr : 8191a7bc
>>>>   [   48.557697] sp : c13c9e90
>>>>   [   48.557873] x29: c13cb0e0 x28:
>>>>   [   48.558201] x27:  x26:
>>>>   [   48.558337] x25:  x24:
>>>>   [   48.558754] x23:  x22:
>>>>   [   48.558893] x21: 004009b0 x20:
>>>>   [   48.559046] x19: 00400ff0 x18:
>>>>   [   48.559180] x17: 817da300 x16: 00412010
>>>>   [   48.559312] x15:  x14: 001c
>>>>   [   48.559443] x13: 656c626174756365 x12: 7865206b63617473
>>>>   [   48.559625] x11: 0003 x10: 0101010101010101
>>>>   [   48.559828] x9 : 818afda8 x8 : 0081
>>>>   [   48.559973] x7 : 6174732065687420 x6 : 64657070616d6552
>>>>   [   48.560115] x5 : 0e0388bd x4 : 0040135d
>>>>   [   48.560270] x3 :  x2 : 0001
>>>>   [   48.560412] x1 : 0003 x0 : 004120b8
>>>>   Segmentation fault
>>>>   #
>>>>
>>>> So I think we need to keep the unmap hook. Maybe it should be handled by
>>>> the special_mapping stuff generically.
>>>
>>> I'll cook a patch for vm_special_mapping if you don't mind :-)
>>
>> That would be great, thanks!
>
> I lost track of this one. Is there a patch kicking around to resolve this,
> or is the segfault expected behaviour?



IIUC Dmitry said he will cook a patch. I have not seen any patch yet.

AFAICS, among the architectures that have VDSO sigreturn trampolines, only SH, X86 and POWERPC
provide an alternative trampoline on the stack when the VDSO is not there.

All other architectures that have a VDSO don't expect it to be unmapped.

Since stacks are mapped non-executable nowadays, getting a segfault is the expected behaviour.
However, I think we should really make it cleaner. Today it segfaults because it is still pointing
to the VDSO trampoline that has been unmapped. But should the user map some other code at the same
address, we'll run into the weeds on signal return instead of segfaulting.

So VDSO unmapping should really be properly managed: the reference should be properly cleared in
order to segfault in a controllable manner.

Only powerpc has a hook to properly clear the VDSO pointer when the VDSO is
unmapped.
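To make the point concrete, here is a minimal userspace sketch of a hook
that forgets the VDSO when its base is unmapped, and of a signal-return
address lookup that then falls back instead of using a stale pointer. It
is only a toy model: struct toy_mm, the toy_* helpers and the use of 0 as
the "no VDSO" value are assumptions for illustration, not the actual
kernel code.

```c
#include <assert.h>

#define TOY_NO_VDSO	0UL	/* assumption: 0 marks "no VDSO" in this model */

/* Toy stand-in for the per-mm state; not the kernel's mm_struct. */
struct toy_mm {
	unsigned long vdso_base;
};

/* Forget the VDSO if the unmapped range [start, end) covers its base. */
static void toy_arch_unmap(struct toy_mm *mm, unsigned long start,
			   unsigned long end)
{
	if (start <= mm->vdso_base && mm->vdso_base < end)
		mm->vdso_base = TOY_NO_VDSO;
}

/* Use the VDSO trampoline only if the VDSO is still known to be mapped. */
static unsigned long toy_sigreturn_addr(const struct toy_mm *mm,
					unsigned long tramp_offset,
					unsigned long fallback)
{
	if (mm->vdso_base == TOY_NO_VDSO)
		return fallback;
	return mm->vdso_base + tramp_offset;
}

static void toy_unmap_selfcheck(void)
{
	struct toy_mm mm = { .vdso_base = 0x8191a000UL };

	/* Unmapping something else leaves the pointer alone. */
	toy_arch_unmap(&mm, 0x10000UL, 0x20000UL);
	assert(toy_sigreturn_addr(&mm, 0x7bc, 0) == 0x8191a7bcUL);

	/* Unmapping the VDSO clears it, so the stale address is never used. */
	toy_arch_unmap(&mm, 0x8191a000UL, 0x8191b000UL);
	assert(toy_sigreturn_addr(&mm, 0x7bc, 0) == 0);
}
```

Without the clearing step the second lookup would still return 0x8191a7bc,
whatever happens to be mapped there by then.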

Christophe


Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-09-21 Thread Will Deacon
On Fri, Aug 28, 2020 at 12:14:28PM +1000, Michael Ellerman wrote:
> Dmitry Safonov <0x7f454...@gmail.com> writes:
> > On Wed, 26 Aug 2020 at 15:39, Michael Ellerman wrote:
> >> Christophe Leroy writes:
> >> We added a test for vdso unmap recently because it happened to trigger a
> >> KUAP failure, and someone actually hit it & reported it.
> >
> > You're right, CRIU cares much more about moving vDSO.
> > It's done for each restoree and as on most setups vDSO is premapped and
> > used by the application - it's actively tested.
> > Speaking about vDSO unmap - that's concerning only for heterogeneous C/R,
> > i.e when an application is migrated from a system that uses vDSO to the one
> > which doesn't - it's a much rarer scenario.
> > (for arm it's !CONFIG_VDSO, for x86 it's `vdso=0` boot parameter)
> 
> Ah OK that explains it.
> 
> The case we hit of VDSO unmapping was some strange "library OS" thing
> which had explicitly unmapped the VDSO, so also very rare.
> 
> > Looking at the code, it seems quite easy to provide/maintain .close() for
> > vm_special_mapping. A bit harder to add a test from CRIU side
> > (as glibc won't know on restore that it can't use vdso anymore),
> > but totally not impossible.
> >
> >> Running that test on arm64 segfaults:
> >>
> >>   # ./sigreturn_vdso
> >>   VDSO is at 0x8191f000-0x8191 (4096 bytes)
> >>   Signal delivered OK with VDSO mapped
> >>   VDSO moved to 0x8191a000-0x8191afff (4096 bytes)
> >>   Signal delivered OK with VDSO moved
> >>   Unmapped VDSO
> >>   Remapped the stack executable
> >>   [   48.556191] potentially unexpected fatal signal 11.
> >>   [   48.556752] CPU: 0 PID: 140 Comm: sigreturn_vdso Not tainted 5.9.0-rc2-00057-g2ac69819ba9e #190
> >>   [   48.556990] Hardware name: linux,dummy-virt (DT)
> >>   [   48.557336] pstate: 60001000 (nZCv daif -PAN -UAO BTYPE=--)
> >>   [   48.557475] pc : 8191a7bc
> >>   [   48.557603] lr : 8191a7bc
> >>   [   48.557697] sp : c13c9e90
> >>   [   48.557873] x29: c13cb0e0 x28: 
> >>   [   48.558201] x27:  x26: 
> >>   [   48.558337] x25:  x24: 
> >>   [   48.558754] x23:  x22: 
> >>   [   48.558893] x21: 004009b0 x20: 
> >>   [   48.559046] x19: 00400ff0 x18: 
> >>   [   48.559180] x17: 817da300 x16: 00412010
> >>   [   48.559312] x15:  x14: 001c
> >>   [   48.559443] x13: 656c626174756365 x12: 7865206b63617473
> >>   [   48.559625] x11: 0003 x10: 0101010101010101
> >>   [   48.559828] x9 : 818afda8 x8 : 0081
> >>   [   48.559973] x7 : 6174732065687420 x6 : 64657070616d6552
> >>   [   48.560115] x5 : 0e0388bd x4 : 0040135d
> >>   [   48.560270] x3 :  x2 : 0001
> >>   [   48.560412] x1 : 0003 x0 : 004120b8
> >>   Segmentation fault
> >>   #
> >>
> >> So I think we need to keep the unmap hook. Maybe it should be handled by
> >> the special_mapping stuff generically.
> >
> > I'll cook a patch for vm_special_mapping if you don't mind :-)
> 
> That would be great, thanks!

I lost track of this one. Is there a patch kicking around to resolve this,
or is the segfault expected behaviour?

Will


Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-08-27 Thread Michael Ellerman
Dmitry Safonov <0x7f454...@gmail.com> writes:
> Hello,
>
> On Wed, 26 Aug 2020 at 15:39, Michael Ellerman  wrote:
>> Christophe Leroy  writes:
> [..]
>> > arch_remap() gets replaced by vdso_remap()
>> >
>> > For arch_unmap(), I'm wondering how/what other architectures do, because
>> > powerpc seems to be the only one to erase the vdso context pointer when
>> > unmapping the vdso.
>>
>> Yeah. The original unmap/remap stuff was added for CRIU, which I thought
>> people tested on other architectures (more than powerpc even).
>>
>> Possibly no one really cares about vdso unmap though, vs just moving the
>> vdso.
>>
>> We added a test for vdso unmap recently because it happened to trigger a
>> KUAP failure, and someone actually hit it & reported it.
>
> You're right, CRIU cares much more about moving vDSO.
> It's done for each restoree and as on most setups vDSO is premapped and
> used by the application - it's actively tested.
> Speaking about vDSO unmap - that's concerning only for heterogeneous C/R,
> i.e when an application is migrated from a system that uses vDSO to the one
> which doesn't - it's a much rarer scenario.
> (for arm it's !CONFIG_VDSO, for x86 it's `vdso=0` boot parameter)

Ah OK that explains it.

The case we hit of VDSO unmapping was some strange "library OS" thing
which had explicitly unmapped the VDSO, so also very rare.

> Looking at the code, it seems quite easy to provide/maintain .close() for
> vm_special_mapping. A bit harder to add a test from CRIU side
> (as glibc won't know on restore that it can't use vdso anymore),
> but totally not impossible.
>
>> Running that test on arm64 segfaults:
>>
>>   # ./sigreturn_vdso
>>   VDSO is at 0x8191f000-0x8191 (4096 bytes)
>>   Signal delivered OK with VDSO mapped
>>   VDSO moved to 0x8191a000-0x8191afff (4096 bytes)
>>   Signal delivered OK with VDSO moved
>>   Unmapped VDSO
>>   Remapped the stack executable
>>   [   48.556191] potentially unexpected fatal signal 11.
>>   [   48.556752] CPU: 0 PID: 140 Comm: sigreturn_vdso Not tainted 5.9.0-rc2-00057-g2ac69819ba9e #190
>>   [   48.556990] Hardware name: linux,dummy-virt (DT)
>>   [   48.557336] pstate: 60001000 (nZCv daif -PAN -UAO BTYPE=--)
>>   [   48.557475] pc : 8191a7bc
>>   [   48.557603] lr : 8191a7bc
>>   [   48.557697] sp : c13c9e90
>>   [   48.557873] x29: c13cb0e0 x28: 
>>   [   48.558201] x27:  x26: 
>>   [   48.558337] x25:  x24: 
>>   [   48.558754] x23:  x22: 
>>   [   48.558893] x21: 004009b0 x20: 
>>   [   48.559046] x19: 00400ff0 x18: 
>>   [   48.559180] x17: 817da300 x16: 00412010
>>   [   48.559312] x15:  x14: 001c
>>   [   48.559443] x13: 656c626174756365 x12: 7865206b63617473
>>   [   48.559625] x11: 0003 x10: 0101010101010101
>>   [   48.559828] x9 : 818afda8 x8 : 0081
>>   [   48.559973] x7 : 6174732065687420 x6 : 64657070616d6552
>>   [   48.560115] x5 : 0e0388bd x4 : 0040135d
>>   [   48.560270] x3 :  x2 : 0001
>>   [   48.560412] x1 : 0003 x0 : 004120b8
>>   Segmentation fault
>>   #
>>
>> So I think we need to keep the unmap hook. Maybe it should be handled by
>> the special_mapping stuff generically.
>
> I'll cook a patch for vm_special_mapping if you don't mind :-)

That would be great, thanks!

cheers


Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-08-27 Thread Dmitry Safonov
Hello,

On Wed, 26 Aug 2020 at 15:39, Michael Ellerman  wrote:
> Christophe Leroy  writes:
[..]
> > arch_remap() gets replaced by vdso_remap()
> >
> > For arch_unmap(), I'm wondering how/what other architectures do, because
> > powerpc seems to be the only one to erase the vdso context pointer when
> > unmapping the vdso.
>
> Yeah. The original unmap/remap stuff was added for CRIU, which I thought
> people tested on other architectures (more than powerpc even).
>
> Possibly no one really cares about vdso unmap though, vs just moving the
> vdso.
>
> We added a test for vdso unmap recently because it happened to trigger a
> KUAP failure, and someone actually hit it & reported it.

You're right, CRIU cares much more about moving the vDSO.
It's done for each restoree and, as on most setups the vDSO is premapped and
used by the application, it's actively tested.
vDSO unmap only matters for heterogeneous C/R, i.e. when an application is
migrated from a system that uses the vDSO to one that doesn't, which is a
much rarer scenario.
(for arm it's !CONFIG_VDSO, for x86 it's the `vdso=0` boot parameter)

Looking at the code, it seems quite easy to provide/maintain .close() for
vm_special_mapping. It's a bit harder to add a test from the CRIU side
(as glibc won't know on restore that it can't use the vdso anymore),
but not impossible at all.
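Roughly what I have in mind, as a userspace toy model only: the toy_*
structures and the close() signature below are hypothetical and do not
mirror the real vm_special_mapping or vm_area_struct layouts.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; field names are made up for this sketch. */
struct toy_mm {
	const void *vdso;		/* NULL once the vDSO is gone */
};

struct toy_vma;

struct toy_special_mapping {
	const char *name;
	/* Hypothetical hook, called when the mapping is torn down. */
	void (*close)(const struct toy_special_mapping *sm,
		      struct toy_vma *vma);
};

struct toy_vma {
	unsigned long start, end;
	struct toy_mm *mm;
	const struct toy_special_mapping *sm;
};

/* vDSO-specific hook: drop the reference kept in the mm. */
static void toy_vdso_close(const struct toy_special_mapping *sm,
			   struct toy_vma *vma)
{
	(void)sm;
	vma->mm->vdso = NULL;
}

static const struct toy_special_mapping toy_vdso_mapping = {
	.name	= "[vdso]",
	.close	= toy_vdso_close,
};

/* Generic teardown path: no arch-specific knowledge needed here. */
static void toy_unmap_vma(struct toy_vma *vma)
{
	if (vma->sm && vma->sm->close)
		vma->sm->close(vma->sm, vma);
}

static void toy_close_selfcheck(void)
{
	static const char fake_vdso_page[1];
	struct toy_mm mm = { .vdso = fake_vdso_page };
	struct toy_vma vma = {
		.start = 0x1000, .end = 0x2000,
		.mm = &mm, .sm = &toy_vdso_mapping,
	};

	toy_unmap_vma(&vma);
	assert(mm.vdso == NULL);
}
```

The point of doing it this way is that the generic unmap path only has to
call the hook, and each special mapping decides what to clean up.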

> Running that test on arm64 segfaults:
>
>   # ./sigreturn_vdso
>   VDSO is at 0x8191f000-0x8191 (4096 bytes)
>   Signal delivered OK with VDSO mapped
>   VDSO moved to 0x8191a000-0x8191afff (4096 bytes)
>   Signal delivered OK with VDSO moved
>   Unmapped VDSO
>   Remapped the stack executable
>   [   48.556191] potentially unexpected fatal signal 11.
>   [   48.556752] CPU: 0 PID: 140 Comm: sigreturn_vdso Not tainted 5.9.0-rc2-00057-g2ac69819ba9e #190
>   [   48.556990] Hardware name: linux,dummy-virt (DT)
>   [   48.557336] pstate: 60001000 (nZCv daif -PAN -UAO BTYPE=--)
>   [   48.557475] pc : 8191a7bc
>   [   48.557603] lr : 8191a7bc
>   [   48.557697] sp : c13c9e90
>   [   48.557873] x29: c13cb0e0 x28: 
>   [   48.558201] x27:  x26: 
>   [   48.558337] x25:  x24: 
>   [   48.558754] x23:  x22: 
>   [   48.558893] x21: 004009b0 x20: 
>   [   48.559046] x19: 00400ff0 x18: 
>   [   48.559180] x17: 817da300 x16: 00412010
>   [   48.559312] x15:  x14: 001c
>   [   48.559443] x13: 656c626174756365 x12: 7865206b63617473
>   [   48.559625] x11: 0003 x10: 0101010101010101
>   [   48.559828] x9 : 818afda8 x8 : 0081
>   [   48.559973] x7 : 6174732065687420 x6 : 64657070616d6552
>   [   48.560115] x5 : 0e0388bd x4 : 0040135d
>   [   48.560270] x3 :  x2 : 0001
>   [   48.560412] x1 : 0003 x0 : 004120b8
>   Segmentation fault
>   #
>
> So I think we need to keep the unmap hook. Maybe it should be handled by
> the special_mapping stuff generically.

I'll cook a patch for vm_special_mapping if you don't mind :-)

Thanks,
 Dmitry


Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-08-26 Thread Michael Ellerman
Christophe Leroy  writes:
> On 04/08/2020 at 13:17, Christophe Leroy wrote:
>> On 07/16/2020 02:59 AM, Michael Ellerman wrote:
>>> Christophe Leroy writes:
>>>> The VDSO datapage and the text pages are always located immediately
>>>> next to each other, so it can be hardcoded without an indirection
>>>> through __kernel_datapage_offset
>>>>
>>>> In order to ease things, move the data page in front like other
>>>> arches, that way there is no need to know the size of the library
>>>> to locate the data page.

> [...]
>
>>>
>>> I merged this but then realised it breaks the display of the vdso in 
>>> /proc/self/maps.
>>>
>>> ie. the vdso vma gets no name:
>>>
>>>    # cat /proc/self/maps
>
> [...]
>
>>> And it's also going to break the logic in arch_unmap() to detect if
>>> we're unmapping (part of) the VDSO. And it will break arch_remap() too.
>>>
>>> And the logic to recognise the signal trampoline in
>>> arch/powerpc/perf/callchain_*.c as well.
>> 
>> I don't think it breaks that one, because ->vdsobase is still the start 
>> of text.
>> 
>>>
>>> So I'm going to rebase and drop this for now.
>>>
>>> Basically we have a bunch of places that assume that vdso_base is == the
>>> start of the VDSO vma, and also that the code starts there. So that will
>>> need some work to tease out all those assumptions and make them work
>>> with this change.
>> 
>> Ok, one day I need to look at it in more details and see how other 
>> architectures handle it etc ...
>> 
>
> I just sent out a series which switches powerpc to the new 
> _install_special_mapping() API, the one powerpc uses being deprecated 
> since commit a62c34bd2a8a ("x86, mm: Improve _install_special_mapping
> and fix x86 vdso naming")
>
> arch_remap() gets replaced by vdso_remap()
>
> For arch_unmap(), I'm wondering how/what other architectures do, because 
> powerpc seems to be the only one to erase the vdso context pointer when 
> unmapping the vdso.

Yeah. The original unmap/remap stuff was added for CRIU, which I thought
people tested on other architectures (more than powerpc even).

Possibly no one really cares about vdso unmap though, vs just moving the
vdso.

We added a test for vdso unmap recently because it happened to trigger a
KUAP failure, and someone actually hit it & reported it.

Running that test on arm64 segfaults:

  # ./sigreturn_vdso 
  VDSO is at 0x8191f000-0x8191 (4096 bytes)
  Signal delivered OK with VDSO mapped
  VDSO moved to 0x8191a000-0x8191afff (4096 bytes)
  Signal delivered OK with VDSO moved
  Unmapped VDSO
  Remapped the stack executable
  [   48.556191] potentially unexpected fatal signal 11.
  [   48.556752] CPU: 0 PID: 140 Comm: sigreturn_vdso Not tainted 5.9.0-rc2-00057-g2ac69819ba9e #190
  [   48.556990] Hardware name: linux,dummy-virt (DT)
  [   48.557336] pstate: 60001000 (nZCv daif -PAN -UAO BTYPE=--)
  [   48.557475] pc : 8191a7bc
  [   48.557603] lr : 8191a7bc
  [   48.557697] sp : c13c9e90
  [   48.557873] x29: c13cb0e0 x28:  
  [   48.558201] x27:  x26:  
  [   48.558337] x25:  x24:  
  [   48.558754] x23:  x22:  
  [   48.558893] x21: 004009b0 x20:  
  [   48.559046] x19: 00400ff0 x18:  
  [   48.559180] x17: 817da300 x16: 00412010 
  [   48.559312] x15:  x14: 001c 
  [   48.559443] x13: 656c626174756365 x12: 7865206b63617473 
  [   48.559625] x11: 0003 x10: 0101010101010101 
  [   48.559828] x9 : 818afda8 x8 : 0081 
  [   48.559973] x7 : 6174732065687420 x6 : 64657070616d6552 
  [   48.560115] x5 : 0e0388bd x4 : 0040135d 
  [   48.560270] x3 :  x2 : 0001 
  [   48.560412] x1 : 0003 x0 : 004120b8 
  Segmentation fault
  #

So I think we need to keep the unmap hook. Maybe it should be handled by
the special_mapping stuff generically.
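For anyone who wants to see where the vDSO sits in their own process (the
kind of range the test prints in its "VDSO is at" line), here is a minimal
sketch that scans /proc/self/maps for the "[vdso]" entry. The function
name find_vdso_range is made up for this mail, and this is not the
selftest's actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look for the "[vdso]" line in /proc/self/maps.
 * Returns 0 and fills *start/*end on success, -1 if the file cannot be
 * opened or no such line exists (e.g. the vDSO has been unmapped).
 */
static int find_vdso_range(unsigned long *start, unsigned long *end)
{
	char line[512];
	int found = -1;
	FILE *f;

	assert(start && end);

	f = fopen("/proc/self/maps", "r");
	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		if (!strstr(line, "[vdso]"))
			continue;
		/* Each line starts with "<start>-<end>" in hex. */
		if (sscanf(line, "%lx-%lx", start, end) == 2)
			found = 0;
		break;
	}

	fclose(f);
	return found;
}
```

Note that this relies on the vma still being named "[vdso]", so it finds
nothing if that name is missing from the maps output.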

cheers


Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-08-25 Thread Christophe Leroy




On 04/08/2020 at 13:17, Christophe Leroy wrote:
> On 07/16/2020 02:59 AM, Michael Ellerman wrote:
>> Christophe Leroy writes:
>>> The VDSO datapage and the text pages are always located immediately
>>> next to each other, so it can be hardcoded without an indirection
>>> through __kernel_datapage_offset
>>>
>>> In order to ease things, move the data page in front like other
>>> arches, that way there is no need to know the size of the library
>>> to locate the data page.

[...]

>> I merged this but then realised it breaks the display of the vdso in
>> /proc/self/maps.
>>
>> ie. the vdso vma gets no name:
>>
>>    # cat /proc/self/maps

[...]

>> And it's also going to break the logic in arch_unmap() to detect if
>> we're unmapping (part of) the VDSO. And it will break arch_remap() too.
>>
>> And the logic to recognise the signal trampoline in
>> arch/powerpc/perf/callchain_*.c as well.
>
> I don't think it breaks that one, because ->vdsobase is still the start
> of text.
>
>> So I'm going to rebase and drop this for now.
>>
>> Basically we have a bunch of places that assume that vdso_base is == the
>> start of the VDSO vma, and also that the code starts there. So that will
>> need some work to tease out all those assumptions and make them work
>> with this change.
>
> Ok, one day I need to look at it in more detail and see how other
> architectures handle it etc ...
>



I just sent out a series which switches powerpc to the new
_install_special_mapping() API, the one powerpc uses being deprecated
since commit a62c34bd2a8a ("x86, mm: Improve _install_special_mapping
and fix x86 vdso naming")

arch_remap() gets replaced by vdso_remap()

For arch_unmap(), I'm wondering how/what other architectures do, because
powerpc seems to be the only one to erase the vdso context pointer when
unmapping the vdso. So far I have updated it to take the swapped page
order into account.

Everything else is not impacted because our vdso_base is still the base
of the text and that's what those things (signal trampoline, callchain,
...) expect.
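To spell out the address arithmetic, here is a minimal userspace sketch of
the layout with the data page in front of the text. It is only a model:
the toy_* names and the fixed 4K page size are assumptions for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGE_SIZE	4096UL	/* assumption: 4K pages */

/* Toy description of the mapping: one data page, then the text pages. */
struct toy_vdso_layout {
	unsigned long vma_start;	/* start of the mapping = data page */
	unsigned long text_pages;	/* number of text pages after it */
};

/* What vdso_base holds after the change: the start of the text. */
static unsigned long toy_text_base(const struct toy_vdso_layout *l)
{
	return l->vma_start + TOY_PAGE_SIZE;
}

/* The data page sits one page below the text, whatever the text size. */
static unsigned long toy_datapage(unsigned long text_base)
{
	return text_base - TOY_PAGE_SIZE;
}

/* Does [start, end) touch any part of the mapping, data page included? */
static bool toy_range_hits_vdso(const struct toy_vdso_layout *l,
				unsigned long start, unsigned long end)
{
	unsigned long vma_end = toy_text_base(l) +
				l->text_pages * TOY_PAGE_SIZE;

	return start < vma_end && end > l->vma_start;
}

static void toy_layout_selfcheck(void)
{
	struct toy_vdso_layout l = { .vma_start = 0x100000UL, .text_pages = 2 };
	unsigned long base = toy_text_base(&l);

	assert(base == 0x101000UL);
	assert(toy_datapage(base) == l.vma_start);

	/* Unmapping only the data page: "start <= base && base < end"
	 * would miss it, an overlap test on the whole mapping does not. */
	assert(!(0x100000UL <= base && base < 0x101000UL));
	assert(toy_range_hits_vdso(&l, 0x100000UL, 0x101000UL));
}
```

This is the kind of adjustment meant above for arch_unmap(): once
vdso_base no longer equals the start of the vma, a check written purely in
terms of vdso_base has to account for the extra page in front.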


Maybe we should change it to 'void *vdso' in the same way as other
architectures, as it is no longer the exact vdso_base but the start of
the VDSO text.

Note that the series applies on top of the generic C VDSO implementation
series. However, all but the last commit apply cleanly without that
series. As that last commit is just a follow-up cleanup, it can come in
a second step.


Christophe


Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-08-04 Thread Christophe Leroy




On 07/16/2020 02:59 AM, Michael Ellerman wrote:
> Christophe Leroy writes:
>> The VDSO datapage and the text pages are always located immediately
>> next to each other, so it can be hardcoded without an indirection
>> through __kernel_datapage_offset
>>
>> In order to ease things, move the data page in front like other
>> arches, that way there is no need to know the size of the library
>> to locate the data page.
>>
>> Before:
>> clock-getres-realtime-coarse:vdso: 714 nsec/call
>> clock-gettime-realtime-coarse:vdso: 792 nsec/call
>> clock-gettime-realtime:vdso: 1243 nsec/call
>>
>> After:
>> clock-getres-realtime-coarse:vdso: 699 nsec/call
>> clock-gettime-realtime-coarse:vdso: 784 nsec/call
>> clock-gettime-realtime:vdso: 1231 nsec/call
>>
>> Signed-off-by: Christophe Leroy
>> ---
>> v7:
>> - Moved the removal of the tmp param of __get_datapage()
>>   in a subsequent patch
>> - Included the addition of the offset param to __get_datapage()
>>   in the further preparatory patch
>> ---
>>  arch/powerpc/include/asm/vdso_datapage.h |  8 ++--
>>  arch/powerpc/kernel/vdso.c               | 53
>>  arch/powerpc/kernel/vdso32/datapage.S    |  3 --
>>  arch/powerpc/kernel/vdso32/vdso32.lds.S  |  7 +---
>>  arch/powerpc/kernel/vdso64/datapage.S    |  3 --
>>  arch/powerpc/kernel/vdso64/vdso64.lds.S  |  7 +---
>>  6 files changed, 16 insertions(+), 65 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
>> index b9ef6cf50ea5..11886467dfdf 100644
>> --- a/arch/powerpc/include/asm/vdso_datapage.h
>> +++ b/arch/powerpc/include/asm/vdso_datapage.h
>> @@ -118,10 +118,12 @@ extern struct vdso_data *vdso_data;
>>
>>  .macro get_datapage ptr, tmp
>>         bcl     20, 31, .+4
>> +999:
>>         mflr    \ptr
>> -       addi    \ptr, \ptr, (__kernel_datapage_offset - (.-4))@l
>> -       lwz     \tmp, 0(\ptr)
>> -       add     \ptr, \tmp, \ptr
>> +#if CONFIG_PPC_PAGE_SHIFT > 14
>> +       addis   \ptr, \ptr, (_vdso_datapage - 999b)@ha
>> +#endif
>> +       addi    \ptr, \ptr, (_vdso_datapage - 999b)@l
>>  .endm
>>
>>  #endif /* __ASSEMBLY__ */
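A note on the @ha/@l pair used in the macro above: addi takes a signed
16-bit immediate, so a displacement outside -32768..32767 has to be split
into a high-adjusted half (added with addis) and a low half (added with
addi). Presumably that is why the addis is only emitted for page shifts
above 14, where the distance back to the data page may no longer fit in a
single addi. The arithmetic can be checked in plain C; the function names
below are made up for this sketch and the displacements are arbitrary
examples, not values taken from a real build.

```c
#include <assert.h>
#include <stdint.h>

/* @ha: high 16 bits, adjusted for the sign of the low 16 bits. */
static uint16_t ppc_ha(int32_t x)
{
	return (uint16_t)(((uint32_t)x + 0x8000u) >> 16);
}

/* @l: low 16 bits. */
static uint16_t ppc_lo(int32_t x)
{
	return (uint16_t)((uint32_t)x & 0xffffu);
}

/* What "addis rD,rA,ha ; addi rD,rD,lo" computes, modulo 2^32. */
static uint32_t ppc_addis_addi(uint32_t base, uint16_t ha, uint16_t lo)
{
	uint32_t hi_part = (uint32_t)ha << 16;
	/* addi sign-extends its 16-bit immediate. */
	int32_t lo_signed = (int32_t)(lo ^ 0x8000u) - 0x8000;

	return base + hi_part + (uint32_t)lo_signed;
}

/* True if the displacement fits a single addi. */
static int fits_addi(int32_t x)
{
	return x >= -32768 && x <= 32767;
}

static void ppc_ha_lo_selfcheck(void)
{
	int32_t small = -16392;		/* e.g. just over one 16K page back */
	int32_t large = -65636;		/* e.g. just over one 64K page back */

	assert(fits_addi(small));
	assert(!fits_addi(large));
	assert(ppc_addis_addi(0, ppc_ha(large), ppc_lo(large)) ==
	       (uint32_t)large);
	assert(ppc_addis_addi(0, ppc_ha(0x12348765), ppc_lo(0x12348765)) ==
	       0x12348765u);
}
```

The +0x8000 in @ha is what compensates for the sign extension of the low
half, so the pair always adds up to the full displacement.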

>> diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
>> index f38f26e844b6..d33fa22ddbed 100644
>> --- a/arch/powerpc/kernel/vdso.c
>> +++ b/arch/powerpc/kernel/vdso.c
>> @@ -190,7 +190,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
>>          * install_special_mapping or the perf counter mmap tracking code
>>          * will fail to recognise it as a vDSO (since arch_vma_name fails).
>>          */
>> -       current->mm->context.vdso_base = vdso_base;
>> +       current->mm->context.vdso_base = vdso_base + PAGE_SIZE;
>
> I merged this but then realised it breaks the display of the vdso in
> /proc/self/maps.
>
> ie. the vdso vma gets no name:
>
>    # cat /proc/self/maps
>    110f9-110fa r-xp  08:03 17021844 /usr/bin/cat
>    110fa-110fb r--p  08:03 17021844 /usr/bin/cat
>    110fb-110fc rw-p 0001 08:03 17021844 /usr/bin/cat
>    12600-12603 rw-p  00:00 0 [heap]
>    7fffa879-7fffa87d rw-p  00:00 0
>    7fffa87d-7fffa883 r--p  08:03 17521786 /usr/lib/locale/en_AU.utf8/LC_CTYPE
>    7fffa883-7fffa884 r--p  08:03 16958337 /usr/lib/locale/en_AU.utf8/LC_NUMERIC
>    7fffa884-7fffa885 r--p  08:03 8501358 /usr/lib/locale/en_AU.utf8/LC_TIME
>    7fffa885-7fffa8ad r--p  08:03 16870886 /usr/lib/locale/en_AU.utf8/LC_COLLATE
>    7fffa8ad-7fffa8ae r--p  08:03 8509433 /usr/lib/locale/en_AU.utf8/LC_MONETARY
>    7fffa8ae-7fffa8af r--p  08:03 25383753 /usr/lib/locale/en_AU.utf8/LC_MESSAGES/SYS_LC_MESSAGES
>    7fffa8af-7fffa8b0 r--p  08:03 17521790 /usr/lib/locale/en_AU.utf8/LC_PAPER
>    7fffa8b0-7fffa8b1 r--p  08:03 8501354 /usr/lib/locale/en_AU.utf8/LC_NAME
>    7fffa8b1-7fffa8b2 r--p  08:03 8509431 /usr/lib/locale/en_AU.utf8/LC_ADDRESS
>    7fffa8b2-7fffa8b3 r--p  08:03 8509434 /usr/lib/locale/en_AU.utf8/LC_TELEPHONE
>    7fffa8b3-7fffa8b4 r--p  08:03 17521787 /usr/lib/locale/en_AU.utf8/LC_MEASUREMENT
>    7fffa8b4-7fffa8b5 r--s  08:03 25623315 /usr/lib64/gconv/gconv-modules.cache
>    7fffa8b5-7fffa8d4 r-xp  08:03 25383789 /usr/lib64/libc-2.30.so
>    7fffa8d4-7fffa8d5 r--p 001e 08:03 25383789 /usr/lib64/libc-2.30.so
>    7fffa8d5-7fffa8d6 rw-p 001f 08:03 25383789 /usr/lib64/libc-2.30.so
>    7fffa8d6-7fffa8d7 r--p  08:03 8509432 /usr/lib/locale/en_AU.utf8/LC_IDENTIFICATION
>    7fffa8d7-7fffa8d9 r-xp

Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-07-15 Thread Michael Ellerman
Christophe Leroy  writes:
> The VDSO datapage and the text pages are always located immediately
> next to each other, so the offset between them can be hardcoded
> without an indirection through __kernel_datapage_offset.
>
> To simplify this, move the data page in front of the text, as other
> arches do. That way there is no need to know the size of the library
> to locate the data page.
>
> Before:
> clock-getres-realtime-coarse:vdso: 714 nsec/call
> clock-gettime-realtime-coarse:vdso: 792 nsec/call
> clock-gettime-realtime:vdso: 1243 nsec/call
>
> After:
> clock-getres-realtime-coarse:vdso: 699 nsec/call
> clock-gettime-realtime-coarse:vdso: 784 nsec/call
> clock-gettime-realtime:vdso: 1231 nsec/call
>
> Signed-off-by: Christophe Leroy 
> ---
> v7:
> - Moved the removal of the tmp param of __get_datapage()
> to a subsequent patch
> - Included the addition of the offset param to __get_datapage()
> in the further preparatory patch
> ---
>  arch/powerpc/include/asm/vdso_datapage.h |  8 ++--
>  arch/powerpc/kernel/vdso.c   | 53 
>  arch/powerpc/kernel/vdso32/datapage.S|  3 --
>  arch/powerpc/kernel/vdso32/vdso32.lds.S  |  7 +---
>  arch/powerpc/kernel/vdso64/datapage.S|  3 --
>  arch/powerpc/kernel/vdso64/vdso64.lds.S  |  7 +---
>  6 files changed, 16 insertions(+), 65 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
> index b9ef6cf50ea5..11886467dfdf 100644
> --- a/arch/powerpc/include/asm/vdso_datapage.h
> +++ b/arch/powerpc/include/asm/vdso_datapage.h
> @@ -118,10 +118,12 @@ extern struct vdso_data *vdso_data;
>  
>  .macro get_datapage ptr, tmp
>   bcl 20, 31, .+4
> +999:
>   mflr\ptr
> - addi\ptr, \ptr, (__kernel_datapage_offset - (.-4))@l
> - lwz \tmp, 0(\ptr)
> - add \ptr, \tmp, \ptr
> +#if CONFIG_PPC_PAGE_SHIFT > 14
> + addis   \ptr, \ptr, (_vdso_datapage - 999b)@ha
> +#endif
> + addi\ptr, \ptr, (_vdso_datapage - 999b)@l
>  .endm
>  
>  #endif /* __ASSEMBLY__ */
> diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
> index f38f26e844b6..d33fa22ddbed 100644
> --- a/arch/powerpc/kernel/vdso.c
> +++ b/arch/powerpc/kernel/vdso.c
> @@ -190,7 +190,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
>* install_special_mapping or the perf counter mmap tracking code
>* will fail to recognise it as a vDSO (since arch_vma_name fails).
>*/
> - current->mm->context.vdso_base = vdso_base;
> + current->mm->context.vdso_base = vdso_base + PAGE_SIZE;

I merged this but then realised it breaks the display of the vdso in 
/proc/self/maps.

ie. the vdso vma gets no name:

  # cat /proc/self/maps
  110f9-110fa r-xp  08:03 17021844 /usr/bin/cat
  110fa-110fb r--p  08:03 17021844 /usr/bin/cat
  110fb-110fc rw-p 0001 08:03 17021844 /usr/bin/cat
  12600-12603 rw-p  00:00 0 [heap]
  7fffa879-7fffa87d rw-p  00:00 0
  7fffa87d-7fffa883 r--p  08:03 17521786 /usr/lib/locale/en_AU.utf8/LC_CTYPE
  7fffa883-7fffa884 r--p  08:03 16958337 /usr/lib/locale/en_AU.utf8/LC_NUMERIC
  7fffa884-7fffa885 r--p  08:03 8501358 /usr/lib/locale/en_AU.utf8/LC_TIME
  7fffa885-7fffa8ad r--p  08:03 16870886 /usr/lib/locale/en_AU.utf8/LC_COLLATE
  7fffa8ad-7fffa8ae r--p  08:03 8509433 /usr/lib/locale/en_AU.utf8/LC_MONETARY
  7fffa8ae-7fffa8af r--p  08:03 25383753 /usr/lib/locale/en_AU.utf8/LC_MESSAGES/SYS_LC_MESSAGES
  7fffa8af-7fffa8b0 r--p  08:03 17521790 /usr/lib/locale/en_AU.utf8/LC_PAPER
  7fffa8b0-7fffa8b1 r--p  08:03 8501354 /usr/lib/locale/en_AU.utf8/LC_NAME
  7fffa8b1-7fffa8b2 r--p  08:03 8509431 /usr/lib/locale/en_AU.utf8/LC_ADDRESS
  7fffa8b2-7fffa8b3 r--p  08:03 8509434 /usr/lib/locale/en_AU.utf8/LC_TELEPHONE
  7fffa8b3-7fffa8b4 r--p  08:03 17521787 /usr/lib/locale/en_AU.utf8/LC_MEASUREMENT
  7fffa8b4-7fffa8b5 r--s  08:03 25623315 /usr/lib64/gconv/gconv-modules.cache
  7fffa8b5-7fffa8d4 r-xp  08:03 25383789 /usr/lib64/libc-2.30.so
  7fffa8d4-7fffa8d5 r--p 001e 08:03 25383789 /usr/lib64/libc-2.30.so
  7fffa8d5-7fffa8d6 rw-p 001f 08:03 25383789 /usr/lib64/libc-2.30.so
  7fffa8d6-7fffa8d7 r--p  08:03 8509432 /usr/lib/locale/en_AU.utf8/LC_IDENTIFICATION
  7fffa8d7-7fffa8d9 
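
The missing name would be consistent with the comment in the hunk above: if the name lookup matches the start of the vma against context.vdso_base, then storing vdso_base + PAGE_SIZE there means the two no longer compare equal. A minimal user-space sketch of that mismatch follows; the struct and function names are hypothetical stand-ins, not the kernel's definitions, and the page size is an assumed value.

```c
#include <stddef.h>

/* Assumed page size, for the example only. */
#define EX_PAGE_SIZE 0x10000UL

/* Hypothetical stand-ins for mm->context and a vma. */
struct ex_mm {
	unsigned long vdso_base;	/* value recorded by the setup code */
};

struct ex_vma {
	unsigned long vm_start;		/* where the mapping really starts */
	const struct ex_mm *mm;
};

/* Name the vma only when its start matches the recorded vdso base. */
static const char *ex_vma_name(const struct ex_vma *vma)
{
	if (vma->mm && vma->vm_start == vma->mm->vdso_base)
		return "[vdso]";
	return NULL;
}

/* Build a vma at map_start and report whether it gets a name. */
static int ex_is_named(unsigned long map_start, unsigned long recorded_base)
{
	struct ex_mm mm = { .vdso_base = recorded_base };
	struct ex_vma vma = { .vm_start = map_start, .mm = &mm };

	return ex_vma_name(&vma) != NULL;
}
```

With the recorded base equal to the mapping start, ex_is_named() returns 1. With the recorded base one page higher, it returns 0, which would show up as an unnamed entry like the last one above.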

[PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-04-28 Thread Christophe Leroy
The VDSO datapage and the text pages are always located immediately
next to each other, so the offset between them can be hardcoded
without an indirection through __kernel_datapage_offset.

To simplify this, move the data page in front of the text, as other
arches do. That way there is no need to know the size of the library
to locate the data page.
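
As a rough illustration of the two layouts (a user-space sketch, not kernel code; the helper names and the 4K page shift are assumptions made for the example):

```c
/* Assumed page shift for the example; the kernel uses PAGE_SHIFT. */
#define EX_PAGE_SHIFT 12UL
#define EX_PAGE_SIZE  (1UL << EX_PAGE_SHIFT)

/*
 * Old layout: [ text page 0 .. text page N-1 ][ data page ]
 * Locating the data page needs N, the size of the library in pages.
 */
static unsigned long old_layout_datapage(unsigned long map_start,
					 unsigned long text_pages)
{
	return map_start + (text_pages << EX_PAGE_SHIFT);
}

/*
 * New layout: [ data page ][ text page 0 .. text page N-1 ]
 * The text starts one page after the start of the mapping ...
 */
static unsigned long new_layout_text(unsigned long map_start)
{
	return map_start + EX_PAGE_SIZE;
}

/* ... and the data page is always exactly one page below the text. */
static unsigned long new_layout_datapage(unsigned long text_start)
{
	return text_start - EX_PAGE_SIZE;
}
```

For a mapping at 0x100000 with two text pages, the old layout puts the data page at 0x102000. The new layout puts it at 0x100000 whatever the text size is.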

Before:
clock-getres-realtime-coarse:vdso: 714 nsec/call
clock-gettime-realtime-coarse:vdso: 792 nsec/call
clock-gettime-realtime:vdso: 1243 nsec/call

After:
clock-getres-realtime-coarse:vdso: 699 nsec/call
clock-gettime-realtime-coarse:vdso: 784 nsec/call
clock-gettime-realtime:vdso: 1231 nsec/call
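
For reference, the relative gain can be worked out from the figures above. The helper below is a hypothetical illustration only and assumes the "after" value is not larger than the "before" value.

```c
/*
 * Relative gain in basis points (1/100 of a percent), using integer
 * arithmetic. Assumes after_ns <= before_ns and before_ns != 0.
 */
static unsigned int gain_bp(unsigned int before_ns, unsigned int after_ns)
{
	return (before_ns - after_ns) * 10000u / before_ns;
}
```

gain_bp(714, 699) gives 210, gain_bp(792, 784) gives 101 and gain_bp(1243, 1231) gives 96, i.e. roughly 2.1%, 1.0% and 1.0%.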

Signed-off-by: Christophe Leroy 
---
v7:
- Moved the removal of the tmp param of __get_datapage()
to a subsequent patch
- Included the addition of the offset param to __get_datapage()
in the further preparatory patch
---
 arch/powerpc/include/asm/vdso_datapage.h |  8 ++--
 arch/powerpc/kernel/vdso.c   | 53 
 arch/powerpc/kernel/vdso32/datapage.S|  3 --
 arch/powerpc/kernel/vdso32/vdso32.lds.S  |  7 +---
 arch/powerpc/kernel/vdso64/datapage.S|  3 --
 arch/powerpc/kernel/vdso64/vdso64.lds.S  |  7 +---
 6 files changed, 16 insertions(+), 65 deletions(-)

diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
index b9ef6cf50ea5..11886467dfdf 100644
--- a/arch/powerpc/include/asm/vdso_datapage.h
+++ b/arch/powerpc/include/asm/vdso_datapage.h
@@ -118,10 +118,12 @@ extern struct vdso_data *vdso_data;
 
 .macro get_datapage ptr, tmp
bcl 20, 31, .+4
+999:
mflr\ptr
-   addi\ptr, \ptr, (__kernel_datapage_offset - (.-4))@l
-   lwz \tmp, 0(\ptr)
-   add \ptr, \tmp, \ptr
+#if CONFIG_PPC_PAGE_SHIFT > 14
+   addis   \ptr, \ptr, (_vdso_datapage - 999b)@ha
+#endif
+   addi\ptr, \ptr, (_vdso_datapage - 999b)@l
 .endm
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index f38f26e844b6..d33fa22ddbed 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -190,7 +190,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 * install_special_mapping or the perf counter mmap tracking code
 * will fail to recognise it as a vDSO (since arch_vma_name fails).
 */
-   current->mm->context.vdso_base = vdso_base;
+   current->mm->context.vdso_base = vdso_base + PAGE_SIZE;
 
/*
 * our vma flags don't have VM_WRITE so by default, the process isn't
@@ -482,42 +482,6 @@ static __init void vdso_setup_trampolines(struct lib32_elfinfo *v32,
vdso32_rt_sigtramp = find_function32(v32, "__kernel_sigtramp_rt32");
 }
 
-static __init int vdso_fixup_datapage(struct lib32_elfinfo *v32,
-  struct lib64_elfinfo *v64)
-{
-#ifdef CONFIG_VDSO32
-   Elf32_Sym *sym32;
-#endif
-#ifdef CONFIG_PPC64
-   Elf64_Sym *sym64;
-
-   sym64 = find_symbol64(v64, "__kernel_datapage_offset");
-   if (sym64 == NULL) {
-   printk(KERN_ERR "vDSO64: Can't find symbol "
-  "__kernel_datapage_offset !\n");
-   return -1;
-   }
-   *((int *)(vdso64_kbase + sym64->st_value - VDSO64_LBASE)) =
-   (vdso64_pages << PAGE_SHIFT) -
-   (sym64->st_value - VDSO64_LBASE);
-#endif /* CONFIG_PPC64 */
-
-#ifdef CONFIG_VDSO32
-   sym32 = find_symbol32(v32, "__kernel_datapage_offset");
-   if (sym32 == NULL) {
-   printk(KERN_ERR "vDSO32: Can't find symbol "
-  "__kernel_datapage_offset !\n");
-   return -1;
-   }
-   *((int *)(vdso32_kbase + (sym32->st_value - VDSO32_LBASE))) =
-   (vdso32_pages << PAGE_SHIFT) -
-   (sym32->st_value - VDSO32_LBASE);
-#endif
-
-   return 0;
-}
-
-
 static __init int vdso_fixup_features(struct lib32_elfinfo *v32,
  struct lib64_elfinfo *v64)
 {
@@ -618,9 +582,6 @@ static __init int vdso_setup(void)
if (vdso_do_find_sections(&v32, &v64))
return -1;
 
-   if (vdso_fixup_datapage(&v32, &v64))
-   return -1;
-
if (vdso_fixup_features(&v32, &v64))
return -1;
 
@@ -761,26 +722,26 @@ static int __init vdso_init(void)
vdso32_pagelist = kcalloc(vdso32_pages + 2, sizeof(struct page *),
  GFP_KERNEL);
BUG_ON(vdso32_pagelist == NULL);
+   vdso32_pagelist[0] = virt_to_page(vdso_data);
for (i = 0; i < vdso32_pages; i++) {
struct page *pg = virt_to_page(vdso32_kbase + i*PAGE_SIZE);
get_page(pg);
-   vdso32_pagelist[i] = pg;
+   vdso32_pagelist[i + 1] = pg;
}
-   vdso32_pagelist[i++] = virt_to_page(vdso_data);
-   vdso32_pagelist[i] = NULL;
+   vdso32_pagelist[i + 1] = NULL;
 #endif
 
 #ifdef CONFIG_PPC64