Re: [RESEND PATCH v4 10/11] powerpc: Protect patching_mm with a lock

2021-05-07 Thread Peter Zijlstra
On Fri, May 07, 2021 at 03:03:51PM -0500, Christopher M. Riedl wrote:
> On Thu May 6, 2021 at 5:51 AM CDT, Peter Zijlstra wrote:
> > On Wed, May 05, 2021 at 11:34:51PM -0500, Christopher M. Riedl wrote:
> > > Powerpc allows for multiple CPUs to patch concurrently. When patching
> > > with STRICT_KERNEL_RWX a single patching_mm is allocated for use by all
> > > CPUs for the few times that patching occurs. Use a spinlock to protect
> > > the patching_mm from concurrent use.
> > > 
> > > Modify patch_instruction() to acquire the lock, perform the patch op,
> > > and then release the lock.
> > > 
> > > Also introduce {lock,unlock}_patching() along with
> > > patch_instruction_unlocked() to avoid per-iteration lock overhead when
> > > patch_instruction() is called in a loop. A follow-up patch converts some
> > > uses of patch_instruction() to use patch_instruction_unlocked() instead.
> >
> > x86 uses text_mutex for all this, why not do the same?
> 
> I wasn't entirely sure if there is a problem with potentially going to
> sleep in some of the places where patch_instruction() is called - the
> spinlock avoids that (hypothetical) problem.

So I'm not saying you like have to do this; but I did wonder if there's
a reason not to, and given you didn't mention it, I had to ask.

> I just tried switching to text_mutex and at least on a P9 machine the
> series boots w/ the Hash and Radix MMUs (with some lockdep errors). I
> can rework this in the next version to use text_mutex if I don't find
> any new problems with more extensive testing. It does mean more changes
> to use patch_instruction_unlocked() in kprobe/optprobe/ftace in
> arch/powerpc since iirc those are called with text_mutex already held.

The x86 text_poke() has a lockdep_assert_held(_mutex) in to make
sure nobody 'forgets' :-)


Re: [RESEND PATCH v4 10/11] powerpc: Protect patching_mm with a lock

2021-05-07 Thread Christopher M. Riedl
On Thu May 6, 2021 at 5:51 AM CDT, Peter Zijlstra wrote:
> On Wed, May 05, 2021 at 11:34:51PM -0500, Christopher M. Riedl wrote:
> > Powerpc allows for multiple CPUs to patch concurrently. When patching
> > with STRICT_KERNEL_RWX a single patching_mm is allocated for use by all
> > CPUs for the few times that patching occurs. Use a spinlock to protect
> > the patching_mm from concurrent use.
> > 
> > Modify patch_instruction() to acquire the lock, perform the patch op,
> > and then release the lock.
> > 
> > Also introduce {lock,unlock}_patching() along with
> > patch_instruction_unlocked() to avoid per-iteration lock overhead when
> > patch_instruction() is called in a loop. A follow-up patch converts some
> > uses of patch_instruction() to use patch_instruction_unlocked() instead.
>
> x86 uses text_mutex for all this, why not do the same?

I wasn't entirely sure if there is a problem with potentially going to
sleep in some of the places where patch_instruction() is called - the
spinlock avoids that (hypothetical) problem.

I just tried switching to text_mutex and at least on a P9 machine the
series boots w/ the Hash and Radix MMUs (with some lockdep errors). I
can rework this in the next version to use text_mutex if I don't find
any new problems with more extensive testing. It does mean more changes
to use patch_instruction_unlocked() in kprobe/optprobe/ftace in
arch/powerpc since iirc those are called with text_mutex already held.

Thanks!
Chris R.


Re: [RESEND PATCH v4 10/11] powerpc: Protect patching_mm with a lock

2021-05-06 Thread Peter Zijlstra
On Wed, May 05, 2021 at 11:34:51PM -0500, Christopher M. Riedl wrote:
> Powerpc allows for multiple CPUs to patch concurrently. When patching
> with STRICT_KERNEL_RWX a single patching_mm is allocated for use by all
> CPUs for the few times that patching occurs. Use a spinlock to protect
> the patching_mm from concurrent use.
> 
> Modify patch_instruction() to acquire the lock, perform the patch op,
> and then release the lock.
> 
> Also introduce {lock,unlock}_patching() along with
> patch_instruction_unlocked() to avoid per-iteration lock overhead when
> patch_instruction() is called in a loop. A follow-up patch converts some
> uses of patch_instruction() to use patch_instruction_unlocked() instead.

x86 uses text_mutex for all this, why not do the same?