Re: [PATCH v10 06/10] livepatch: Add atomic replace

Petr Mladek Thu, 22 Mar 2018 08:44:08 -0700

On Tue 2018-03-20 16:26:19, Josh Poimboeuf wrote:
> On Tue, Mar 20, 2018 at 03:35:01PM +0100, Petr Mladek wrote:
> > On Tue 2018-03-13 17:48:04, Josh Poimboeuf wrote:
> > > On Wed, Mar 07, 2018 at 09:20:35AM +0100, Petr Mladek wrote:
> > > > This patch adds a new "replace" flag to struct klp_patch. When it is
> > > > enabled, a set of 'nop' klp_func will be dynamically created for all
> > > > functions that are already being patched but that will no longer be
> > > > modified by the new patch. They are temporarily used as a new target
> > > > during the patch transition.
> > > > 
> > > > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > > > index fd0296859ff4..ad508a86b2f9 100644
> > > > --- a/kernel/livepatch/core.c
> > > > +++ b/kernel/livepatch/core.c
> > > > +static int klp_add_nops(struct klp_patch *patch)
> > > > +{
> > > > +       struct klp_patch *old_patch;
> > > > +       struct klp_object *old_obj;
> > > > +       int err = 0;
> > > > +
> > > > +       if (WARN_ON(!patch->replace))
> > > > +               return -EINVAL;
> > > 
> > > IMO, this is another one of those overly paranoid warnings that isn't
> > > really needed.  Why would we call klp_add_nops() for a non-replace
> > > patch?
> > 
> > Just to be sure. What is the difference, for example, against the following
> > checks in __klp_enable_patch() from your point of view, please?
> > 
> >     if (klp_transition_patch)
> >             return -EBUSY;
> > 
> >     if (WARN_ON(patch->enabled))
> >             return -EINVAL;
> > 
> > One difference is that klp_enable_patch() is exported symbol. One the
> > other hand, livepatch code developers could do mistakes as well.
> > Adding nops sounds like an innoncent operation after all ;-)
> 
> But klp_enable_patch() being an exported symbol is an important
> difference.  It catches a patch author abusing the interface.  Which is
> much more likely than one of us accidentally calling klp_add_nops().
> Have you not noticed how thorough our code reviews are? ;-)
> 
> Anyway, I suppose it's a harmless check and I don't feel very strongly
> about it, it just seems unnecessary.


I have removed the check.


> > > > diff --git a/kernel/livepatch/transition.c 
> > > > b/kernel/livepatch/transition.c
> > > > index 6917100fbe79..d6af190865d2 100644
> > > > --- a/kernel/livepatch/transition.c
> > > > +++ b/kernel/livepatch/transition.c
> > > > @@ -87,6 +87,36 @@ static void klp_complete_transition(void)
> > > >                  klp_transition_patch->mod->name,
> > > >                  klp_target_state == KLP_PATCHED ? "patching" : 
> > > > "unpatching");
> > > >  
> > > > +       /*
> > > > +        * For replace patches, we disable all previous patches, and 
> > > > replace
> > > > +        * the dynamic no-op functions by removing the ftrace hook.
> > > > +        */
> > > > +       if (klp_transition_patch->replace && klp_target_state == 
> > > > KLP_PATCHED) {
> > > > +               /*
> > > > +                * Make sure that no ftrace handler accesses any older 
> > > > patch
> > > > +                * on the stack.  This might happen when the user 
> > > > forced the
> > > > +                * transaction while some running tasks were still 
> > > > falling
> > > > +                * back to the old code.  There might even still be 
> > > > ftrace
> > > > +                * handlers that have not seen the last patch on the 
> > > > stack yet.
> > > > +                *
> > > > +                * It probably is not necessary because of the rcu-safe 
> > > > access.
> > > > +                * But better be safe than sorry.
> > > > +                */
> > > > +               if (klp_forced)
> > > > +                       klp_synchronize_transition();
> > > 
> > > I don't like this.  Hopefully we can get just rid of it, if we also get
> > > rid of the concept of "throwing away" patches like I proposed.
> > 
> > What exactly you do not like about it, please?
> > 
> > It is not needed if all processes were migrated using the consistency
> > model, definitely.
> > 
> > If the transition has been forced then the barrier should be needed from
> > similar reasons as the barrier after klp_unpatch_objects() below.
> > We basically want to be sure what ftrace handlers see on the stack.
> > 
> > Will it help, when I remove the last paragraph where the formulation
> > is quite uncertain?
> 
> Well, the last paragraph doesn't inspire a lot of confidence ;-)  It
> sounds like voodoo.

Races are never easy area. You are right that I was not completely
confident and wanted to be on the safe side. Your questions helped
me to realize that the synchronization is not neeeded.


> Also the comment just seems very confusing to me:
> 
> - What specifically is it protecting against, e.g., _why_ should no
>   ftrace handler access any old patch on the stack, and when shouldn't
>   it do so?  Is the barrier needed before func->transition is cleared,
>   or what?

I was afraid of invalid memory accessed in klp_ftrace_handler().
I simply underestimated the power of RCU. I am not that familiar
with it. Also the following is a bit non-standard:

        func = list_entry_rcu(func->stack_node.next,
                              struct klp_func, stack_node);

Anyway, you made me to check it. All looks safe after all.

> - Does RCU make it safe, or doesn't it?  If yes, why is this needed?  If
>   no, why not?

Yes. I removed the synchronization. Instead, I explained the situation
in a comment above klp_discard_replaced_patches().


 
> > > > +
> > > > +               klp_throw_away_replaced_patches(klp_transition_patch,
> > > > +                                               klp_forced);
> > > > +
> > > > +               /*
> > > > +                * There is no need to synchronize the transition after 
> > > > removing
> > > > +                * nops. They must be the last on the func_stack. Ftrace
> > > > +                * gurantees that nobody will stay in the trampoline 
> > > > after
> > > 
> > > "guarantees"
> > > 
> > > > +                * the ftrace handler is unregistered.
> > > > +                */
> > > > +               klp_unpatch_objects(klp_transition_patch, KLP_FUNC_NOP);
> > > > +       }
> > > > +
> > > >         if (klp_target_state == KLP_UNPATCHED) {
> > > >                 /*
> > > >                  * All tasks have transitioned to KLP_UNPATCHED so we 
> > > > can now
> > > > @@ -143,6 +173,15 @@ static void klp_complete_transition(void)
> > > >         if (!klp_forced && klp_target_state == KLP_UNPATCHED)
> > > >                 module_put(klp_transition_patch->mod);
> > > >  
> > > > +       /*
> > > > +        * We do not need to wait until the objects are really freed.
> > > > +        * The patch must be on the bottom of the stack. Therefore it
> > > > +        * will never replace anything else. The only important thing
> > > > +        * is that we wait when the patch is being unregistered.
> > > > +        */
> > > > +       if (klp_transition_patch->replace && klp_target_state == 
> > > > KLP_PATCHED)
> > > > +               klp_free_objects(klp_transition_patch, KLP_FUNC_NOP);
> > > > +
> > > 
> > > This makes me a bit nervous.  What happens if the patch is enabled, then
> > > disabled, then enabled again?  Then klp_free_objects() wouldn't do
> > > anything, because the ops would already be freed.
> > 
> > They are not necessary when all replaced patches are removed from
> > the stack. There will be no livepatch if this one gets disabled.
> 
> My point was that if you enable, then disable, then enable again,
> klp_free_objects() will get called again, and it will do nothing the
> second time around.

> Maybe that's safe in this instance, but in general, it's easy to forget
> the re-enable case when adding special cases for 'patch->replace'.

Yes, it is safe.


> I get the feeling that it would be safer to just clear 'patch->replace'
> after this step to avoid such scenarios.  After all, when re-enabling a
> 'replace' patch, it's no longer replacing anything (assuming here that a
> replace patch will permanently disable all previous patches).

I am not completely comfortable with touching item that is set by the
author of the patch. On the other hand, I do not see how it could
harm. I have just added the line to disable the flag.

Best Regards,
Petr

Re: [PATCH v10 06/10] livepatch: Add atomic replace

Reply via email to