Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-11 Thread Josh Poimboeuf
On Wed, Jan 11, 2017 at 04:18:28PM +0100, Petr Mladek wrote:
> On Tue 2017-01-10 14:46:46, Josh Poimboeuf wrote:
> > On Tue, Jan 10, 2017 at 02:00:58PM +0100, Petr Mladek wrote:
> > > On Thu 2016-12-22 12:31:37, Josh Poimboeuf wrote:
> > > > On Thu, Dec 22, 2016 at 03:34:52PM +0100, Petr Mladek wrote:
> > > > > On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> > > > > > On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > > > > > > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > > > > > > +   read_unlock(&tasklist_lock);
> > > > > > > > +
> > > > > > > > +   /*
> > > > > > > > +* Ditto for the idle "swapper" tasks, though they never cross the
> > > > > > > > +* syscall barrier.  Instead they switch over in cpu_idle_loop().
> > > > > > > > +*/
> > > > > > > > +   get_online_cpus();
> > > > > > > > +   for_each_online_cpu(cpu)
> > > > > > > > +           set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
> > > > > > > > +   put_online_cpus();
> > > > > > > 
> > > > > > > Also this stage needs to be handled somehow by the CPU coming/going
> > > > > > > handlers.
> > > > > > 
> > > > > > Here I think we could automatically switch any offline CPUs' idle 
> > > > > > tasks.
> > > > > > And something similar in klp_try_complete_transition().
> > > > > 
> > > > > We still need to make sure we do not race with the cpu_up()/cpu_down()
> > > > > calls.
> > > > 
> > > > Hm, maybe we'd need to call cpu_hotplug_disable() before switching the
> > > > offline idle tasks?
> > > > 
> > > > > I would use the trick with for_each_possible_cpu() here and leave
> > > > > the migration to the stack check.
> > > > 
> > > > There are a few issues with that:
> > > > 
> > > > 1) The idle task of a missing CPU doesn't *have* a stack, so it doesn't
> > > >make much sense to try to check it.
> > > > 
> > > > 2) We can't rely *only* on the stack check, because not all arches have
> > > >it.  The other way to migrate idle tasks is from the idle loop switch
> > > >point.  But if the task's CPU is down, its idle loop isn't running so
> > > >it can't migrate.
> > > > 
> > > >(Note this is currently a theoretical point: we currently don't allow
> > > >such arches to use the consistency model anyway because there's no
> > > >way for them to migrate kthreads.)
> > > 
> > > Good points. My only concern is that the transaction might take a long
> > > time or even forever. I am not sure it is wise to disable cpu hotplug
> > > for the entire transaction.
> > > 
> > > A compromise might be to disable cpu hotplug only when the task
> > > state is manipulated in a more complex way. Hmm, cpu_hotplug_disable()
> > > looks like a rather costly function. We should not call it in
> > > klp_try_complete_transition(). But we could do:
> > > 
> > >   1. When the patch is being enabled, disable cpu hotplug,
> > >  go through each_possible_cpu and setup the transaction
> > >  only for CPUs that are online. Then we could enable
> > >  the hotplug again.
> > > 
> > >   2. Check only each_online_cpu in klp_try_complete_transition().
> > >  If all tasks are migrated, disable cpu hotplug and re-check
> > >  idle tasks on online CPUs. If any is not migrated, enable
> > >  hotplug and return failure. Otherwise, continue with
> > >  completion of the transaction. [*]
> > > 
> > >   3. In klp_complete_transition, update all tasks including
> > >  the offline CPUs and enable cpu hotplug again.
> > > 
> > > If the re-check in the 2nd step looks ugly, we could add some hotplug
> > > notifiers to make sure that enabled/disabled CPUs are in a reasonable
> > > state. We should still disable the hotplug in the 1st and 3rd steps.
> > > 
> > > BTW: There is a new API for the cpu hotplug callbacks. I was involved
> > > in one conversion. You might take inspiration in
> > > drivers/thermal/intel_powerclamp.c. See cpuhp_setup_state_nocalls()
> > > there.
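
(For reference, the cpuhp_setup_state_nocalls() API mentioned above is used
roughly like this -- just a sketch; the klp_cpu_online()/klp_cpu_offline()
callback names below are made up for illustration, they are not from any
patch:

        /* hypothetical callbacks, only to show the shape of the API */
        static int klp_cpu_online(unsigned int cpu)
        {
                /* e.g. set up idle_task(cpu)->patch_state for a CPU coming up */
                return 0;
        }

        static int klp_cpu_offline(unsigned int cpu)
        {
                /* e.g. force-migrate the idle task of a CPU going down */
                return 0;
        }

        int ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "livepatch:online",
                                            klp_cpu_online, klp_cpu_offline);

The _nocalls() variant only registers the callbacks for future hotplug events;
it does not invoke them for CPUs that are already online.)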
> > 
> > Backing up a bit, although I brought up cpu_hotplug_disable(), I think I
> > misunderstood the race you mentioned.  I actually don't think
> > cpu_hotplug_disable() is necessary.
> 
> Good call on backing up! You made me study the difference. If I get it
> correctly:
> 
>   + cpu_hotplug_disable() works like a writer lock. It gets
> exclusive access via cpu_hotplug_begin(). A side effect
> is that do_cpu_up() and do_cpu_down() do not wait. They
> return -EBUSY if hotplug is disabled.
> 
>   + get_online_cpus() is a kind of reader lock. It makes sure
> that all the hotplug operations are finished and "softly"
> blocks further operations. By "softly" I mean that
> the operations wait for the exclusive (write) access
> in cpu_hotplug_begin().
> 
> IMHO, we really have to use get_online_cpus() and avoid
> the "hard" blocking.
> 
> 
> > What do you think about something like the following:
>  
> > In klp_start_transition:
> > 
> > 

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-11 Thread Petr Mladek
On Tue 2017-01-10 14:46:46, Josh Poimboeuf wrote:
> On Tue, Jan 10, 2017 at 02:00:58PM +0100, Petr Mladek wrote:
> > On Thu 2016-12-22 12:31:37, Josh Poimboeuf wrote:
> > > On Thu, Dec 22, 2016 at 03:34:52PM +0100, Petr Mladek wrote:
> > > > On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> > > > > On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > > > > > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > > > > > + read_unlock(&tasklist_lock);
> > > > > > > +
> > > > > > > + /*
> > > > > > > +  * Ditto for the idle "swapper" tasks, though they never cross the
> > > > > > > +  * syscall barrier.  Instead they switch over in cpu_idle_loop().
> > > > > > > +  */
> > > > > > > + get_online_cpus();
> > > > > > > + for_each_online_cpu(cpu)
> > > > > > > +         set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
> > > > > > > + put_online_cpus();
> > > > > > 
> > > > > > Also this stage needs to be handled somehow by the CPU coming/going
> > > > > > handlers.
> > > > > 
> > > > > Here I think we could automatically switch any offline CPUs' idle 
> > > > > tasks.
> > > > > And something similar in klp_try_complete_transition().
> > > > 
> > > > We still need to make sure we do not race with the cpu_up()/cpu_down()
> > > > calls.
> > > 
> > > Hm, maybe we'd need to call cpu_hotplug_disable() before switching the
> > > offline idle tasks?
> > > 
> > > > I would use the trick with for_each_possible_cpu() here and leave
> > > > the migration to the stack check.
> > > 
> > > There are a few issues with that:
> > > 
> > > 1) The idle task of a missing CPU doesn't *have* a stack, so it doesn't
> > >make much sense to try to check it.
> > > 
> > > 2) We can't rely *only* on the stack check, because not all arches have
> > >it.  The other way to migrate idle tasks is from the idle loop switch
> > >point.  But if the task's CPU is down, its idle loop isn't running so
> > >it can't migrate.
> > > 
> > >(Note this is currently a theoretical point: we currently don't allow
> > >such arches to use the consistency model anyway because there's no
> > >way for them to migrate kthreads.)
> > 
> > Good points. My only concern is that the transaction might take a long
> > time or even forever. I am not sure it is wise to disable cpu hotplug
> > for the entire transaction.
> > 
> > A compromise might be to disable cpu hotplug only when the task
> > state is manipulated in a more complex way. Hmm, cpu_hotplug_disable()
> > looks like a rather costly function. We should not call it in
> > klp_try_complete_transition(). But we could do:
> > 
> >   1. When the patch is being enabled, disable cpu hotplug,
> >  go through each_possible_cpu and setup the transaction
> >  only for CPUs that are online. Then we could enable
> >  the hotplug again.
> > 
> >   2. Check only each_online_cpu in klp_try_complete_transition().
> >  If all tasks are migrated, disable cpu hotplug and re-check
> >  idle tasks on online CPUs. If any is not migrated, enable
> >  hotplug and return failure. Otherwise, continue with
> >  completion of the transaction. [*]
> > 
> >   3. In klp_complete_transition, update all tasks including
> >  the offline CPUs and enable cpu hotplug again.
> > 
> > If the re-check in the 2nd step looks ugly, we could add some hotplug
> > notifiers to make sure that enabled/disabled CPUs are in a reasonable
> > state. We should still disable the hotplug in the 1st and 3rd steps.
> > 
> > BTW: There is a new API for the cpu hotplug callbacks. I was involved
> > in one conversion. You might take inspiration in
> > drivers/thermal/intel_powerclamp.c. See cpuhp_setup_state_nocalls()
> > there.
> 
> Backing up a bit, although I brought up cpu_hotplug_disable(), I think I
> misunderstood the race you mentioned.  I actually don't think
> cpu_hotplug_disable() is necessary.

Good call on backing up! You made me study the difference. If I get it
correctly:

  + cpu_hotplug_disable() works like a writer lock. It gets
exclusive access via cpu_hotplug_begin(). A side effect
is that do_cpu_up() and do_cpu_down() do not wait. They
return -EBUSY if hotplug is disabled.

  + get_online_cpus() is a kind of reader lock. It makes sure
that all the hotplug operations are finished and "softly"
blocks further operations. By "softly" I mean that
the operations wait for the exclusive (write) access
in cpu_hotplug_begin().

IMHO, we really have to use get_online_cpus() and avoid
the "hard" blocking.


> What do you think about something like the following:
 
> In klp_start_transition:
> 
>   get_online_cpus();
>   for_each_possible_cpu(cpu)
>   set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
>   put_online_cpus();
>
> In klp_try_complete_transition:
> 
>   get_online_cpus();
>   for_each_possible_cpu(cpu) {
>   task = idle_task(cpu);
>   if (cpu_online(cpu)) {
>  

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-10 Thread Josh Poimboeuf
On Tue, Jan 10, 2017 at 02:00:58PM +0100, Petr Mladek wrote:
> On Thu 2016-12-22 12:31:37, Josh Poimboeuf wrote:
> > On Thu, Dec 22, 2016 at 03:34:52PM +0100, Petr Mladek wrote:
> > > On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> > > > On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > > > > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > > > > Change livepatch to use a basic per-task consistency model.  This is the
> > > > > > foundation which will eventually enable us to patch those ~10% of
> > > > > > security patches which change function or data semantics.  This is the
> > > > > > biggest remaining piece needed to make livepatch more generally useful.
> > > > > > 
> > > > > > [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> > > > > > 
> > > > > > --- /dev/null
> > > > > > +++ b/kernel/livepatch/transition.c
> > > > > > +   /*
> > > > > > +* Enforce the order of the task->patch_state initializations and the
> > > > > > +* func->transition updates to ensure that, in the enable path,
> > > > > > +* klp_ftrace_handler() doesn't see a func in transition with a
> > > > > > +* task->patch_state of KLP_UNDEFINED.
> > > > > > +*/
> > > > > > +   smp_wmb();
> > > > > > +
> > > > > > +   /*
> > > > > > +* Set the func transition states so klp_ftrace_handler() will know to
> > > > > > +* switch to the transition logic.
> > > > > > +*
> > > > > > +* When patching, the funcs aren't yet in the func_stack and will be
> > > > > > +* made visible to the ftrace handler shortly by the calls to
> > > > > > +* klp_patch_object().
> > > > > > +*
> > > > > > +* When unpatching, the funcs are already in the func_stack and so are
> > > > > > +* already visible to the ftrace handler.
> > > > > > +*/
> > > > > > +   klp_for_each_object(patch, obj)
> > > > > > +           klp_for_each_func(obj, func)
> > > > > > +                   func->transition = true;
> > > > > > +}
> > > > > > +
> > > > > > +/*
> > > > > > + * Start the transition to the specified target patch state so tasks can begin
> > > > > > + * switching to it.
> > > > > > + */
> > > > > > +void klp_start_transition(void)
> > > > > > +{
> > > > > > +   struct task_struct *g, *task;
> > > > > > +   unsigned int cpu;
> > > > > > +
> > > > > > +   WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> > > > > > +
> > > > > > +   pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> > > > > > + klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> > > > > > +
> > > > > > +   /*
> > > > > > +* If the patch can be applied or reverted immediately, skip the
> > > > > > +* per-task transitions.
> > > > > > +*/
> > > > > > +   if (klp_transition_patch->immediate)
> > > > > > +   return;
> > > > > > +
> > > > > > +   /*
> > > > > > +* Mark all normal tasks as needing a patch state update.  As they pass
> > > > > > +* through the syscall barrier they'll switch over to the target state
> > > > > > +* (unless we switch them in klp_try_complete_transition() first).
> > > > > > +*/
> > > > > > +   read_lock(&tasklist_lock);
> > > > > > +   for_each_process_thread(g, task)
> > > > > > +   set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > > > > 
> > > > > This is also called from klp_reverse_transition(). We should set it
> > > > > only when the task needs migration. Also we should clear it when
> > > > > the task is already in the right state.
> > > > > 
> > > > > It is not only optimization. It actually solves a race between
> > > > > klp_complete_transition() and klp_update_patch_state(), see below.
> > > > 
> > > > I agree about the race, but if I did:
> > > > 
> > > > for_each_process_thread(g, task) {
> > > > if (task->patch_state != klp_target_state)
> > > > set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > > > else
> > > > clear_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > > > }
> > > > 
> > > > It would still leave a small window where TIF_PATCH_PENDING gets set for
> > > > an already patched task, if klp_update_patch_state() is running at the
> > > > same time.
> > > 
> > > I see your point. Well, it seems that it is more complicated:
> > > 
> > > The race would be possible only when this was called from
> > > klp_reverse_transition(). But we need to call rcu_synchronize() there
> > > to prevent races with klp_update_patch_state() and also to prevent
> > > premature patch completion.
> > > 
> > > The result is:
> > > 
> > >   if (task->patch_state != klp_target_state) {
> > >   # it means that the task was already migrated before
> > >   # we reverted klp_target_state. It means that
> > >   # the TIF flag was already cleared, the related
> > >   # 

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-10 Thread Petr Mladek
On Thu 2016-12-22 12:31:37, Josh Poimboeuf wrote:
> On Thu, Dec 22, 2016 at 03:34:52PM +0100, Petr Mladek wrote:
> > On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> > > On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > > > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > > > Change livepatch to use a basic per-task consistency model.  This is the
> > > > > foundation which will eventually enable us to patch those ~10% of
> > > > > security patches which change function or data semantics.  This is the
> > > > > biggest remaining piece needed to make livepatch more generally useful.
> > > > > 
> > > > > [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> > > > > 
> > > > > --- /dev/null
> > > > > +++ b/kernel/livepatch/transition.c
> > > > > + /*
> > > > > +  * Enforce the order of the task->patch_state initializations and the
> > > > > +  * func->transition updates to ensure that, in the enable path,
> > > > > +  * klp_ftrace_handler() doesn't see a func in transition with a
> > > > > +  * task->patch_state of KLP_UNDEFINED.
> > > > > +  */
> > > > > + smp_wmb();
> > > > > +
> > > > > + /*
> > > > > +  * Set the func transition states so klp_ftrace_handler() will know to
> > > > > +  * switch to the transition logic.
> > > > > +  *
> > > > > +  * When patching, the funcs aren't yet in the func_stack and will be
> > > > > +  * made visible to the ftrace handler shortly by the calls to
> > > > > +  * klp_patch_object().
> > > > > +  *
> > > > > +  * When unpatching, the funcs are already in the func_stack and so are
> > > > > +  * already visible to the ftrace handler.
> > > > > +  */
> > > > > + klp_for_each_object(patch, obj)
> > > > > +         klp_for_each_func(obj, func)
> > > > > +                 func->transition = true;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * Start the transition to the specified target patch state so tasks can begin
> > > > > + * switching to it.
> > > > > + */
> > > > > +void klp_start_transition(void)
> > > > > +{
> > > > > + struct task_struct *g, *task;
> > > > > + unsigned int cpu;
> > > > > +
> > > > > + WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> > > > > +
> > > > > + pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> > > > > +   klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> > > > > +
> > > > > + /*
> > > > > +  * If the patch can be applied or reverted immediately, skip the
> > > > > +  * per-task transitions.
> > > > > +  */
> > > > > + if (klp_transition_patch->immediate)
> > > > > + return;
> > > > > +
> > > > > + /*
> > > > > +  * Mark all normal tasks as needing a patch state update.  As they pass
> > > > > +  * through the syscall barrier they'll switch over to the target state
> > > > > +  * (unless we switch them in klp_try_complete_transition() first).
> > > > > +  */
> > > > > + read_lock(&tasklist_lock);
> > > > > + for_each_process_thread(g, task)
> > > > > + set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > > > 
> > > > This is also called from klp_reverse_transition(). We should set it
> > > > only when the task needs migration. Also we should clear it when
> > > > the task is already in the right state.
> > > > 
> > > > It is not only optimization. It actually solves a race between
> > > > klp_complete_transition() and klp_update_patch_state(), see below.
> > > 
> > > I agree about the race, but if I did:
> > > 
> > >   for_each_process_thread(g, task) {
> > >   if (task->patch_state != klp_target_state)
> > >   set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > >   else
> > >   clear_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > >   }
> > > 
> > > It would still leave a small window where TIF_PATCH_PENDING gets set for
> > > an already patched task, if klp_update_patch_state() is running at the
> > > same time.
> > 
> > I see your point. Well, it seems that it is more complicated:
> > 
> > The race would be possible only when this was called from
> > klp_reverse_transition(). But we need to call rcu_synchronize() there
> > to prevent races with klp_update_patch_state() and also to prevent
> > premature patch completion.
> > 
> > The result is:
> > 
> > if (task->patch_state != klp_target_state) {
> > # it means that the task was already migrated before
> > # we reverted klp_target_state. It means that
> > # the TIF flag was already cleared, the related
> > # klp_update_patch_state() already finished (thanks
> > # to rcu_synchronize() and new one will be called
> > # only when we set the flag again
> > # => it is safe to set it
> > 
> > # we should also check and warn when 

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-10 Thread Miroslav Benes

> > > --- a/kernel/sched/idle.c
> > > +++ b/kernel/sched/idle.c
> > > @@ -9,6 +9,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include <linux/livepatch.h>
> > >  
> > >  #include 
> > >  
> > > @@ -264,6 +265,9 @@ static void do_idle(void)
> > >  
> > >   sched_ttwu_pending();
> > >   schedule_preempt_disabled();
> > > +
> > > + if (unlikely(klp_patch_pending(current)))
> > > + klp_update_patch_state(current);
> > >  }
> > 
> > I think that (theoretically) this is not sufficient, if we patch a 
> > function present on an idle task's stack and one of the two following 
> > scenarios happens.
> 
> Agreed, though I'd argue that these are rare edge cases which can
> probably be refined later, outside the scope of this patch set.

You're right. They should be really rare and we can solve them (if we even 
want to) later.
 
> > 1. there is nothing to schedule on a cpu and an idle task does not leave a 
> > loop in do_idle() for some time. It may be nonsense practically, and if
> > it is not, we could solve it with schedule_on_each_cpu() on an empty stub
> > somewhere in our code.
> 
> This might only be a theoretical issue, as it only happens when patching
> one of the idle functions themselves.
> 
> If we decided that this were a real world problem, we could use
> something like schedule_on_each_cpu() to flush them out as you
> suggested.  Or it could even be done from user space by briefly running
> a CPU-intensive program on the affected CPUs.

Yes.

> > 2. there is a cpu-bound process running on one of the cpus. No chance of 
> > going to do_idle() there at all and the idle task would block the 
> > patching.
> 
> To clarify I think this would only be an issue when trying to patch idle
> code or schedule()/__schedule().

Yes.

> > We ran into it in kGraft and I tried to solve it with this new 
> > hunk in pick_next_task()...
> > 
> > +   /*
> > +* Patching is in progress, schedule an idle task to migrate it
> > +*/
> > +   if (kgr_in_progress_branch()) {
> > +   if (!test_bit(0, kgr_immutable) &&
> > +   klp_kgraft_task_in_progress(rq->idle)) {
> > +   p = idle_sched_class.pick_next_task(rq, prev);
> > +
> > +   return p;
> > +   }
> > +   }
> > 
> > (kgr_in_progress_branch() is a static key basically. kgr_immutable flag 
> > solves something we don't have a problem with in upstream livepatch thanks 
> > to a combination of task->patch_state and klp_func->transition. 
> > klp_kgraft_task_in_progress() checks the livepatching TIF of a task.)
> > 
> > It is not tested properly and it is a hack as hell so take it as that. 
> > Also note that the problem in kGraft is more serious as we don't have
> > stack checking there. So any livepatch could cause the issue easily.
> > 
> > I can imagine even crazier solutions but nothing nice and pretty (which is 
> > probably impossible because the whole idea to deliberately schedule an 
> > idle task is not nice and pretty).
> 
> Yeah, that looks hairy...
> 
> Since this is such a specialized case (patching the scheduler in an idle
> task while CPU-intensive tasks are running) this might also be more
> reasonably accomplished from user space by briefly SIGSTOPing the CPU
> hog.

Yes, that is true. I can imagine there are users who don't want to stop
the cpu hog at all, not even for a fraction of time. HPC comes to mind. But it
is not worth solving it with something like the code above. Let's add
it to the very end of our TODO lists :).

Miroslav


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-10 Thread Petr Mladek
On Fri 2017-01-06 14:07:34, Josh Poimboeuf wrote:
> On Fri, Dec 23, 2016 at 11:18:03AM +0100, Petr Mladek wrote:
> > On Fri 2016-12-23 10:24:35, Miroslav Benes wrote:
> > > > > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > > > > > index 5efa262..e79ebb5 100644
> > > > > > --- a/kernel/livepatch/patch.c
> > > > > > +++ b/kernel/livepatch/patch.c
> > > > > > @@ -29,6 +29,7 @@
> > > > > >  #include 
> > > > > >  #include 
> > > > > >  #include "patch.h"
> > > > > > +#include "transition.h"
> > > > > >  
> > > > > >  static LIST_HEAD(klp_ops);
> > > > > >  
> > > > > > @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> > > > > >  {
> > > > > > struct klp_ops *ops;
> > > > > > struct klp_func *func;
> > > > > > +   int patch_state;
> > > > > >  
> > > > > > ops = container_of(fops, struct klp_ops, fops);
> > > > > >  
> > > > > > rcu_read_lock();
> > > > > > +
> > > > > > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > > > > >   stack_node);
> > > > > > -   if (WARN_ON_ONCE(!func))
> > > > > > +
> > > > > > +   if (!func)
> > > > > > goto unlock;
> > > > > 
 
> Yeah, I'm thinking we should keep the warning to catch any bugs in case
> any of our ftrace assumptions change.  Maybe I should add a comment:
> 
>   /*
>* func can never be NULL because preemption should be disabled
>* here and unregister_ftrace_function() does the equivalent of
>* a synchronize_sched() before the func_stack removal.
>*/
>   if (WARN_ON_ONCE(!func))
>   goto unlock;

Sounds reasonable to me.

Best Regards,
Petr


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-06 Thread Josh Poimboeuf
On Thu, Jan 05, 2017 at 10:34:57AM +0100, Miroslav Benes wrote:
> 
> > @@ -740,6 +809,14 @@ int klp_register_patch(struct klp_patch *patch)
> > return -ENODEV;
> >  
> > /*
> > +* Architectures without reliable stack traces have to set
> > +* patch->immediate because there's currently no way to patch kthreads
> > +* with the consistency model.
> > +*/
> > +   if (!klp_have_reliable_stack() && !patch->immediate)
> > +   return -ENOSYS;
> > +
> 
> I think an error message (pr_err) would be appropriate here. 
> 
> $ insmod patch_1.ko
> insmod: ERROR: could not insert module patch_1.ko: Function not implemented
> 
> is not helpful much :)

Ok :-)
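
Something like this, I suppose (just a sketch, the exact message is not
settled here):

        if (!klp_have_reliable_stack() && !patch->immediate) {
                pr_err("%s: this architecture doesn't have reliable stack traces, so the livepatch consistency model cannot be used\n",
                       patch->mod->name);
                return -ENOSYS;
        }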

-- 
Josh


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-06 Thread Josh Poimboeuf
On Wed, Jan 04, 2017 at 02:44:47PM +0100, Miroslav Benes wrote:
> On Thu, 8 Dec 2016, Josh Poimboeuf wrote:
> 
> > +void klp_start_transition(void)
> > +{
> > +   struct task_struct *g, *task;
> > +   unsigned int cpu;
> > +
> > +   WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> > +
> > +   pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> > + klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> > +
> > +   /*
> > +* If the patch can be applied or reverted immediately, skip the
> > +* per-task transitions.
> > +*/
> > +   if (klp_transition_patch->immediate)
> > +   return;
> > +
> > +   /*
> > +* Mark all normal tasks as needing a patch state update.  As they pass
> > +* through the syscall barrier they'll switch over to the target state
> > +* (unless we switch them in klp_try_complete_transition() first).
> > +*/
> > +   read_lock(&tasklist_lock);
> > +   for_each_process_thread(g, task)
> > +   set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > +   read_unlock(&tasklist_lock);
> > +
> > +   /*
> > +* Ditto for the idle "swapper" tasks, though they never cross the
> > +* syscall barrier.  Instead they switch over in cpu_idle_loop().
> 
> ...or we switch them in klp_try_complete_transition() first by looking at 
> their stacks, right? I would add it to the comment.

Yeah, I guess the "ditto" was intended to include the "unless we switch
them in klp_try_complete_transition() first" statement from the previous
comment.  I'll try to make it clearer.

> > +*/
> > +   get_online_cpus();
> > +   for_each_online_cpu(cpu)
> > +   set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
> > +   put_online_cpus();
> > +}
> 
> [...]
> 
> > --- a/kernel/sched/idle.c
> > +++ b/kernel/sched/idle.c
> > @@ -9,6 +9,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include <linux/livepatch.h>
> >  
> >  #include 
> >  
> > @@ -264,6 +265,9 @@ static void do_idle(void)
> >  
> > sched_ttwu_pending();
> > schedule_preempt_disabled();
> > +
> > +   if (unlikely(klp_patch_pending(current)))
> > +   klp_update_patch_state(current);
> >  }
> 
> I think that (theoretically) this is not sufficient, if we patch a 
> function present on an idle task's stack and one of the two following 
> scenarios happens.

Agreed, though I'd argue that these are rare edge cases which can
probably be refined later, outside the scope of this patch set.

> 1. there is nothing to schedule on a cpu and an idle task does not leave a 
> loop in do_idle() for some time. It may be nonsense practically, and if
> it is not, we could solve it with schedule_on_each_cpu() on an empty stub
> somewhere in our code.

This might only be a theoretical issue, as it only happens when patching
one of the idle functions themselves.

If we decided that this were a real world problem, we could use
something like schedule_on_each_cpu() to flush them out as you
suggested.  Or it could even be done from user space by briefly running
a CPU-intensive program on the affected CPUs.
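
For completeness, such a flush could be as small as this (a sketch only; the
stub name is made up):

        /* Empty work function: merely running it kicks each CPU out of do_idle(). */
        static void klp_force_idle_switch(struct work_struct *work)
        {
        }

        /* Runs the stub on every online CPU via the workqueue and waits for it. */
        schedule_on_each_cpu(klp_force_idle_switch);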

> 2. there is a cpu-bound process running on one of the cpus. No chance of 
> going to do_idle() there at all and the idle task would block the 
> patching.

To clarify I think this would only be an issue when trying to patch idle
code or schedule()/__schedule().

> We ran into it in kGraft and I tried to solve it with this new 
> hunk in pick_next_task()...
> 
> +   /*
> +* Patching is in progress, schedule an idle task to migrate it
> +*/
> +   if (kgr_in_progress_branch()) {
> +   if (!test_bit(0, kgr_immutable) &&
> +   klp_kgraft_task_in_progress(rq->idle)) {
> +   p = idle_sched_class.pick_next_task(rq, prev);
> +
> +   return p;
> +   }
> +   }
> 
> (kgr_in_progress_branch() is a static key basically. kgr_immutable flag 
> solves something we don't have a problem with in upstream livepatch thanks 
> to a combination of task->patch_state and klp_func->transition. 
> klp_kgraft_task_in_progress() checks the livepatching TIF of a task.)
> 
> It is not tested properly and it is a hack as hell so take it as that. 
> Also note that the problem in kGraft is more serious as we don't have
> stack checking there. So any livepatch could cause the issue easily.
> 
> I can imagine even crazier solutions but nothing nice and pretty (which is 
> probably impossible because the whole idea to deliberately schedule an 
> idle task is not nice and pretty).

Yeah, that looks hairy...

Since this is such a specialized case (patching the scheduler in an idle
task while CPU-intensive tasks are running) this might also be more
reasonably accomplished from user space by briefly SIGSTOPing the CPU
hog.

> Otherwise the patch looks good to me. I don't understand how Petr found 
> those races there.

Agreed, kudos to Petr :-)

-- 
Josh


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-06 Thread Josh Poimboeuf
On Fri, Dec 23, 2016 at 11:18:03AM +0100, Petr Mladek wrote:
> On Fri 2016-12-23 10:24:35, Miroslav Benes wrote:
> > > > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > > > > index 5efa262..e79ebb5 100644
> > > > > --- a/kernel/livepatch/patch.c
> > > > > +++ b/kernel/livepatch/patch.c
> > > > > @@ -29,6 +29,7 @@
> > > > >  #include 
> > > > >  #include 
> > > > >  #include "patch.h"
> > > > > +#include "transition.h"
> > > > >  
> > > > >  static LIST_HEAD(klp_ops);
> > > > >  
> > > > > @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> > > > >  {
> > > > >   struct klp_ops *ops;
> > > > >   struct klp_func *func;
> > > > > + int patch_state;
> > > > >  
> > > > >   ops = container_of(fops, struct klp_ops, fops);
> > > > >  
> > > > >   rcu_read_lock();
> > > > > +
> > > > >   func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > > > > stack_node);
> > > > > - if (WARN_ON_ONCE(!func))
> > > > > +
> > > > > + if (!func)
> > > > >   goto unlock;
> > > > 
> > > > Why did you remove the WARN_ON_ONCE(), please?
> > > > 
> > > > We still add the function on the stack before registering
> > > > the ftrace handler. Also we unregister the ftrace handler
> > > > before removing the last entry from the stack.
> > > > 
> > > > AFAIK, unregister_ftrace_function() calls rcu_synchronize()
> > > > to make sure that no one is inside the handler once it has finished.
> > > > Mirek knows more about it.
> > > 
> > > Hm, this is news to me.  Mirek, please share :-)
> > 
> > Well, I think the whole thing is well described in emails I exchanged with
> > Steven a few months ago. See [1].
> > 
> > [1] 
> > http://lkml.kernel.org/r/alpine.lnx.2.00.1608081041060.10...@pobox.suse.cz
> >  
> > > > If this is not true, we have a problem. For example,
> > > > we call kfree(ops) after unregister_ftrace_function();
> > > 
> > > Agreed.
> > 
> > TL;DR - we should be ok as long as we do not do crazy things in the 
> > handler, deliberate sleeping for example.
> > 
> > WARN_ON_ONCE() may be crazy too. I think we discussed it long ago and we 
> > came to an agreement to remove it.
> 
> There are definitely situations where this might hurt. For example,
> when we redirect a function called under logbuf_lock.
> 
> On the other hand, there is a work in progress[1][2] that will mitigate
> this risk a lot. Also this warning would be printed only when
> something goes wrong. IMHO, it is worth the risk. It will succeed
> in 99,999% cases and it might save us some headache when debugging
> random crashes of the system.
> 
> Anyway, if there is a reason to remove the warning, it should be
> described. And if it is not strictly related to this patch, it should
> be handled separately.
> 
> [1] 
> https://lkml.kernel.org/r/20161221143605.2272-1-sergey.senozhat...@gmail.com
> [2] 
> https://lkml.kernel.org/r/1461333180-2897-1-git-send-email-sergey.senozhat...@gmail.com

Yeah, I'm thinking we should keep the warning to catch any bugs in case
any of our ftrace assumptions change.  Maybe I should add a comment:

/*
 * func can never be NULL because preemption should be disabled
 * here and unregister_ftrace_function() does the equivalent of
 * a synchronize_sched() before the func_stack removal.
 */
if (WARN_ON_ONCE(!func))
goto unlock;

-- 
Josh


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-05 Thread Miroslav Benes

> @@ -740,6 +809,14 @@ int klp_register_patch(struct klp_patch *patch)
>   return -ENODEV;
>  
>   /*
> +  * Architectures without reliable stack traces have to set
> +  * patch->immediate because there's currently no way to patch kthreads
> +  * with the consistency model.
> +  */
> + if (!klp_have_reliable_stack() && !patch->immediate)
> + return -ENOSYS;
> +

I think an error message (pr_err) would be appropriate here. 

$ insmod patch_1.ko
insmod: ERROR: could not insert module patch_1.ko: Function not implemented

is not helpful much :)

Miroslav


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-04 Thread Miroslav Benes
On Thu, 8 Dec 2016, Josh Poimboeuf wrote:

> +void klp_start_transition(void)
> +{
> + struct task_struct *g, *task;
> + unsigned int cpu;
> +
> + WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> +
> + pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> +   klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> +
> + /*
> +  * If the patch can be applied or reverted immediately, skip the
> +  * per-task transitions.
> +  */
> + if (klp_transition_patch->immediate)
> + return;
> +
> + /*
> +  * Mark all normal tasks as needing a patch state update.  As they pass
> +  * through the syscall barrier they'll switch over to the target state
> +  * (unless we switch them in klp_try_complete_transition() first).
> +  */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, task)
> + set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> + read_unlock(&tasklist_lock);
> +
> + /*
> +  * Ditto for the idle "swapper" tasks, though they never cross the
> +  * syscall barrier.  Instead they switch over in cpu_idle_loop().

...or we switch them in klp_try_complete_transition() first by looking at 
their stacks, right? I would add it to the comment.

> +  */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
> + put_online_cpus();
> +}

[...]

> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include <linux/livepatch.h>
>  
>  #include 
>  
> @@ -264,6 +265,9 @@ static void do_idle(void)
>  
>   sched_ttwu_pending();
>   schedule_preempt_disabled();
> +
> + if (unlikely(klp_patch_pending(current)))
> + klp_update_patch_state(current);
>  }

I think that (theoretically) this is not sufficient, if we patch a 
function present on an idle task's stack and one of the two following 
scenarios happens.

1. there is nothing to schedule on a cpu and an idle task does not leave a 
loop in do_idle() for some time. It may be nonsense practically, and if
it is not, we could solve it with schedule_on_each_cpu() on an empty stub
somewhere in our code.

2. there is a cpu-bound process running on one of the cpus. No chance of 
going to do_idle() there at all and the idle task would block the 
patching. We ran into it in kGraft and I tried to solve it with this new 
hunk in pick_next_task()...

+   /*
+* Patching is in progress, schedule an idle task to migrate it
+*/
+   if (kgr_in_progress_branch()) {
+   if (!test_bit(0, kgr_immutable) &&
+   klp_kgraft_task_in_progress(rq->idle)) {
+   p = idle_sched_class.pick_next_task(rq, prev);
+
+   return p;
+   }
+   }

(kgr_in_progress_branch() is a static key basically. kgr_immutable flag 
solves something we don't have a problem with in upstream livepatch thanks 
to a combination of task->patch_state and klp_func->transition. 
klp_kgraft_task_in_progress() checks the livepatching TIF of a task.)

It is not tested properly and it is a hack as hell so take it as that. 
Also note that the problem in kGraft is more serious as we don't have
stack checking there. So any livepatch could cause the issue easily.

I can imagine even crazier solutions but nothing nice and pretty (which is 
probably impossible because the whole idea to deliberately schedule an 
idle task is not nice and pretty).

Otherwise the patch looks good to me. I don't understand how Petr found 
those races there.

Regards,
Miroslav


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-23 Thread Petr Mladek
On Fri 2016-12-23 10:24:35, Miroslav Benes wrote:
> > > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > > > index 5efa262..e79ebb5 100644
> > > > --- a/kernel/livepatch/patch.c
> > > > +++ b/kernel/livepatch/patch.c
> > > > @@ -29,6 +29,7 @@
> > > >  #include 
> > > >  #include 
> > > >  #include "patch.h"
> > > > +#include "transition.h"
> > > >  
> > > >  static LIST_HEAD(klp_ops);
> > > >  
> > > > @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> > > >  {
> > > > struct klp_ops *ops;
> > > > struct klp_func *func;
> > > > +   int patch_state;
> > > >  
> > > > ops = container_of(fops, struct klp_ops, fops);
> > > >  
> > > > rcu_read_lock();
> > > > +
> > > > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > > >   stack_node);
> > > > -   if (WARN_ON_ONCE(!func))
> > > > +
> > > > +   if (!func)
> > > > goto unlock;
> > > 
> > > Why did you remove the WARN_ON_ONCE(), please?
> > > 
> > > We still add the function on the stack before registering
> > > the ftrace handler. Also we unregister the ftrace handler
> > > before removing the last entry from the stack.
> > > 
> > > AFAIK, unregister_ftrace_function() calls rcu_synchronize()
> > > to make sure that no one is inside the handler once it has finished.
> > > Mirek knows more about it.
> > 
> > Hm, this is news to me.  Mirek, please share :-)
> 
> Well, I think the whole thing is well described in emails I exchanged with
> Steven a few months ago. See [1].
> 
> [1] http://lkml.kernel.org/r/alpine.lnx.2.00.1608081041060.10...@pobox.suse.cz
>  
> > > If this is not true, we have a problem. For example,
> > > we call kfree(ops) after unregister_ftrace_function();
> > 
> > Agreed.
> 
> TL;DR - we should be ok as long as we do not do crazy things in the 
> handler, deliberate sleeping for example.
> 
> WARN_ON_ONCE() may be crazy too. I think we discussed it long ago and we 
> came to an agreement to remove it.

There are definitely situations where this might hurt. For example,
when we redirect a function called under logbuf_lock.

On the other hand, there is a work in progress[1][2] that will mitigate
this risk a lot. Also this warning would be printed only when
something goes wrong. IMHO, it is worth the risk. It will succeed
in 99.999% of cases and it might save us some headache when debugging
random crashes of the system.

Anyway, if there is a reason to remove the warning, it should be
described. And if it is not strictly related to this patch, it should
be handled separately.

[1] https://lkml.kernel.org/r/20161221143605.2272-1-sergey.senozhat...@gmail.com
[2] 
https://lkml.kernel.org/r/1461333180-2897-1-git-send-email-sergey.senozhat...@gmail.com

Best Regards,
Petr


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-23 Thread Miroslav Benes
> > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > > index 5efa262..e79ebb5 100644
> > > --- a/kernel/livepatch/patch.c
> > > +++ b/kernel/livepatch/patch.c
> > > @@ -29,6 +29,7 @@
> > >  #include 
> > >  #include 
> > >  #include "patch.h"
> > > +#include "transition.h"
> > >  
> > >  static LIST_HEAD(klp_ops);
> > >  
> > > @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> > >  {
> > >   struct klp_ops *ops;
> > >   struct klp_func *func;
> > > + int patch_state;
> > >  
> > >   ops = container_of(fops, struct klp_ops, fops);
> > >  
> > >   rcu_read_lock();
> > > +
> > >   func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > > stack_node);
> > > - if (WARN_ON_ONCE(!func))
> > > +
> > > + if (!func)
> > >   goto unlock;
> > 
> > Why did you remove the WARN_ON_ONCE(), please?
> > 
> > We still add the function on the stack before registering
> > the ftrace handler. Also we unregister the ftrace handler
> > before removing the last entry from the stack.
> > 
> > AFAIK, unregister_ftrace_function() calls rcu_synchronize()
> > to make sure that no one is inside the handler once it has finished.
> > Mirek knows more about it.
> 
> Hm, this is news to me.  Mirek, please share :-)

Well, I think the whole thing is well described in emails I exchanged with
Steven a few months ago. See [1].

[1] http://lkml.kernel.org/r/alpine.lnx.2.00.1608081041060.10...@pobox.suse.cz
 
> > If this is not true, we have a problem. For example,
> > we call kfree(ops) after unregister_ftrace_function();
> 
> Agreed.

TL;DR - we should be ok as long as we do not do crazy things in the 
handler, deliberate sleeping for example.

WARN_ON_ONCE() may be crazy too. I think we discussed it long ago and we 
came to an agreement to remove it.

Miroslav, very slowly going through the patch set


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-22 Thread Josh Poimboeuf
On Thu, Dec 22, 2016 at 03:34:52PM +0100, Petr Mladek wrote:
> On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> > On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > > Change livepatch to use a basic per-task consistency model.  This is the
> > > > foundation which will eventually enable us to patch those ~10% of
> > > > security patches which change function or data semantics.  This is the
> > > > biggest remaining piece needed to make livepatch more generally useful.
> > > > 
> > > > [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> > > > 
> > > > --- /dev/null
> > > > +++ b/kernel/livepatch/transition.c
> > > > +/*
> > > > + * Initialize the global target patch state and all tasks to the initial patch
> > > > + * state, and initialize all function transition states to true in preparation
> > > > + * for patching or unpatching.
> > > > + */
> > > > +void klp_init_transition(struct klp_patch *patch, int state)
> > > > +{
> > > > +   struct task_struct *g, *task;
> > > > +   unsigned int cpu;
> > > > +   struct klp_object *obj;
> > > > +   struct klp_func *func;
> > > > +   int initial_state = !state;
> > > > +
> > > > +   WARN_ON_ONCE(klp_target_state != KLP_UNDEFINED);
> > > > +
> > > > +   klp_transition_patch = patch;
> > > > +
> > > > +   /*
> > > > +* Set the global target patch state which tasks will switch to.  This
> > > > +* has no effect until the TIF_PATCH_PENDING flags get set later.
> > > > +*/
> > > > +   klp_target_state = state;
> > > > +
> > > > +   /*
> > > > +* If the patch can be applied or reverted immediately, skip the
> > > > +* per-task transitions.
> > > > +*/
> > > > +   if (patch->immediate)
> > > > +   return;
> > > > +
> > > > +   /*
> > > > +* Initialize all tasks to the initial patch state to prepare them for
> > > > +* switching to the target state.
> > > > +*/
> > > > +   read_lock(&tasklist_lock);
> > > > +   for_each_process_thread(g, task) {
> > > > +           WARN_ON_ONCE(task->patch_state != KLP_UNDEFINED);
> > > > +           task->patch_state = initial_state;
> > > > +   }
> > > > +   read_unlock(&tasklist_lock);
> > > > +
> > > > +   /*
> > > > +* Ditto for the idle "swapper" tasks.
> > > > +*/
> > > > +   get_online_cpus();
> > > > +   for_each_online_cpu(cpu) {
> > > > +   task = idle_task(cpu);
> > > > +   WARN_ON_ONCE(task->patch_state != KLP_UNDEFINED);
> > > > +   task->patch_state = initial_state;
> > > > +   }
> > > > +   put_online_cpus();
> > > 
> > > We allow adding/removing CPUs here. I am afraid that we will also need
> > > to add a cpu coming/going handler that will set the task->patch_state
> > > the right way. We must not set the klp_target_state until all ftrace
> > > handlers are ready.
> > 
> > What if we instead just change the above to use for_each_possible_cpu()?
> > We could do the same in klp_complete_transition().
> 
> I like this idea. It seems that there is an idle task for each possible
> cpu, see idle_threads_init().
> 
> IMHO, we should do the same every time we do anything with the idle
> tasks. I mean in klp_start_transition, klp_try_complete_transition()
> and also complete_transition().
> 
> Then they will be handled like any other processes and we do not need
> to think of any special races.

More on this below.

> > > > +   /*
> > > > +* Enforce the order of the task->patch_state initializations and the
> > > > +* func->transition updates to ensure that, in the enable path,
> > > > +* klp_ftrace_handler() doesn't see a func in transition with a
> > > > +* task->patch_state of KLP_UNDEFINED.
> > > > +*/
> > > > +   smp_wmb();
> > > > +
> > > > +   /*
> > > > +* Set the func transition states so klp_ftrace_handler() will know to
> > > > +* switch to the transition logic.
> > > > +*
> > > > +* When patching, the funcs aren't yet in the func_stack and will be
> > > > +* made visible to the ftrace handler shortly by the calls to
> > > > +* klp_patch_object().
> > > > +*
> > > > +* When unpatching, the funcs are already in the func_stack and so are
> > > > +* already visible to the ftrace handler.
> > > > +*/
> > > > +   klp_for_each_object(patch, obj)
> > > > +           klp_for_each_func(obj, func)
> > > > +                   func->transition = true;
> > > > +}
> > > > +
> > > > +/*
> > > > + * Start the transition to the specified target patch state so tasks can begin
> > > > + * switching to it.
> > > > + */
> > > > +void klp_start_transition(void)
> > > > +{
> > > > +   struct task_struct *g, *task;
> > > > +  

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-22 Thread Petr Mladek
On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > Change livepatch to use a basic per-task consistency model.  This is the
> > > foundation which will eventually enable us to patch those ~10% of
> > > security patches which change function or data semantics.  This is the
> > > biggest remaining piece needed to make livepatch more generally useful.
> > > 
> > > [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> > > 
> > > --- /dev/null
> > > +++ b/kernel/livepatch/transition.c
> > > +/*
> > > + * Initialize the global target patch state and all tasks to the initial patch
> > > + * state, and initialize all function transition states to true in preparation
> > > + * for patching or unpatching.
> > > + */
> > > +void klp_init_transition(struct klp_patch *patch, int state)
> > > +{
> > > + struct task_struct *g, *task;
> > > + unsigned int cpu;
> > > + struct klp_object *obj;
> > > + struct klp_func *func;
> > > + int initial_state = !state;
> > > +
> > > + WARN_ON_ONCE(klp_target_state != KLP_UNDEFINED);
> > > +
> > > + klp_transition_patch = patch;
> > > +
> > > + /*
> > > +  * Set the global target patch state which tasks will switch to.  This
> > > +  * has no effect until the TIF_PATCH_PENDING flags get set later.
> > > +  */
> > > + klp_target_state = state;
> > > +
> > > + /*
> > > +  * If the patch can be applied or reverted immediately, skip the
> > > +  * per-task transitions.
> > > +  */
> > > + if (patch->immediate)
> > > + return;
> > > +
> > > + /*
> > > +  * Initialize all tasks to the initial patch state to prepare them for
> > > +  * switching to the target state.
> > > +  */
> > > + read_lock(&tasklist_lock);
> > > + for_each_process_thread(g, task) {
> > > + WARN_ON_ONCE(task->patch_state != KLP_UNDEFINED);
> > > + task->patch_state = initial_state;
> > > + }
> > > + read_unlock(&tasklist_lock);
> > > +
> > > + /*
> > > +  * Ditto for the idle "swapper" tasks.
> > > +  */
> > > + get_online_cpus();
> > > + for_each_online_cpu(cpu) {
> > > + task = idle_task(cpu);
> > > + WARN_ON_ONCE(task->patch_state != KLP_UNDEFINED);
> > > + task->patch_state = initial_state;
> > > + }
> > > + put_online_cpus();
> > 
> > We allow adding/removing CPUs here. I am afraid that we will also need
> > to add a cpu coming/going handler that will set the task->patch_state
> > the right way. We must not set the klp_target_state until all ftrace
> > handlers are ready.
> 
> What if we instead just change the above to use for_each_possible_cpu()?
> We could do the same in klp_complete_transition().

I like this idea. It seems that there is an idle task for each possible
cpu, see idle_threads_init().

IMHO, we should do the same every time we do anything with the idle
tasks. I mean in klp_start_transition, klp_try_complete_transition()
and also complete_transition().

Then they will be handled like any other processes and we do not need
to think of any special races.
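
I.e., the hunk above would end up looking something like this (only
illustrating the idea, not the final code):

        /*
         * Ditto for the idle "swapper" tasks.  Iterate over all possible CPUs
         * so that the idle tasks of offline CPUs get initialized too.
         */
        get_online_cpus();
        for_each_possible_cpu(cpu) {
                task = idle_task(cpu);
                WARN_ON_ONCE(task->patch_state != KLP_UNDEFINED);
                task->patch_state = initial_state;
        }
        put_online_cpus();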


> > > + /*
> > > +  * Enforce the order of the task->patch_state initializations and the
> > > +  * func->transition updates to ensure that, in the enable path,
> > > +  * klp_ftrace_handler() doesn't see a func in transition with a
> > > +  * task->patch_state of KLP_UNDEFINED.
> > > +  */
> > > + smp_wmb();
> > > +
> > > + /*
> > > +  * Set the func transition states so klp_ftrace_handler() will know to
> > > +  * switch to the transition logic.
> > > +  *
> > > +  * When patching, the funcs aren't yet in the func_stack and will be
> > > +  * made visible to the ftrace handler shortly by the calls to
> > > +  * klp_patch_object().
> > > +  *
> > > +  * When unpatching, the funcs are already in the func_stack and so are
> > > +  * already visible to the ftrace handler.
> > > +  */
> > > + klp_for_each_object(patch, obj)
> > > + klp_for_each_func(obj, func)
> > > + func->transition = true;
> > > +}
> > > +
> > > +/*
> > > + * Start the transition to the specified target patch state so tasks can begin
> > > + * switching to it.
> > > + */
> > > +void klp_start_transition(void)
> > > +{
> > > + struct task_struct *g, *task;
> > > + unsigned int cpu;
> > > +
> > > + WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> > > +
> > > + pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> > > +   klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> > > +
> > > + /*
> > > +  * If the patch can be applied or reverted immediately, skip the
> > > +  * per-task transitions.
> > > +  */
> > > + if (klp_transition_patch->immediate)
> > > + return;
> > > +
> > > + /*
> > > +  * Mark all normal tasks as needing a patch state update.  As they pass
> > > +  * through the syscall barrier they'll switch over to the target state
> > > +  * (unless we switch them in klp_try_complete_transition() first).
> > > +  */
> > > + read_lock(&tasklist_lock);

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-21 Thread Josh Poimboeuf
On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > Change livepatch to use a basic per-task consistency model.  This is the
> > foundation which will eventually enable us to patch those ~10% of
> > security patches which change function or data semantics.  This is the
> > biggest remaining piece needed to make livepatch more generally useful.
> > 
> > [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> > 
> > Signed-off-by: Josh Poimboeuf 
> > ---
> > diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt
> > index 6c43f6e..f87e742 100644
> > --- a/Documentation/livepatch/livepatch.txt
> > +++ b/Documentation/livepatch/livepatch.txt
> 
> I like the description.
> 
> Just a note that we will also need to review the section about
> limitations. But I am not sure that we want to do it in this patch.
> It might open a long discussion on its own.
> 
> > diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> > index 1a5a93c..8e06fe5 100644
> > --- a/include/linux/livepatch.h
> > +++ b/include/linux/livepatch.h
> > @@ -28,18 +28,40 @@
> >  
> >  #include 
> >  
> > +/* task patch states */
> > +#define KLP_UNDEFINED  -1
> > +#define KLP_UNPATCHED   0
> > +#define KLP_PATCHED 1
> > +
> >  /**
> >   * struct klp_func - function structure for live patching
> >   * @old_name:  name of the function to be patched
> >   * @new_func:  pointer to the patched function code
> >   * @old_sympos: a hint indicating which symbol position the old function
> >   * can be found (optional)
> > + * @immediate:  patch the func immediately, bypassing backtrace safety checks
> 
> There are more checks possible. I would use the same description
> as for klp_object.

Agreed.

> >   * @old_addr:  the address of the function being patched
> >   * @kobj:  kobject for sysfs resources
> >   * @stack_node:list node for klp_ops func_stack list
> >   * @old_size:  size of the old function
> >   * @new_size:  size of the new function
> >   * @patched:   the func has been added to the klp_ops list
> > + * @transition:the func is currently being applied or reverted
> > + *
> > @@ -86,6 +110,7 @@ struct klp_object {
> >   * struct klp_patch - patch structure for live patching
> >   * @mod:   reference to the live patch module
> >   * @objs:  object entries for kernel objects to be patched
> > + * @immediate:  patch all funcs immediately, bypassing safety mechanisms
> >   * @list:  list node for global list of registered patches
> >   * @kobj:  kobject for sysfs resources
> >   * @enabled:   the patch is enabled (but operation may be incomplete)
> 
> [...]
> 
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index fc160c6..22c0c01 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -424,7 +477,10 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> > goto err;
> > }
> >  
> > -   if (enabled) {
> > +   if (patch == klp_transition_patch) {
> > +   klp_reverse_transition();
> > +   mod_delayed_work(system_wq, &klp_transition_work, 0);
> 
> I would put this mod_delayed_work() into klp_reverse_transition().
> Also I would put that schedule_delayed_work() into
> klp_try_complete_transition().
> 
> If I did not miss anything, it will allow moving the
> klp_transition_work code to transition.c, where it logically
> belongs.

Makes sense, I'll see if I can move all the klp_transition_work code to
transition.c.

> > +   } else if (enabled) {
> > ret = __klp_enable_patch(patch);
> > if (ret)
> > goto err;
> 
> [...]
> 
> > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > index 5efa262..e79ebb5 100644
> > --- a/kernel/livepatch/patch.c
> > +++ b/kernel/livepatch/patch.c
> > @@ -29,6 +29,7 @@
> >  #include 
> >  #include 
> >  #include "patch.h"
> > +#include "transition.h"
> >  
> >  static LIST_HEAD(klp_ops);
> >  
> > @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> >  {
> > struct klp_ops *ops;
> > struct klp_func *func;
> > +   int patch_state;
> >  
> > ops = container_of(fops, struct klp_ops, fops);
> >  
> > rcu_read_lock();
> > +
> > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> >   stack_node);
> > -   if (WARN_ON_ONCE(!func))
> > +
> > +   if (!func)
> > goto unlock;
> 
> Why did you remove the WARN_ON_ONCE(), please?
> 
> We still add the function to the stack before registering
> the ftrace handler. Also, we unregister the ftrace handler
> before removing the last entry from the stack.
> 
> AFAIK, unregister_ftrace_function() calls rcu_synchronize()
> to make sure that no one is inside the handler once it finishes.
> Mirek knows more about it.

Hm, 

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-20 Thread Petr Mladek
On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.
> 
> [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> 
> Signed-off-by: Josh Poimboeuf 
> ---
> diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt
> index 6c43f6e..f87e742 100644
> --- a/Documentation/livepatch/livepatch.txt
> +++ b/Documentation/livepatch/livepatch.txt

I like the description.

Just a note that we will also need to review the section about
limitations. But I am not sure that we want to do it in this patch.
It might open a long discussion on its own.

> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 1a5a93c..8e06fe5 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -28,18 +28,40 @@
>  
>  #include 
>  
> +/* task patch states */
> +#define KLP_UNDEFINED  -1
> +#define KLP_UNPATCHED 0
> +#define KLP_PATCHED   1
> +
>  /**
>   * struct klp_func - function structure for live patching
>   * @old_name:  name of the function to be patched
>   * @new_func:  pointer to the patched function code
>   * @old_sympos: a hint indicating which symbol position the old function
>   *   can be found (optional)
> + * @immediate:  patch the func immediately, bypassing backtrace safety checks

There are more checks possible. I would use the same description
as for klp_object.


>   * @old_addr:  the address of the function being patched
>   * @kobj:  kobject for sysfs resources
>   * @stack_node:  list node for klp_ops func_stack list
>   * @old_size:  size of the old function
>   * @new_size:  size of the new function
>   * @patched: the func has been added to the klp_ops list
> + * @transition:  the func is currently being applied or reverted
> + *
> @@ -86,6 +110,7 @@ struct klp_object {
>   * struct klp_patch - patch structure for live patching
>   * @mod: reference to the live patch module
>   * @objs:  object entries for kernel objects to be patched
> + * @immediate:  patch all funcs immediately, bypassing safety mechanisms
>   * @list:  list node for global list of registered patches
>   * @kobj:  kobject for sysfs resources
>   * @enabled: the patch is enabled (but operation may be incomplete)

[...]

> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index fc160c6..22c0c01 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -424,7 +477,10 @@ static ssize_t enabled_store(struct kobject *kobj, 
> struct kobj_attribute *attr,
>   goto err;
>   }
>  
> - if (enabled) {
> + if (patch == klp_transition_patch) {
> + klp_reverse_transition();
> + mod_delayed_work(system_wq, &klp_transition_work, 0);

I would put this mod_delayed_work() into klp_reverse_transition().
Also I would put that schedule_delayed_work() into
klp_try_complete_transition().

If I did not miss anything, it would allow us to move the
klp_transition_work code to transition.c, where it logically
belongs.

> + } else if (enabled) {
>   ret = __klp_enable_patch(patch);
>   if (ret)
>   goto err;

[...]

> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index 5efa262..e79ebb5 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include "patch.h"
> +#include "transition.h"
>  
>  static LIST_HEAD(klp_ops);
>  
> @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>  {
>   struct klp_ops *ops;
>   struct klp_func *func;
> + int patch_state;
>  
>   ops = container_of(fops, struct klp_ops, fops);
>  
>   rcu_read_lock();
> +
> func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> stack_node);
> - if (WARN_ON_ONCE(!func))
> +
> + if (!func)
>   goto unlock;

Why did you remove the WARN_ON_ONCE(), please?

We still add the function to the stack before registering
the ftrace handler. Also, we unregister the ftrace handler
before removing the last entry from the stack.

AFAIK, unregister_ftrace_function() calls rcu_synchronize()
to make sure that no one is inside the handler once it finishes.
Mirek knows more about it.

If this is not true, we have a problem. For example,
we call kfree(ops) after unregister_ftrace_function();
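
To spell out the ordering that is being relied on here, a simplified
sketch of the "last func on the stack" removal path (not the exact code):

        /* simplified: removing the last func from ops->func_stack */
        unregister_ftrace_function(&ops->fops);  /* per the assumption above,
                                                    waits for running handlers */
        ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);

        list_del_rcu(&func->stack_node);  /* only now may the stack become empty */
        list_del(&ops->node);
        kfree(ops);                       /* safe: no handler can still be running */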

BTW: I thought that this change was really needed because of
klp_try_complete_transition(). But I think that the WARN
could and should stay after all. See below.


> + /*
> +  * Enforce the order of the ops->func_stack and func->transition reads.
> +  

[PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-08 Thread Josh Poimboeuf
Change livepatch to use a basic per-task consistency model.  This is the
foundation which will eventually enable us to patch those ~10% of
security patches which change function or data semantics.  This is the
biggest remaining piece needed to make livepatch more generally useful.

This code stems from the design proposal made by Vojtech [1] in November
2014.  It's a hybrid of kGraft and kpatch: it uses kGraft's per-task
consistency and syscall barrier switching combined with kpatch's stack
trace switching.  There are also a number of fallback options which make
it quite flexible.

Patches are applied on a per-task basis, when the task is deemed safe to
switch over.  When a patch is enabled, livepatch enters into a
transition state where tasks are converging to the patched state.
Usually this transition state can complete in a few seconds.  The same
sequence occurs when a patch is disabled, except the tasks converge from
the patched state to the unpatched state.

An interrupt handler inherits the patched state of the task it
interrupts.  The same is true for forked tasks: the child inherits the
patched state of the parent.
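
For fork, this boils down to copying the parent's state into the child,
roughly as follows (a sketch only; the helper name and its exact hook point
in copy_process() are illustrative):

void klp_copy_process(struct task_struct *child)
{
        /*
         * The child starts out in the same patch state as the parent.
         * TIF_PATCH_PENDING is inherited along with the other thread
         * flags, so a child forked mid-transition will be converged
         * like any other task.
         */
        child->patch_state = current->patch_state;
}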

Livepatch uses several complementary approaches to determine when it's
safe to patch tasks:

1. The first and most effective approach is stack checking of sleeping
   tasks.  If no affected functions are on the stack of a given task,
   the task is patched.  In most cases this will patch most or all of
   the tasks on the first try.  Otherwise it'll keep trying
   periodically.  This option is only available if the architecture has
   reliable stacks (HAVE_RELIABLE_STACKTRACE).

2. The second approach, if needed, is kernel exit switching.  A
   task is switched when it returns to user space from a system call, a
   user space IRQ, or a signal.  It's useful in the following cases:

   a) Patching I/O-bound user tasks which are sleeping on an affected
  function.  In this case you have to send SIGSTOP and SIGCONT to
  force it to exit the kernel and be patched.
   b) Patching CPU-bound user tasks.  If the task is highly CPU-bound
  then it will get patched the next time it gets interrupted by an
  IRQ.
   c) In the future it could be useful for applying patches for
  architectures which don't yet have HAVE_RELIABLE_STACKTRACE.  In
  this case you would have to signal most of the tasks on the
  system.  However this isn't supported yet because there's
  currently no way to patch kthreads without
  HAVE_RELIABLE_STACKTRACE.

3. For idle "swapper" tasks, since they don't ever exit the kernel, they
   instead have a klp_update_patch_state() call in the idle loop which
   allows them to be patched before the CPU enters the idle state.

   (Note there's not yet such an approach for kthreads.)
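
As an illustration of approach 3, the idle-loop hook boils down to something
like this (a sketch; the exact placement may differ, and klp_patch_pending()
is assumed to be a cheap TIF_PATCH_PENDING test):

        /*
         * In the idle loop, at a point where no patched function can be
         * on the idle task's own stack:
         */
        if (unlikely(klp_patch_pending(current)))
                klp_update_patch_state(current);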

All the above approaches may be skipped by setting the 'immediate' flag
in the 'klp_patch' struct, which will disable per-task consistency and
patch all tasks immediately.  This can be useful if the patch doesn't
change any function or data semantics.  Note that, even with this flag
set, it's possible that some tasks may still be running with an old
version of the function, until that function returns.

There's also an 'immediate' flag in the 'klp_func' struct which allows
you to specify that certain functions in the patch can be applied
without per-task consistency.  This might be useful if you want to patch
a common function like schedule(), and the function change doesn't need
consistency but the rest of the patch does.
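
As a concrete (made-up) example of how the two flags combine -- the
replacement function names are hypothetical and their definitions omitted:

static struct klp_func funcs[] = {
        {
                /* semantics unchanged; don't make the transition wait on it */
                .old_name = "schedule",
                .new_func = livepatch_schedule,
                .immediate = true,
        },
        {
                /* this one still gets the full consistency model */
                .old_name = "proc_pid_status",
                .new_func = livepatch_proc_pid_status,
        },
        { }
};

static struct klp_object objs[] = {
        {
                /* name == NULL means vmlinux */
                .funcs = funcs,
        },
        { }
};

static struct klp_patch patch = {
        .mod = THIS_MODULE,
        .objs = objs,
        /*
         * Setting .immediate = true here instead would bypass the
         * consistency model for the whole patch.
         */
};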

For architectures which don't have HAVE_RELIABLE_STACKTRACE, the user
must set patch->immediate which causes all tasks to be patched
immediately.  This option should be used with care, only when the patch
doesn't change any function or data semantics.

In the future, architectures which don't have HAVE_RELIABLE_STACKTRACE
may be allowed to use per-task consistency if we can come up with
another way to patch kthreads.

The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
is in transition.  Only a single patch (the topmost patch on the stack)
can be in transition at a given time.  A patch can remain in transition
indefinitely, if any of the tasks are stuck in the initial patch state.

A transition can be reversed and effectively canceled by writing the
opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
the transition is in progress.  Then all the tasks will attempt to
converge back to the original patch state.

[1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz

Signed-off-by: Josh Poimboeuf 
---
 Documentation/ABI/testing/sysfs-kernel-livepatch |   8 +
 Documentation/livepatch/livepatch.txt            | 127 +-
 include/linux/init_task.h                        |   9 +
 include/linux/livepatch.h                        |  40 +-
 include/linux/sched.h                            |   3 +
 kernel/fork.c                                    |   3 +
 kernel/livepatch/Makefile                        |