Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-06-06 Thread Josh Poimboeuf
On Mon, Jun 06, 2016 at 03:54:41PM +0200, Petr Mladek wrote:
> On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > Change livepatch to use a basic per-task consistency model.  This is the
> > foundation which will eventually enable us to patch those ~10% of
> > security patches which change function or data semantics.  This is the
> > biggest remaining piece needed to make livepatch more generally useful.
> 
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > new file mode 100644
> > index 000..92819bb
> > --- /dev/null
> > +++ b/kernel/livepatch/transition.c
> > +/*
> > + * Try to safely switch a task to the target patch state.  If it's currently
> > + * running, or it's sleeping on a to-be-patched or to-be-unpatched function, or
> > + * if the stack is unreliable, return false.
> > + */
> > +static bool klp_try_switch_task(struct task_struct *task)
> > +{
> > +   struct rq *rq;
> > +   unsigned long flags;
> 
> This should be of type "struct rq_flags". Otherwise, I get compilation
> warnings:
> 
> kernel/livepatch/transition.c: In function ‘klp_try_switch_task’:
> kernel/livepatch/transition.c:349:2: warning: passing argument 2 of 
> ‘task_rq_lock’ from incompatible pointer type [enabled by default]
>   rq = task_rq_lock(task, &flags);
>   ^
> In file included from kernel/livepatch/transition.c:24:0:
> kernel/livepatch/../sched/sched.h:1468:12: note: expected ‘struct rq_flags *’ 
> but argument is of type ‘long unsigned int *’
>  struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
> ^
> kernel/livepatch/transition.c:367:2: warning: passing argument 3 of 
> ‘task_rq_unlock’ from incompatible pointer type [enabled by default]
>   task_rq_unlock(rq, task, &flags);
>   ^
> In file included from kernel/livepatch/transition.c:24:0:
> kernel/livepatch/../sched/sched.h:1480:1: note: expected ‘struct rq_flags *’ 
> but argument is of type ‘long unsigned int *’
>  task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
> 
> 
> And even runtime warnings from lockdep:
> 
> [  212.847548] WARNING: CPU: 1 PID: 3847 at kernel/locking/lockdep.c:3532 
> lock_release+0x431/0x480
> [  212.847549] releasing a pinned lock
> [  212.847550] Modules linked in: livepatch_sample(E+)
> [  212.847555] CPU: 1 PID: 3847 Comm: modprobe Tainted: GE K 
> 4.7.0-rc1-next-20160602-4-default+ #336
> [  212.847556] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> Bochs 01/01/2011
> [  212.847558]   880139823aa0 814388dc 
> 880139823af0
> [  212.847562]   880139823ae0 8106fad1 
> 0dcc82b11390
> [  212.847565]  88013fc978d8 810eea1e 8800ba0ed6d0 
> 0003
> [  212.847569] Call Trace:
> [  212.847572]  [] dump_stack+0x85/0xc9
> [  212.847575]  [] __warn+0xd1/0xf0
> [  212.847578]  [] ? klp_try_switch_task.part.3+0x5e/0x2b0
> [  212.847580]  [] warn_slowpath_fmt+0x4f/0x60
> [  212.847582]  [] lock_release+0x431/0x480
> [  212.847585]  [] ? dump_trace+0x118/0x310
> [  212.847588]  [] ? entry_SYSCALL_64_fastpath+0x1f/0xbd
> [  212.847590]  [] _raw_spin_unlock+0x1f/0x30
> [  212.847600]  [] klp_try_switch_task.part.3+0x5e/0x2b0
> [  212.847603]  [] klp_try_complete_transition+0x84/0x190
> [  212.847605]  [] __klp_enable_patch+0xb0/0x130
> [  212.847607]  [] klp_enable_patch+0x55/0x80
> [  212.847610]  [] ? livepatch_cmdline_proc_show+0x30/0x30 
> [livepatch_sample]
> [  212.847613]  [] livepatch_init+0x31/0x70 
> [livepatch_sample]
> [  212.847615]  [] ? livepatch_cmdline_proc_show+0x30/0x30 
> [livepatch_sample]
> [  212.847617]  [] do_one_initcall+0x3d/0x160
> [  212.847629]  [] ? do_init_module+0x27/0x1e4
> [  212.847632]  [] ? rcu_read_lock_sched_held+0x62/0x70
> [  212.847634]  [] ? kmem_cache_alloc_trace+0x282/0x340
> [  212.847636]  [] do_init_module+0x60/0x1e4
> [  212.847638]  [] load_module+0x1482/0x1d40
> [  212.847640]  [] ? __symbol_put+0x40/0x40
> [  212.847643]  [] SYSC_finit_module+0xa9/0xd0
> [  212.847645]  [] SyS_finit_module+0xe/0x10
> [  212.847647]  [] entry_SYSCALL_64_fastpath+0x1f/0xbd
> [  212.847649] ---[ end trace e4e9f09d45443049 ]---

Thanks, I also saw this when rebasing onto a newer linux-next.
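
For reference, a minimal sketch of the suggested fix (illustrative only;
the body is abbreviated, the real function also walks the stack under
the lock):

static bool klp_try_switch_task(struct task_struct *task)
{
	struct rq *rq;
	struct rq_flags flags;			/* was: unsigned long flags */
	bool success = false;

	/* check if this task has already switched over */
	if (task->patch_state == klp_target_state)
		return true;

	rq = task_rq_lock(task, &flags);	/* now matches struct rq_flags * */

	/* ... stack check and patch_state switch as in the patch ... */

	task_rq_unlock(rq, task, &flags);

	return success;
}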

> > +   int ret;
> > +   bool success = false;
> > +
> > +   /* check if this task has already switched over */
> > +   if (task->patch_state == klp_target_state)
> > +   return true;
> > +
> > +   /*
> > +* For arches which don't have reliable stack traces, we have to rely
> > +* on other methods (e.g., switching tasks at the syscall barrier).
> > +*/
> > +   if (!IS_ENABLED(CONFIG_RELIABLE_STACKTRACE))
> > +   return false;
> > +
> > +   /*
> > +* Now try to check the stack for any to-be-patched or to-be-unpatched
> > +* functions.  If all goes well, switch the task to the target patch
> > +* state.
> > +*/
> > +   rq = task_rq_lock(task, &flags);
> > +
> > +   if (task_running(rq, task) && task != current) {

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-06-06 Thread Petr Mladek
On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.

> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> new file mode 100644
> index 000..92819bb
> --- /dev/null
> +++ b/kernel/livepatch/transition.c
> +/*
> + * Try to safely switch a task to the target patch state.  If it's currently
> + * running, or it's sleeping on a to-be-patched or to-be-unpatched function, or
> + * if the stack is unreliable, return false.
> + */
> +static bool klp_try_switch_task(struct task_struct *task)
> +{
> + struct rq *rq;
> + unsigned long flags;

This should be of type "struct rq_flags". Otherwise, I get compilation
warnings:

kernel/livepatch/transition.c: In function ‘klp_try_switch_task’:
kernel/livepatch/transition.c:349:2: warning: passing argument 2 of 
‘task_rq_lock’ from incompatible pointer type [enabled by default]
  rq = task_rq_lock(task, &flags);
  ^
In file included from kernel/livepatch/transition.c:24:0:
kernel/livepatch/../sched/sched.h:1468:12: note: expected ‘struct rq_flags *’ 
but argument is of type ‘long unsigned int *’
 struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
^
kernel/livepatch/transition.c:367:2: warning: passing argument 3 of 
‘task_rq_unlock’ from incompatible pointer type [enabled by default]
  task_rq_unlock(rq, task, &flags);
  ^
In file included from kernel/livepatch/transition.c:24:0:
kernel/livepatch/../sched/sched.h:1480:1: note: expected ‘struct rq_flags *’ 
but argument is of type ‘long unsigned int *’
 task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)


And even runtime warnings from lockdep:

[  212.847548] WARNING: CPU: 1 PID: 3847 at kernel/locking/lockdep.c:3532 
lock_release+0x431/0x480
[  212.847549] releasing a pinned lock
[  212.847550] Modules linked in: livepatch_sample(E+)
[  212.847555] CPU: 1 PID: 3847 Comm: modprobe Tainted: GE K 
4.7.0-rc1-next-20160602-4-default+ #336
[  212.847556] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Bochs 01/01/2011
[  212.847558]   880139823aa0 814388dc 
880139823af0
[  212.847562]   880139823ae0 8106fad1 
0dcc82b11390
[  212.847565]  88013fc978d8 810eea1e 8800ba0ed6d0 
0003
[  212.847569] Call Trace:
[  212.847572]  [] dump_stack+0x85/0xc9
[  212.847575]  [] __warn+0xd1/0xf0
[  212.847578]  [] ? klp_try_switch_task.part.3+0x5e/0x2b0
[  212.847580]  [] warn_slowpath_fmt+0x4f/0x60
[  212.847582]  [] lock_release+0x431/0x480
[  212.847585]  [] ? dump_trace+0x118/0x310
[  212.847588]  [] ? entry_SYSCALL_64_fastpath+0x1f/0xbd
[  212.847590]  [] _raw_spin_unlock+0x1f/0x30
[  212.847600]  [] klp_try_switch_task.part.3+0x5e/0x2b0
[  212.847603]  [] klp_try_complete_transition+0x84/0x190
[  212.847605]  [] __klp_enable_patch+0xb0/0x130
[  212.847607]  [] klp_enable_patch+0x55/0x80
[  212.847610]  [] ? livepatch_cmdline_proc_show+0x30/0x30 
[livepatch_sample]
[  212.847613]  [] livepatch_init+0x31/0x70 [livepatch_sample]
[  212.847615]  [] ? livepatch_cmdline_proc_show+0x30/0x30 
[livepatch_sample]
[  212.847617]  [] do_one_initcall+0x3d/0x160
[  212.847629]  [] ? do_init_module+0x27/0x1e4
[  212.847632]  [] ? rcu_read_lock_sched_held+0x62/0x70
[  212.847634]  [] ? kmem_cache_alloc_trace+0x282/0x340
[  212.847636]  [] do_init_module+0x60/0x1e4
[  212.847638]  [] load_module+0x1482/0x1d40
[  212.847640]  [] ? __symbol_put+0x40/0x40
[  212.847643]  [] SYSC_finit_module+0xa9/0xd0
[  212.847645]  [] SyS_finit_module+0xe/0x10
[  212.847647]  [] entry_SYSCALL_64_fastpath+0x1f/0xbd
[  212.847649] ---[ end trace e4e9f09d45443049 ]---


> + int ret;
> + bool success = false;
> +
> + /* check if this task has already switched over */
> + if (task->patch_state == klp_target_state)
> + return true;
> +
> + /*
> +  * For arches which don't have reliable stack traces, we have to rely
> +  * on other methods (e.g., switching tasks at the syscall barrier).
> +  */
> + if (!IS_ENABLED(CONFIG_RELIABLE_STACKTRACE))
> + return false;
> +
> + /*
> +  * Now try to check the stack for any to-be-patched or to-be-unpatched
> +  * functions.  If all goes well, switch the task to the target patch
> +  * state.
> +  */
> + rq = task_rq_lock(task, &flags);
> +
> + if (task_running(rq, task) && task != current) {
> + pr_debug("%s: pid %d (%s) is running\n", __func__, task->pid,
> +  task->comm);

Also, I thought about using printk_deferred() inside the rq lock, but it
is not strictly needed, because we only use pr_debug() here, which is
a NOP when not enabled.

Best Regards,
Petr

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-16 Thread Josh Poimboeuf
On Mon, May 09, 2016 at 11:41:37AM +0200, Miroslav Benes wrote:
> > +void klp_init_transition(struct klp_patch *patch, int state)
> > +{
> > +   struct task_struct *g, *task;
> > +   unsigned int cpu;
> > +   struct klp_object *obj;
> > +   struct klp_func *func;
> > +   int initial_state = !state;
> > +
> > +   klp_transition_patch = patch;
> > +
> > +   /*
> > +* If the patch can be applied or reverted immediately, skip the
> > +* per-task transitions.
> > +*/
> > +   if (patch->immediate)
> > +   return;
> > +
> > +   /*
> > +* Initialize all tasks to the initial patch state to prepare them for
> > +* switching to the target state.
> > +*/
> > +   read_lock(&tasklist_lock);
> > +   for_each_process_thread(g, task)
> > +   task->patch_state = initial_state;
> > +   read_unlock(&tasklist_lock);
> > +
> > +   /*
> > +* Ditto for the idle "swapper" tasks.
> > +*/
> > +   get_online_cpus();
> > +   for_each_online_cpu(cpu)
> > +   idle_task(cpu)->patch_state = initial_state;
> > +   put_online_cpus();
> > +
> > +   /*
> > +* Ensure klp_ftrace_handler() sees the task->patch_state updates
> > +* before the func->transition updates.  Otherwise it could read an
> > +* out-of-date task state and pick the wrong function.
> > +*/
> > +   smp_wmb();
> > +
> > +   /*
> > +* Set the func transition states so klp_ftrace_handler() will know to
> > +* switch to the transition logic.
> > +*
> > +* When patching, the funcs aren't yet in the func_stack and will be
> > +* made visible to the ftrace handler shortly by the calls to
> > +* klp_patch_object().
> > +*
> > +* When unpatching, the funcs are already in the func_stack and so are
> > +* already visible to the ftrace handler.
> > +*/
> > +   klp_for_each_object(patch, obj)
> > +   klp_for_each_func(obj, func)
> > +   func->transition = true;
> > +
> > +   /*
> > +* Set the global target patch state which tasks will switch to.  This
> > +* has no effect until the TIF_PATCH_PENDING flags get set later.
> > +*/
> > +   klp_target_state = state;
> 
> I am afraid there is a problem for (patch->immediate == true) patches. 
> klp_target_state is not set for those and the comment is not entirely 
> true, because klp_target_state has an effect in several places.

Ah, you're right.  I moved this assignment here for v2.  It was
originally done before the patch->immediate check.  If I remember
correctly, I moved it closer to the barrier for better readability (but
I created a bug in the process).

> I guess we need to set klp_target_state even for immediate patches. Should 
> we also initialize it with KLP_UNDEFINED and set it to KLP_UNDEFINED in 
> klp_complete_transition()?

Yes, to both.
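
Something along these lines, then (a rough sketch of the agreed
direction, not the actual next version):

/* KLP_UNDEFINED would be a new "no transition in progress" value. */
static int klp_target_state = KLP_UNDEFINED;

void klp_init_transition(struct klp_patch *patch, int state)
{
	klp_transition_patch = patch;

	/* set the target state even for patch->immediate patches */
	klp_target_state = state;

	if (patch->immediate)
		return;

	/* ... per-task and per-func initialization as in the patch ... */
}

void klp_complete_transition(void)
{
	/* ... existing cleanup ... */

	klp_target_state = KLP_UNDEFINED;
	klp_transition_patch = NULL;
}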

-- 
Josh

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-10 Thread Miroslav Benes
On Thu, 28 Apr 2016, Josh Poimboeuf wrote:

> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.
> 
> This code stems from the design proposal made by Vojtech [1] in November
> 2014.  It's a hybrid of kGraft and kpatch: it uses kGraft's per-task
> consistency and syscall barrier switching combined with kpatch's stack
> trace switching.  There are also a number of fallback options which make
> it quite flexible.
> 
> Patches are applied on a per-task basis, when the task is deemed safe to
> switch over.  When a patch is enabled, livepatch enters into a
> transition state where tasks are converging to the patched state.
> Usually this transition state can complete in a few seconds.  The same
> sequence occurs when a patch is disabled, except the tasks converge from
> the patched state to the unpatched state.
> 
> An interrupt handler inherits the patched state of the task it
> interrupts.  The same is true for forked tasks: the child inherits the
> patched state of the parent.
> 
> Livepatch uses several complementary approaches to determine when it's
> safe to patch tasks:
> 
> 1. The first and most effective approach is stack checking of sleeping
>tasks.  If no affected functions are on the stack of a given task,
>the task is patched.  In most cases this will patch most or all of
>the tasks on the first try.  Otherwise it'll keep trying
>periodically.  This option is only available if the architecture has
>reliable stacks (CONFIG_RELIABLE_STACKTRACE and
>CONFIG_STACK_VALIDATION).
> 
> 2. The second approach, if needed, is kernel exit switching.  A
>task is switched when it returns to user space from a system call, a
>user space IRQ, or a signal.  It's useful in the following cases:
> 
>a) Patching I/O-bound user tasks which are sleeping on an affected
>   function.  In this case you have to send SIGSTOP and SIGCONT to
>   force it to exit the kernel and be patched.
>b) Patching CPU-bound user tasks.  If the task is highly CPU-bound
>   then it will get patched the next time it gets interrupted by an
>   IRQ.
>c) Applying patches for architectures which don't yet have
>   CONFIG_RELIABLE_STACKTRACE.  In this case you'll have to signal
>   most of the tasks on the system.  However this isn't a complete
>   solution, because there's currently no way to patch kthreads
>   without CONFIG_RELIABLE_STACKTRACE.
> 
>Note: since idle "swapper" tasks don't ever exit the kernel, they
>instead have a kpatch_patch_task() call in the idle loop which allows

s/kpatch_patch_task()/klp_patch_task()/
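
For context, a rough idea of the idle-loop hook the (truncated) note
above describes (illustrative only, not copied from the actual diff):

	/* in the idle loop, between periods of work: */
	if (unlikely(test_thread_flag(TIF_PATCH_PENDING)))
		klp_patch_task(current);	/* idle tasks never exit the kernel */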

[...]

> --- a/Documentation/livepatch/livepatch.txt
> +++ b/Documentation/livepatch/livepatch.txt
> @@ -72,7 +72,8 @@ example, they add a NULL pointer or a boundary check, fix a 
> race by adding
>  a missing memory barrier, or add some locking around a critical section.
>  Most of these changes are self contained and the function presents itself
>  the same way to the rest of the system. In this case, the functions might
> -be updated independently one by one.
> +be updated independently one by one.  (This can be done by setting the
> +'immediate' flag in the klp_patch struct.)
>  
>  But there are more complex fixes. For example, a patch might change
>  ordering of locking in multiple functions at the same time. Or a patch
> @@ -86,20 +87,103 @@ or no data are stored in the modified structures at the 
> moment.
>  The theory about how to apply functions a safe way is rather complex.
>  The aim is to define a so-called consistency model. It attempts to define
>  conditions when the new implementation could be used so that the system
> -stays consistent. The theory is not yet finished. See the discussion at
> -http://thread.gmane.org/gmane.linux.kernel/1823033/focus=1828189
> -
> -The current consistency model is very simple. It guarantees that either
> -the old or the new function is called. But various functions get redirected
> -one by one without any synchronization.
> -
> -In other words, the current implementation _never_ modifies the behavior
> -in the middle of the call. It is because it does _not_ rewrite the entire
> -function in the memory. Instead, the function gets redirected at the
> -very beginning. But this redirection is used immediately even when
> -some other functions from the same patch have not been redirected yet.
> -
> -See also the section "Limitations" below.
> +stays consistent.
> +
> +Livepatch has a consistency model which is a hybrid of kGraft and
> +kpatch:  it uses kGraft's per-task consistency and syscall barrier
> +switching combined with kpatch's stack trace switching.  There are also
> +a number of fallback options which make it quite flexible.
> +
> +Patches are applied on a 

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-09 Thread Miroslav Benes

[...]

> +static int klp_target_state;

[...]

> +void klp_init_transition(struct klp_patch *patch, int state)
> +{
> + struct task_struct *g, *task;
> + unsigned int cpu;
> + struct klp_object *obj;
> + struct klp_func *func;
> + int initial_state = !state;
> +
> + klp_transition_patch = patch;
> +
> + /*
> +  * If the patch can be applied or reverted immediately, skip the
> +  * per-task transitions.
> +  */
> + if (patch->immediate)
> + return;
> +
> + /*
> +  * Initialize all tasks to the initial patch state to prepare them for
> +  * switching to the target state.
> +  */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, task)
> + task->patch_state = initial_state;
> + read_unlock(&tasklist_lock);
> +
> + /*
> +  * Ditto for the idle "swapper" tasks.
> +  */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + idle_task(cpu)->patch_state = initial_state;
> + put_online_cpus();
> +
> + /*
> +  * Ensure klp_ftrace_handler() sees the task->patch_state updates
> +  * before the func->transition updates.  Otherwise it could read an
> +  * out-of-date task state and pick the wrong function.
> +  */
> + smp_wmb();
> +
> + /*
> +  * Set the func transition states so klp_ftrace_handler() will know to
> +  * switch to the transition logic.
> +  *
> +  * When patching, the funcs aren't yet in the func_stack and will be
> +  * made visible to the ftrace handler shortly by the calls to
> +  * klp_patch_object().
> +  *
> +  * When unpatching, the funcs are already in the func_stack and so are
> +  * already visible to the ftrace handler.
> +  */
> + klp_for_each_object(patch, obj)
> + klp_for_each_func(obj, func)
> + func->transition = true;
> +
> + /*
> +  * Set the global target patch state which tasks will switch to.  This
> +  * has no effect until the TIF_PATCH_PENDING flags get set later.
> +  */
> + klp_target_state = state;

I am afraid there is a problem for (patch->immediate == true) patches. 
klp_target_state is not set for those and the comment is not entirely 
true, because klp_target_state has an effect in several places.

[...]

> +void klp_start_transition(void)
> +{
> + struct task_struct *g, *task;
> + unsigned int cpu;
> +
> + pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> +   klp_target_state == KLP_PATCHED ? "patching" : "unpatching");

Here...

> +
> + /*
> +  * If the patch can be applied or reverted immediately, skip the
> +  * per-task transitions.
> +  */
> + if (klp_transition_patch->immediate)
> + return;
> +

[...]

> +bool klp_try_complete_transition(void)
> +{
> + unsigned int cpu;
> + struct task_struct *g, *task;
> + bool complete = true;
> +
> + /*
> +  * If the patch can be applied or reverted immediately, skip the
> +  * per-task transitions.
> +  */
> + if (klp_transition_patch->immediate)
> + goto success;
> +
> + /*
> +  * Try to switch the tasks to the target patch state by walking their
> +  * stacks and looking for any to-be-patched or to-be-unpatched
> +  * functions.  If such functions are found on a stack, or if the stack
> +  * is deemed unreliable, the task can't be switched yet.
> +  *
> +  * Usually this will transition most (or all) of the tasks on a system
> +  * unless the patch includes changes to a very common function.
> +  */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, task)
> + if (!klp_try_switch_task(task))
> + complete = false;
> + read_unlock(&tasklist_lock);
> +
> + /*
> +  * Ditto for the idle "swapper" tasks.
> +  */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + if (!klp_try_switch_task(idle_task(cpu)))
> + complete = false;
> + put_online_cpus();
> +
> + /*
> +  * Some tasks weren't able to be switched over.  Try again later and/or
> +  * wait for other methods like syscall barrier switching.
> +  */
> + if (!complete)
> + return false;
> +
> +success:
> + /*
> +  * When unpatching, all tasks have transitioned to KLP_UNPATCHED so we
> +  * can now remove the new functions from the func_stack.
> +  */
> + if (klp_target_state == KLP_UNPATCHED) {

Here (this is the most important one I think).

> + klp_unpatch_objects(klp_transition_patch);
> +
> + /*
> +  * Don't allow any existing instances of ftrace handlers to
> +  * access any obsolete funcs before we reset the func
> +  * transition states to false.  Otherwise the handler may see
> +  * the deleted "new" func, see that it's not in transition, and
> +  * 

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-06 Thread Josh Poimboeuf
On Fri, May 06, 2016 at 01:33:01PM +0200, Petr Mladek wrote:
> On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > index 782fbb5..b3b8639 100644
> > --- a/kernel/livepatch/patch.c
> > +++ b/kernel/livepatch/patch.c
> > @@ -29,6 +29,7 @@
> >  #include 
> >  #include 
> >  #include "patch.h"
> > +#include "transition.h"
> >  
> >  static LIST_HEAD(klp_ops);
> >  
> > @@ -58,11 +59,42 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> > ops = container_of(fops, struct klp_ops, fops);
> >  
> > rcu_read_lock();
> > +
> > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> >   stack_node);
> > -   if (WARN_ON_ONCE(!func))
> > +
> > +   if (!func)
> > goto unlock;
> >  
> > +   /*
> > +* See the comment for the 2nd smp_wmb() in klp_init_transition() for
> > +* an explanation of why this read barrier is needed.
> > +*/
> > +   smp_rmb();
> > +
> > +   if (unlikely(func->transition)) {
> > +
> > +   /*
> > +* See the comment for the 1st smp_wmb() in
> > +* klp_init_transition() for an explanation of why this read
> > +* barrier is needed.
> > +*/
> > +   smp_rmb();
> 
> I would add here:
> 
>   WARN_ON_ONCE(current->patch_state == KLP_UNDEFINED);
> 
> We do not know in which context this is called, so the printk's are
> not ideal. But it will get triggered only if there is a bug in
> the livepatch implementation. It should happen at random locations
> and rather early after a bug is introduced.
> 
> Anyway, better to die and catch the bug than to let the system run
> in an undefined state and produce cryptic errors later on.

Ok, makes sense.
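
So the transition branch in klp_ftrace_handler() would become something
like this (sketch only; error handling and the rest of the handler as in
the patch):

	if (unlikely(func->transition)) {

		/* see klp_init_transition() for the barrier pairing */
		smp_rmb();

		/* every task must have a valid patch state before it gets here */
		WARN_ON_ONCE(current->patch_state == KLP_UNDEFINED);

		if (current->patch_state == KLP_UNPATCHED) {
			/*
			 * Use the previously patched version of the function.
			 * If no previous patches exist, use the original
			 * function.
			 */
			func = list_entry_rcu(func->stack_node.next,
					      struct klp_func, stack_node);

			if (&func->stack_node == &ops->func_stack)
				goto unlock;
		}
	}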

> > +   if (current->patch_state == KLP_UNPATCHED) {
> > +   /*
> > +* Use the previously patched version of the function.
> > +* If no previous patches exist, use the original
> > +* function.
> > +*/
> > +   func = list_entry_rcu(func->stack_node.next,
> > + struct klp_func, stack_node);
> > +
> > +   if (&func->stack_node == &ops->func_stack)
> > +   goto unlock;
> > +   }
> > +   }
> 
> I have been staring at the code for too long now. I need to step back for a
> while. I'll take another look when you send the next version. Anyway,
> you did great work. I speak mainly for the livepatch part and
> I like it.

Thanks for the helpful reviews!  I'll be on vacation again next week so
I get a break too :-)

-- 
Josh

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-06 Thread Petr Mladek
On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index 782fbb5..b3b8639 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include "patch.h"
> +#include "transition.h"
>  
>  static LIST_HEAD(klp_ops);
>  
> @@ -58,11 +59,42 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>   ops = container_of(fops, struct klp_ops, fops);
>  
>   rcu_read_lock();
> +
>   func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> stack_node);
> - if (WARN_ON_ONCE(!func))
> +
> + if (!func)
>   goto unlock;
>  
> + /*
> +  * See the comment for the 2nd smp_wmb() in klp_init_transition() for
> +  * an explanation of why this read barrier is needed.
> +  */
> + smp_rmb();
> +
> + if (unlikely(func->transition)) {
> +
> + /*
> +  * See the comment for the 1st smp_wmb() in
> +  * klp_init_transition() for an explanation of why this read
> +  * barrier is needed.
> +  */
> + smp_rmb();

I would add here:

WARN_ON_ONCE(current->patch_state == KLP_UNDEFINED);

We do not know in which context this is called, so the printk's are
not ideal. But it will get triggered only if there is a bug in
the livepatch implementation. It should happen at random locations
and rather early after a bug is introduced.

Anyway, better to die and catch the bug than to let the system run
in an undefined state and produce cryptic errors later on.


> + if (current->patch_state == KLP_UNPATCHED) {
> + /*
> +  * Use the previously patched version of the function.
> +  * If no previous patches exist, use the original
> +  * function.
> +  */
> + func = list_entry_rcu(func->stack_node.next,
> +   struct klp_func, stack_node);
> +
> + if (&func->stack_node == &ops->func_stack)
> + goto unlock;
> + }
> + }

I have been staring at the code for too long now. I need to step back for a
while. I'll take another look when you send the next version. Anyway,
you did great work. I speak mainly for the livepatch part and
I like it.

Best Regards,
Petr

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-05 Thread Petr Mladek
On Wed 2016-05-04 10:51:21, Josh Poimboeuf wrote:
> On Wed, May 04, 2016 at 10:42:23AM +0200, Petr Mladek wrote:
> > On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > > Change livepatch to use a basic per-task consistency model.  This is the
> > > foundation which will eventually enable us to patch those ~10% of
> > > security patches which change function or data semantics.  This is the
> > > biggest remaining piece needed to make livepatch more generally useful.
> > 
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -76,6 +76,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  
> > >  #include 
> > >  #include 
> > > @@ -1586,6 +1587,8 @@ static struct task_struct *copy_process(unsigned 
> > > long clone_flags,
> > >   p->parent_exec_id = current->self_exec_id;
> > >   }
> > >  
> > > + klp_copy_process(p);
> > 
> > I have doubts here. We copy the state from the parent here. It means
> > that the new process might still need to be converted. But at the same
> > time, print_context_stack_reliable() returns zero without printing
> > any stack trace when the TIF_FORK flag is set. It means that a freshly
> > forked task might get converted immediately. It seems that both
> > operations are always done when copy_process() is called. But
> > they are contradicting each other.
> > 
> > I guess that print_context_stack_reliable() should either return
> > -EINVAL when TIF_FORK is set. Or it should try to print the
> > stack of the newly forked task.
> > 
> > Or do I miss something, please?
> 
> Ok, I admit it's confusing.
> 
> A newly forked task doesn't *have* a stack (other than the pt_regs frame
> it needs for the return to user space), which is why
> print_context_stack_reliable() returns success with an empty array of
> addresses.
> 
> For a little background, see the second switch_to() macro in
> arch/x86/include/asm/switch_to.h.  When a newly forked task runs for the
> first time, it returns from __switch_to() with no stack.  It then jumps
> straight to ret_from_fork in entry_64.S, calls a few C functions, and
> eventually returns to user space.  So, assuming we aren't patching entry
> code or the switch_to() macro in __schedule(), it should be safe to
> patch the task before it does all that.

This is a great explanation. Thanks for it.

> So, having said all that, I'm really not sure what the best approach is
> for print_context_stack_reliable().  Right now I'm thinking I'll change
> it back to return -EINVAL for a newly forked task, so it'll be more
> future-proof: better to have a false positive than a false negative.
> Either way it will probably need to be changed again if the
> ret_from_fork code gets cleaned up.

I would prefer the -EINVAL. It might save some hair when anyone
is working on patching the switch_to stuff. Also, it is not that
big a loss because most tasks will get migrated on the return to
userspace.

It might help a bit with the newly forked kthreads. But there should
be a safer location where the new kthreads might get migrated,
e.g. right before the main function gets called.


> > > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > > new file mode 100644
> > > index 000..92819bb
> > > --- /dev/null
> > > +++ b/kernel/livepatch/transition.c
> > > +/*
> > > + * This function can be called in the middle of an existing transition to
> > > + * reverse the direction of the target patch state.  This can be done to
> > > + * effectively cancel an existing enable or disable operation if there 
> > > are any
> > > + * tasks which are stuck in the initial patch state.
> > > + */
> > > +void klp_reverse_transition(void)
> > > +{
> > > + struct klp_patch *patch = klp_transition_patch;
> > > +
> > > + klp_target_state = !klp_target_state;
> > > +
> > > + /*
> > > +  * Ensure that if another CPU goes through the syscall barrier, sees
> > > +  * the TIF_PATCH_PENDING writes in klp_start_transition(), and calls
> > > +  * klp_patch_task(), it also sees the above write to the target state.
> > > +  * Otherwise it can put the task in the wrong universe.
> > > +  */
> > > + smp_wmb();
> > > +
> > > + klp_start_transition();
> > > + klp_try_complete_transition();
> > 
> > It is a bit strange that we keep the work scheduled. It might be
> > better to use
> > 
> >    mod_delayed_work(system_wq, &klp_work, 0);
> 
> True, I think that would be better.
> 
> > Which triggers more ideas from the nitpicking department:
> > 
> > I would move the work definition from core.c to transition.c because
> > it is closely related to klp_try_complete_transition();
> 
> That could be good, but there's a slight problem: klp_work_fn() requires
> klp_mutex, which is static to core.c.  It's kind of nice to keep the use
> of the mutex in core.c only.

I see and am surprised that we take the lock only in core.c ;-)

I do not have a strong opinion then. Just a small one. The lock guards
also operations from the other .c files. I think that it is 

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-05 Thread Miroslav Benes
On Wed, 4 May 2016, Josh Poimboeuf wrote:

> On Wed, May 04, 2016 at 10:42:23AM +0200, Petr Mladek wrote:
> > On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > > Change livepatch to use a basic per-task consistency model.  This is the
> > > foundation which will eventually enable us to patch those ~10% of
> > > security patches which change function or data semantics.  This is the
> > > biggest remaining piece needed to make livepatch more generally useful.
> > 
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -76,6 +76,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  
> > >  #include 
> > >  #include 
> > > @@ -1586,6 +1587,8 @@ static struct task_struct *copy_process(unsigned 
> > > long clone_flags,
> > >   p->parent_exec_id = current->self_exec_id;
> > >   }
> > >  
> > > + klp_copy_process(p);
> > 
> > I have doubts here. We copy the state from the parent here. It means
> > that the new process might still need to be converted. But at the same
> > time, print_context_stack_reliable() returns zero without printing
> > any stack trace when the TIF_FORK flag is set. It means that a freshly
> > forked task might get converted immediately. It seems that both
> > operations are always done when copy_process() is called. But
> > they are contradicting each other.
> > 
> > I guess that print_context_stack_reliable() should either return
> > -EINVAL when TIF_FORK is set. Or it should try to print the
> > stack of the newly forked task.
> > 
> > Or do I miss something, please?
> 
> Ok, I admit it's confusing.
> 
> A newly forked task doesn't *have* a stack (other than the pt_regs frame
> it needs for the return to user space), which is why
> print_context_stack_reliable() returns success with an empty array of
> addresses.
> 
> For a little background, see the second switch_to() macro in
> arch/x86/include/asm/switch_to.h.  When a newly forked task runs for the
> first time, it returns from __switch_to() with no stack.  It then jumps
> straight to ret_from_fork in entry_64.S, calls a few C functions, and
> eventually returns to user space.  So, assuming we aren't patching entry
> code or the switch_to() macro in __schedule(), it should be safe to
> patch the task before it does all that.
> 
> With the current code, if an unpatched task gets forked, the child will
> also be unpatched.  In theory, we could go ahead and patch the child
> then.  In fact, that's what I did in v1.9.
> 
> But in v1.9 discussions it was pointed out that someday maybe the
> ret_from_fork stuff will get cleaned up and instead the child stack will
> be copied from the parent.  In that case the child should inherit its
> parent's patched state.  So we decided to make it more future-proof by
> having the child inherit the parent's patched state.
> 
> So, having said all that, I'm really not sure what the best approach is
> for print_context_stack_reliable().  Right now I'm thinking I'll change
> it back to return -EINVAL for a newly forked task, so it'll be more
> future-proof: better to have a false positive than a false negative.
> Either way it will probably need to be changed again if the
> ret_from_fork code gets cleaned up.

I'd be for returning -EINVAL. It is a safe play for now.

[...]
 
> > Finally, the following is always called right after
> > klp_start_transition(), so I would call it from there.
> > 
> > if (!klp_try_complete_transition())
> > klp_schedule_work();

On the other hand it is quite nice to see the sequence

init
start
try_complete

there. Just my 2 cents.

> Except for when it's called by klp_reverse_transition().  And it really
> depends on whether we want to allow transition.c to use the mutex.  I
> don't have a strong opinion either way, I may need to think about it
> some more.

Miroslav

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Josh Poimboeuf
On Wed, May 04, 2016 at 10:42:23AM +0200, Petr Mladek wrote:
> On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > Change livepatch to use a basic per-task consistency model.  This is the
> > foundation which will eventually enable us to patch those ~10% of
> > security patches which change function or data semantics.  This is the
> > biggest remaining piece needed to make livepatch more generally useful.
> 
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -76,6 +76,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include 
> >  #include 
> > @@ -1586,6 +1587,8 @@ static struct task_struct *copy_process(unsigned long 
> > clone_flags,
> > p->parent_exec_id = current->self_exec_id;
> > }
> >  
> > +   klp_copy_process(p);
> 
> I have doubts here. We copy the state from the parent here. It means
> that the new process might still need to be converted. But at the same
> time, print_context_stack_reliable() returns zero without printing
> any stack trace when the TIF_FORK flag is set. It means that a freshly
> forked task might get converted immediately. It seems that both
> operations are always done when copy_process() is called. But
> they are contradicting each other.
> 
> I guess that print_context_stack_reliable() should either return
> -EINVAL when TIF_FORK is set. Or it should try to print the
> stack of the newly forked task.
> 
> Or do I miss something, please?

Ok, I admit it's confusing.

A newly forked task doesn't *have* a stack (other than the pt_regs frame
it needs for the return to user space), which is why
print_context_stack_reliable() returns success with an empty array of
addresses.

For a little background, see the second switch_to() macro in
arch/x86/include/asm/switch_to.h.  When a newly forked task runs for the
first time, it returns from __switch_to() with no stack.  It then jumps
straight to ret_from_fork in entry_64.S, calls a few C functions, and
eventually returns to user space.  So, assuming we aren't patching entry
code or the switch_to() macro in __schedule(), it should be safe to
patch the task before it does all that.

With the current code, if an unpatched task gets forked, the child will
also be unpatched.  In theory, we could go ahead and patch the child
then.  In fact, that's what I did in v1.9.

But in v1.9 discussions it was pointed out that someday maybe the
ret_from_fork stuff will get cleaned up and instead the child stack will
be copied from the parent.  In that case the child should inherit its
parent's patched state.  So we decided to make it more future-proof by
having the child inherit the parent's patched state.

So, having said all that, I'm really not sure what the best approach is
for print_context_stack_reliable().  Right now I'm thinking I'll change
it back to return -EINVAL for a newly forked task, so it'll be more
future-proof: better to have a false positive than a false negative.
Either way it will probably need to be changed again if the
ret_from_fork code gets cleaned up.
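
If it helps to visualize, the check being discussed boils down to
something like this (hypothetical helper, not from the series; TIF_FORK
is the x86_64 flag for a task that has been forked but has not run yet):

static bool stack_can_be_trusted(struct task_struct *task)
{
	/* newly forked: no real stack yet, so report it as unreliable */
	if (test_tsk_thread_flag(task, TIF_FORK))
		return false;		/* caller returns -EINVAL */

	return true;
}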

> > +
> > spin_lock(&p->sighand->siglock);
> >  
> > /*
> 
> [...]
> 
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > new file mode 100644
> > index 000..92819bb
> > --- /dev/null
> > +++ b/kernel/livepatch/transition.c
> > +/*
> > + * This function can be called in the middle of an existing transition to
> > + * reverse the direction of the target patch state.  This can be done to
> > + * effectively cancel an existing enable or disable operation if there are 
> > any
> > + * tasks which are stuck in the initial patch state.
> > + */
> > +void klp_reverse_transition(void)
> > +{
> > +   struct klp_patch *patch = klp_transition_patch;
> > +
> > +   klp_target_state = !klp_target_state;
> > +
> > +   /*
> > +* Ensure that if another CPU goes through the syscall barrier, sees
> > +* the TIF_PATCH_PENDING writes in klp_start_transition(), and calls
> > +* klp_patch_task(), it also sees the above write to the target state.
> > +* Otherwise it can put the task in the wrong universe.
> > +*/
> > +   smp_wmb();
> > +
> > +   klp_start_transition();
> > +   klp_try_complete_transition();
> 
> It is a bit strange that we keep the work scheduled. It might be
> better to use
> 
>    mod_delayed_work(system_wq, &klp_work, 0);

True, I think that would be better.
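
i.e. roughly the following (sketch; the barrier comment is shortened):

void klp_reverse_transition(void)
{
	struct klp_patch *patch = klp_transition_patch;

	klp_target_state = !klp_target_state;

	/* order the target state write against TIF_PATCH_PENDING observers */
	smp_wmb();

	klp_start_transition();

	/* restart the retry work now instead of leaving the old one queued */
	mod_delayed_work(system_wq, &klp_work, 0);

	patch->enabled = !patch->enabled;
}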

> Which triggers more ideas from the nitpicking department:
> 
> I would move the work definition from core.c to transition.c because
> it is closely related to klp_try_complete_transition();

That could be good, but there's a slight problem: klp_work_fn() requires
klp_mutex, which is static to core.c.  It's kind of nice to keep the use
of the mutex in core.c only.

> While at it, I would make it clearer that the work is related
> to the transition.

How would you recommend doing that?  How about:

- rename "klp_work" -> "klp_transition_work"
- rename "klp_work_fn" -> "klp_transition_work_fn" 

?

> Also I would call 

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Petr Mladek
On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.

> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -76,6 +76,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -1586,6 +1587,8 @@ static struct task_struct *copy_process(unsigned long 
> clone_flags,
>   p->parent_exec_id = current->self_exec_id;
>   }
>  
> + klp_copy_process(p);

I have doubts here. We copy the state from the parent here. It means
that the new process might still need to be converted. But at the same
time, print_context_stack_reliable() returns zero without printing
any stack trace when the TIF_FORK flag is set. It means that a freshly
forked task might get converted immediately. It seems that both
operations are always done when copy_process() is called. But
they are contradicting each other.

I guess that print_context_stack_reliable() should either return
-EINVAL when TIF_FORK is set. Or it should try to print the
stack of the newly forked task.

Or do I miss something, please?

> +
>   spin_lock(&p->sighand->siglock);
>  
>   /*

[...]

> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> new file mode 100644
> index 000..92819bb
> --- /dev/null
> +++ b/kernel/livepatch/transition.c
> +/*
> + * This function can be called in the middle of an existing transition to
> + * reverse the direction of the target patch state.  This can be done to
> + * effectively cancel an existing enable or disable operation if there are 
> any
> + * tasks which are stuck in the initial patch state.
> + */
> +void klp_reverse_transition(void)
> +{
> + struct klp_patch *patch = klp_transition_patch;
> +
> + klp_target_state = !klp_target_state;
> +
> + /*
> +  * Ensure that if another CPU goes through the syscall barrier, sees
> +  * the TIF_PATCH_PENDING writes in klp_start_transition(), and calls
> +  * klp_patch_task(), it also sees the above write to the target state.
> +  * Otherwise it can put the task in the wrong universe.
> +  */
> + smp_wmb();
> +
> + klp_start_transition();
> + klp_try_complete_transition();

It is a bit strange that we keep the work scheduled. It might be
better to use

   mod_delayed_work(system_wq, &klp_work, 0);

Which triggers more ideas from the nitpicking department:

I would move the work definition from core.c to transition.c because
it is closely related to klp_try_complete_transition();

While at it, I would make it clearer that the work is related
to the transition. Also, I would call queue_delayed_work() directly
instead of adding the klp_schedule_work() wrapper. The delay
might be defined using a constant, e.g.

#define KLP_TRANSITION_DELAY round_jiffies_relative(HZ)

queue_delayed_work(system_wq, &klp_transition_work, KLP_TRANSITION_DELAY);

Finally, the following is always called right after
klp_start_transition(), so I would call it from there.

if (!klp_try_complete_transition())
klp_schedule_work();
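
Putting the nitpicks together, transition.c could own the work item
along these lines (names as suggested above, purely illustrative):

#define KLP_TRANSITION_DELAY	round_jiffies_relative(HZ)

static void klp_transition_work_fn(struct work_struct *work);
static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);

void klp_start_transition(void)
{
	/* ... set TIF_PATCH_PENDING for all tasks as in the patch ... */

	/* try to finish right away; otherwise retry periodically */
	if (!klp_try_complete_transition())
		queue_delayed_work(system_wq, &klp_transition_work,
				   KLP_TRANSITION_DELAY);
}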


> +
> + patch->enabled = !patch->enabled;
> +}
> +

It is really great work! I have been checking this patch from the left, the
right, the top, and even the bottom, and it all seems to work well together.

Best Regards,
Petr