On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

[...]

> @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>       ops = container_of(fops, struct klp_ops, fops);
>  
>       rcu_read_lock();
> +
>       func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
>                                     stack_node);
> -     rcu_read_unlock();
>  
>       if (WARN_ON_ONCE(!func))
> -             return;
> +             goto unlock;
> +
> +     if (unlikely(func->transition)) {
> +             /* corresponding smp_wmb() is in klp_init_transition() */
> +             smp_rmb();
> +
> +             if (current->klp_universe == KLP_UNIVERSE_OLD) {
> +                     /*
> +                      * Use the previously patched version of the function.
> +                      * If no previous patches exist, use the original
> +                      * function.
> +                      */
> +                     func = list_entry_rcu(func->stack_node.next,
> +                                           struct klp_func, stack_node);
> +
> +                     if (&func->stack_node == &ops->func_stack)
> +                             goto unlock;
> +             }
> +     }
>  
>       klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +unlock:
> +     rcu_read_unlock();
>  }

I wanted to understand the code better before answering the email about the 
race, and I think I found another problem.

Imagine we patched some function foo() with foo_1() from patch_1 and now 
we'd like to patch it again with foo_2() in patch_2. __klp_enable_patch 
calls klp_init_transition, which sets klp_universe for all processes to 
KLP_UNIVERSE_OLD and marks foo_2() for transition (func->transition is 
going to be 1). Then __klp_enable_patch adds foo_2() to the RCU-protected 
list for foo(). BUT what if somebody calls foo() right between 
klp_init_transition and the loop in __klp_enable_patch? The ftrace handler 
first fetches the first entry in the list, which is foo_1() (foo_2() is 
not present yet), then it checks func->transition. It is 1. It checks 
current->klp_universe, which is KLP_UNIVERSE_OLD, and so the next entry is 
retrieved. There is none, and therefore the original foo() is called. This 
is obviously wrong because foo_1() was expected.
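
To make the window concrete, here is a rough user-space model of the 
decision the quoted handler makes. It is only a sketch: the model_* names 
and the RUN_* values are made up for illustration and are not kernel 
definitions, a NULL next pointer stands in for reaching the list head, and 
it simply assumes, as the scenario above does, that the top entry is read 
with transition == 1 while foo_2() is not on the list yet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum model_universe { MODEL_UNIVERSE_OLD, MODEL_UNIVERSE_NEW };
enum model_target { RUN_ORIGINAL, RUN_FOO_1, RUN_FOO_2 };

struct model_func {
	enum model_target target;  /* which replacement this entry stands for */
	int transition;            /* models func->transition */
	struct model_func *next;   /* next older entry, NULL at the end */
};

/*
 * Mirrors the decision in the quoted handler: take the top entry, and
 * step to the next older one if that entry is in transition and the
 * task is still in the old universe. RUN_ORIGINAL means the handler
 * leaves the pc alone, so the unpatched foo() runs.
 */
enum model_target model_handler(const struct model_func *top,
				enum model_universe universe)
{
	const struct model_func *func = top;

	if (!func)
		return RUN_ORIGINAL;

	if (func->transition && universe == MODEL_UNIVERSE_OLD) {
		func = func->next;
		if (!func)
			return RUN_ORIGINAL;
	}

	return func->target;
}

/*
 * Window from the scenario: foo_2 is not on the stack yet and the top
 * entry is read with transition == 1 (assumption taken from the text).
 */
enum model_target model_call_in_window(void)
{
	struct model_func foo_1 = { RUN_FOO_1, 1, NULL };

	return model_handler(&foo_1, MODEL_UNIVERSE_OLD);
}

/* After the loop: foo_2 (in transition) sits on top of foo_1. */
enum model_target model_call_after_loop(void)
{
	struct model_func foo_1 = { RUN_FOO_1, 0, NULL };
	struct model_func foo_2 = { RUN_FOO_2, 1, &foo_1 };

	return model_handler(&foo_2, MODEL_UNIVERSE_OLD);
}

/* Checks both cases; meant to be called from a small test main(). */
void model_self_check(void)
{
	assert(model_call_in_window() == RUN_ORIGINAL);
	assert(model_call_after_loop() == RUN_FOO_1);
}
```

In this model the call inside the window falls through to the original 
foo(), while the same call made once foo_2() is on the list resolves to 
foo_1(). Whether the real handler can actually observe transition == 1 on 
the top entry in that window is exactly the open question.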

Everything would work fine if foo() were called before 
klp_start_transition and after the loop in __klp_enable_patch. The 
solution might be to move the setting of func->transition to 
klp_start_transition, but this could break something else. I don't 
know yet.

Am I wrong?

Miroslav