On Thu 2025-02-27 10:47:33, Yafang Shao wrote:
> The tasklist_lock in the KLP transition might result in high latency under
> specific workloads. We can replace it with RCU.
> 
> After a new task is forked, its kernel stack is always set to empty[0].
> Therefore, we can initialize these new tasks to the KLP_TRANSITION_IDLE
> state after they are forked. If these tasks are forked during the KLP
> transition but before klp_check_and_switch_task(), they can be safely
> skipped during klp_check_and_switch_task(). Additionally, if
> klp_ftrace_handler() is triggered right after forking, the task can
> determine which function to use based on the klp_target_state.
> 
> With the above change, we can safely convert the tasklist_lock to RCU.
> 
> Link: https://lore.kernel.org/all/20250213173253.ovivhuq2c5rmvkhj@jpoimboe/ [0]
> Link: https://lore.kernel.org/all/20250214181206.xkvxohoc4ft26uhf@jpoimboe/ [1]
> Suggested-by: Josh Poimboeuf <jpoim...@kernel.org>
> Signed-off-by: Yafang Shao <laoar.s...@gmail.com>
> ---
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -95,7 +95,13 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>  
>               patch_state = current->patch_state;
>  
> -             WARN_ON_ONCE(patch_state == KLP_TRANSITION_IDLE);
> +             /* If the patch_state is KLP_TRANSITION_IDLE, it means the task
> +              * was forked after klp_init_transition(). In this case, the
> +              * newly forked task can determine which function to use based
> +              * on the klp_target_state.
> +              */
> +             if (patch_state == KLP_TRANSITION_IDLE)
> +                     patch_state = klp_target_state;
>  

CPU0                            CPU1

task_0 (forked process):

funcA
  klp_ftrace_handler()
    if (patch_state == KLP_TRANSITION_IDLE)
       patch_state = klp_target_state
        # set to KLP_TRANSITION_PATCHED

  # redirected to klp_funcA()


                                echo 0 >/sys/kernel/livepatch/patch1/enabled

                                klp_reverse_transition()

                                  klp_target_state = !klp_target_state;
                                    # set to KLP_TRANSITION_UNPATCHED


   funcB
     if (patch_state == KLP_TRANSITION_IDLE)
       patch_state = klp_target_state
         # set to KLP_TRANSITION_UNPATCHED

   # staying in original funcB


BANG: livepatched and original function called at the same time

      => broken consistency model.

BTW: This is what I tried to warn you about at
     https://lore.kernel.org/r/z69wuhve2vnsr...@pathway.suse.cz

See below for more:

>               if (patch_state == KLP_TRANSITION_UNPATCHED) {
>                       /*
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index ba069459c101..af76defca67a 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -23,7 +23,7 @@ static DEFINE_PER_CPU(unsigned long[MAX_STACK_ENTRIES], klp_stack_entries);
>  
>  struct klp_patch *klp_transition_patch;
>  
> -static int klp_target_state = KLP_TRANSITION_IDLE;
> +int klp_target_state = KLP_TRANSITION_IDLE;
>  
>  static unsigned int klp_signals_cnt;
>  
> @@ -294,6 +294,14 @@ static int klp_check_and_switch_task(struct task_struct *task, void *arg)
>  {
>       int ret;
>  
> +     /*
> +      * If the patch_state remains KLP_TRANSITION_IDLE at this point, it
> +      * indicates that the task was forked after klp_init_transition().
> +      * In this case, it is safe to skip the task.
> +      */
> +     if (!test_tsk_thread_flag(task, TIF_PATCH_PENDING))
> +             return 0;
> +
>       if (task_curr(task) && task != current)
>               return -EBUSY;
>  
> @@ -466,11 +474,11 @@ void klp_try_complete_transition(void)
>        * Usually this will transition most (or all) of the tasks on a system
>        * unless the patch includes changes to a very common function.
>        */
> -     read_lock(&tasklist_lock);
> +     rcu_read_lock();
>       for_each_process_thread(g, task)
>               if (!klp_try_switch_task(task))
>                       complete = false;
> -     read_unlock(&tasklist_lock);
> +     rcu_read_unlock();
>  
>       /*
>        * Ditto for the idle "swapper" tasks.
> @@ -694,25 +702,10 @@ void klp_reverse_transition(void)
>  }
>  
>  /* Called from copy_process() during fork */
> -void klp_copy_process(struct task_struct *child)
> +void klp_init_process(struct task_struct *child)
>  {
> -
> -     /*
> -      * The parent process may have gone through a KLP transition since
> -      * the thread flag was copied in setup_thread_stack earlier. Bring
> -      * the task flag up to date with the parent here.
> -      *
> -      * The operation is serialized against all klp_*_transition()
> -      * operations by the tasklist_lock. The only exceptions are
> -      * klp_update_patch_state(current) and __klp_sched_try_switch(), but we
> -      * cannot race with them because we are current.
> -      */
> -     if (test_tsk_thread_flag(current, TIF_PATCH_PENDING))
> -             set_tsk_thread_flag(child, TIF_PATCH_PENDING);
> -     else
> -             clear_tsk_thread_flag(child, TIF_PATCH_PENDING);
> -
> -     child->patch_state = current->patch_state;
> +     clear_tsk_thread_flag(child, TIF_PATCH_PENDING);
> +     child->patch_state = KLP_TRANSITION_IDLE;

I thought that we might do:

        child->patch_state = klp_target_state;

to avoid the race in the klp_ftrace_handler() described above.

But the following might happen:

CPU0                            CPU1

klp_init_process(child)

  child->patch_state = klp_target_state
    # still KLP_TRANSITION_IDLE, no patch is being enabled yet

                                klp_enable_patch()
                                  # setup ftrace handlers
                                  # initialize all processes
                                  # in the task list

  # add "child" into the task list

  schedule()

[running child now]

funcA()

  klp_ftrace_handler()

    # sees child->patch_state == KLP_TRANSITION_IDLE

BANG: Same problem as with the original patch.


Hmm, the 2nd version of this patchset tried to do:

diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
index 90408500e5a3..5e523a3fbb3c 100644
--- a/kernel/livepatch/patch.c
+++ b/kernel/livepatch/patch.c
@@ -95,7 +95,12 @@ static void notrace klp_ftrace_handler(unsigned long ip,
 
                patch_state = current->patch_state;
 
-               WARN_ON_ONCE(patch_state == KLP_TRANSITION_IDLE);
+               /* If the patch_state is KLP_TRANSITION_IDLE, it indicates the
+                * task was forked after klp_init_transition(). For this newly
+                * forked task, it is safe to switch it to klp_target_state.
+                */
+               if (patch_state == KLP_TRANSITION_IDLE)
+                       current->patch_state = klp_target_state;
 
                if (patch_state == KLP_TRANSITION_UNPATCHED) {
                        /*

Note: It is broken. It sets current->patch_state but it later
      checks the local variable "patch_state" which is still
      KLP_TRANSITION_IDLE.
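
A fixed variant would assign to the local variable instead, which is
essentially the hunk quoted from the patch under review above:

		if (patch_state == KLP_TRANSITION_IDLE)
			patch_state = klp_target_state;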

But is it safe when we fix it?

It might be, as long as we run klp_synchronize_transition() between
updating the global @klp_target_state and the other operations.

In particular, we need to make sure that @klp_target_state always has
either KLP_TRANSITION_PATCHED or KLP_TRANSITION_UNPATCHED when
func->transition is set.

For this, we would need to add klp_synchronize_transition() into

  + klp_init_transition() between setting @klp_target_state
    and func->transition = true.

  + klp_complete_transition(), also for the KLP_TRANSITION_UNPATCHED case.
    It is currently called there only for the PATCHED target state.
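
For illustration, a rough sketch of the klp_init_transition() part
(simplified and untested, only the ordering matters here):

	klp_target_state = state;

	/*
	 * Make sure that no CPU can still observe KLP_TRANSITION_IDLE
	 * in @klp_target_state once func->transition is set below.
	 */
	klp_synchronize_transition();

	klp_for_each_object(patch, obj)
		klp_for_each_func(obj, func)
			func->transition = true;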

Will this be enough?

FACT: It is more complicated than it looked.
QUESTION: Is this worth the effort?

NOTE: The rcu_read_lock() does not solve the reported stall.
      We are spending time on it only because it looked nice and
      simple to you.

My opinion: I would personally prefer to focus on solving
        the stall and the use-after-free access in do_exit().

Best Regards,
Petr
