On Wed, 12 Apr 2017 08:18:17 -0700
"Paul E. McKenney" <[email protected]> wrote:


> > Well the trampolines pretty much can, but they are removed before
> > calling synchronize_rcu_tasks(), and nothing can enter the trampoline
> > when that is called.  
> 
> Color me confused...
> 
> So you can have an arbitrary function call within a trampoline?

Sorta.

When you do register_ftrace_function(ops), where ops has ops->func that
points to a function you want to have called when a function is traced,
the following happens (if there's no other ops registered). Let's use
an example where ops is filtered on just the schedule() function call:


 <schedule>:
    call trampoline ---+
    [..]               |
                       +--> <trampoline>:
                              push regs
                              call ops->func
                              pop regs
                              ret

But that ops->func() must be very limited in what it can do. Although,
it may actually call an rcu_read_lock()! But if that's the case, it
must either check if rcu is watching (which perf does), or enable rcu
via the rcu_irq_enter() with a check on rcu_irq_enter_disabled(), which
my stack tracer does.

Now this can be called even from NMI context! Thus what ops->func does
must be aware of that. The stack tracer func has an:

  if (in_nmi())
        return;

Because it needs to grab spin locks.

But one thing an op->func() is never allowed to do, is to call
schedule() directly, or even a cond_resched(). It may be preempted if
preemption was enabled when the trampoline was hit, but it must not
assume that it can do a voluntary schedule. That would break the
rcu_tasks as well if it did.


> 
> If not, agreed, no problem.  Otherwise, it seems like we have a big
> problem remaining.  Unless the functions called from a trampoline are
> guaranteed never to do a context switch.

Well, they can be preempted, but they should never do a voluntary
schedule. If they did, that would be bad.

> 
> So what exactly is the trampoline code allowed to do?  ;-)

Well, it must be able to work in an NMI context, or bail otherwise. And
it should never schedule on its own.

>
> My problem is that I have no idea what can and cannot be included in
> trampoline code.  In absence of that information, my RCU-honed reflexes
> jump immediately to the worst case that I can think of.  ;-)
>

Lets just say that it can't voluntarily sleep. Would that be good
enough? If someday in the future I decide to let it do so, I would add
a flag and force that ops not to be able to use a dynamic trampoline.

Currently, without the synchronize_rcu_tasks(), when a dynamic ops is
registered, the functions will point to a non dynamic trampoline. That
is, one that is never freed. It simply does:

        preempt_disable_notrace();

        do_for_each_ftrace_op(op, ftrace_ops_list) {
                /*
                 * Check the following for each ops before calling their func:
                 *  if RCU flag is set, then rcu_is_watching() must be true
                 *  if PER_CPU is set, then ftrace_function_local_disable()
                 *                          must be false
                 *  Otherwise test if the ip matches the ops filter
                 *
                 * If any of the above fails then the op->func() is not 
executed.
                 */
                if ((!(op->flags & FTRACE_OPS_FL_RCU) || rcu_is_watching()) &&
                    (!(op->flags & FTRACE_OPS_FL_PER_CPU) ||
                     !ftrace_function_local_disabled(op)) &&
                    ftrace_ops_test(op, ip, regs)) {
                    
                        if (FTRACE_WARN_ON(!op->func)) {
                                pr_warn("op=%p %pS\n", op, op);
                                goto out;
                        }
                        op->func(ip, parent_ip, op, regs);
                }
        } while_for_each_ftrace_op(op);
out:
        preempt_enable_notrace();

-- Steve

Reply via email to