(change subject) So. We are going to make a separate, dynamically allocated structure for tracees. Say, we add "struct ptrace_child *ptrace_child" into task_struct.
attach/attachme do kmalloc() and use task_lock() to avoid races. (with the current locking write_lock(tasklist) alone is enough). Eventually, we should move all ptrace-related fields from task_struct to ptrace_child (ptrace_message/last_siginfo/etc), but let's forget about them for now. So, for the start we have (of course, I use the random naming). struct ptrace_child { // was task_struct->ptrace unsigned int ptrace_flags; // moved from task_struct struct list_entry ptrace_entry; // new, points back to the child. // at least, the tracer needs it when it does // list_for_each(current->ptraced). struct task_struct *self; The first question, should we free it after detach? I think yes. Otherwise, for example, instead of "if (p->ptrace)" we should do "if (p->ptrace_child && p->ptrace_child->ptrace_flags)", not good. So ->ptrace_child != NULL means its traced (and we can even kill PT_PTRACED). Looks like we don't even need task_lock() to reset ->ptrace_child... But. Until we rework the locking (afaics, ->ptrace_mutex should be held throughout the while sys_ptrace() call), even ptracer can't safely access child->ptrace_child->ptrace_flags. Once sys_ptrace() drops tasklist, child's sub-thread can do exec, and de_thread() can untrace the child and free ->ptrace_child. That is why I think ->ptrace_mutex should come first, but it is very possible I missed something... Of course, we can add a lot of task_lock's, but this a) complicates the code and b) ptrace_release_task() should take task_lock() too under write_lock(tasklist). And of course, we should also move task_struct->parent into ptrace_child, // was task_struct->parent struct task_struct *ptrace_parent; And now we have the new minor problem. Say, tracehook_tracer_task(). How can it safely read ->ptrace_child->ptrace_parent ? Looks like we should make ->ptrace_child rcu-safe. Actually, I'd prefer to rely on task_lock(), but again, afaics this means we should do the locking first. Anything else I missed? Oleg.