I'd like to ask you to clarify what utrace->stopped means... My understanding is: if we see ->stopped == true under utrace->lock, then the target can do nothing "interesting" from the utrace's pov. The target should take utrace->lock at least once. Either in finish_utrace_stop(), or, if ->stopped was set by do_signal_stop() path, the target will call tracehook_get_signal()->utrace_get_signal(). So we can assume the target is "quiescent" and we can do, for example, UTRACE_DETACH safely.
Is this correct? But utrace_report_jctl() doesn't look right to me, spin_lock(&utrace->lock); utrace->stopped = 0; utrace->report = 0; spin_unlock(&utrace->lock); I must admit, I dont't understand the comment above, but obviously this is right, we should clear ->stopped. If nothing else, REPORT()->start_report() won't be happy if ->stopped. But ->stopped can be restored right after we clear it! Yes, utrace_do_stop() and utrace_set_events() set ->stopped == 1 only if ->utrace_flags has no JCTL, and since we are here we must have JCTL. But, if we enter utrace_report_jctl() with ->stopped == 1, JCTL can be already removed from ->utrace_flags, exactly because ->stopped was true. No? This leads to another minor question, how it is possible to enter enter utrace_report_jctl() with ->stopped == 1 ? I think the only possibility it was previously set by another call to utrace_report_jctl(), see below. REPORT(task, utrace, &report, UTRACE_EVENT(JCTL), report_jctl, was_stopped ? CLD_STOPPED : CLD_CONTINUED, what); if (was_stopped && !task_is_stopped(task)) { /* * The event report hooks could have blocked, though * it should have been briefly. Make sure we're in * TASK_STOPPED state again to block properly, unless * we've just come back out of job control stop. */ Yes. Even a plain kmalloc() can change ->state to TASK_RUNNING, spin_lock_irq(&task->sighand->siglock); if (task->signal->flags & SIGNAL_STOP_STOPPED) __set_current_state(TASK_STOPPED); SIGNAL_STOP_STOPPED is not reliable, it is possible that the group stop in progress and it is not finished yet. But ->group_stop_count is not reliable too. It it possible that we recieved SIGCONT and then another SIGSTOP. If another thread has already dequeued this SIGSTOP and initiated the new group stop, we can't just set TASK_STOPPED, we must participate in the ->group_stop_count accounting. if (task_is_stopped(current)) { /* * While in TASK_STOPPED, we can be considered safely * stopped by utrace_do_stop() only once we set this. */ spin_lock(&utrace->lock); utrace->stopped = 1; spin_unlock(&utrace->lock); I think this is correct, but it is not easy to understand. SIGCONT may come right after the task_is_stopped() check, so this _looks_ racy. But, nobody should clear ->utrace_flags without calling utrace_wakeup() which clears ->stopped too. This means that the target can't escape from get_signal_to_deliver() with the ->stopped == 1. And in fact, we could check was_stopped instead of task_is_stopped(). Is my understanding correct? But! can't we miss utrace_wakeup() ? Let's suppose the debugger D attaches the single engine E to the target T. D does utrace_set_events(JCTL). T calls do_signal_stop(), tracehook_notify_jctl() sees JCTL and starts utrace_report_jctl(). D does utrace_set_events(events => 0), this clears E->flags. T calls REPORT(), nobody needs needs JCTL, finish_report() sees !->takers and calls utrace_reset(). It sets ->utrace_flags = 0. T checks task_is_stopped(), sets ->stopped = 1. Now, when T is woken by SIGCONT, it returns to user-space bypassing all utrace hooks, and runs with ->stopped == 1. This doesn't look right. Say, D can do utrace_set_events(ANY) and then T hits start_report()->BUG_ON(utrace->stopped). Could you clarify? Oleg.