On Tue, Jun 21, 2011 at 12:10 PM, <[email protected]> wrote:
> Hi there,
>
> there is some support for multi-threaded processes in ltrace, but so far
> it was incomplete. Everything works if the threads stay away from each
> other, but as soon as they end up in the same area of code, it all
> breaks.
>
> The problem is due to return breakpoints. When two threads take the
> same function call, ltrace places two breakpoints over each other,
> because it has no concept of shared address space. There are many
> problems with this, and ltrace ends up seeing unexpected breakpoints,
> and SIGSEGVing the process.
>
> To solve this, ltrace must first learn that there is such a thing as a
> task and a thread group. Then it needs to store all the breakpoints in
> a structure shared by all the tasks in the thread group. To prevent
> races, before any breakpoint is temporarily disabled (for
> re-enablement, namely continue_after_breakpoint), all tasks in the
> thread group must be stopped.
>
> There is code on the branch pmachata/threads that implements this.
> Here's what the branch roughly does:
>
> - Process * leader; was added to struct Process. This points to the
>   process that is the thread group leader of the thread group that
>   this process is a member of.
>
> - proper interfaces were added for handling the set of processes and
>   their tasks (add_process, remove_process, each_process, each_task).
>   The iteration interfaces (each_*) use callbacks to do the real work.
>
> - interfaces were added for accessing the information about the
>   processes (process_leader, process_tasks, process_stopped,
>   process_status).
>
> - a new interface task_kill is a wrapper for the SYS_tkill system call
>   that is not wrapped by glibc. We use this to stop or continue a
>   single task.
>
> - when we need to stop tasks for breakpoint re-enablement, we send
>   SIGSTOP. This SIGSTOP has to be caught and sunk. While we wait for
>   the signal to be delivered, we pump all incoming events to an event
>   queue that was created for this purpose (each_qd_event, enque_event).
>   The interface next_event takes events from the queue if there are
>   any.
>
> - all this, the event interception, sinking of SIGSTOP etc., is very
>   platform specific. So a thread group can now have a registered event
>   handler (install_event_handler, destroy_event_handler). If present,
>   this is called at the beginning of handle_event. The registered
>   handler can do whatever it wishes with the event in question, and
>   return either NULL (if the event was handled or sunk) or the original
>   (possibly modified) event that is then handled by the default handler
>   as usual.
>
> - there have also been some small cleanups.
>
> For some reason, attaching to a running multi-threaded task doesn't
> work (this was one of the first things that I fixed, but apparently it
> got broken in the meantime), so that's what I'll be doing next.
>
> Then comes cleaning it all up and making the git history of my branch a
> bit less messy, at which point I'd ask some of you to review the (rather
> large) patch. I also need to verify that it works on non-x86
> architectures; so far I have only been working with x86_64. I'll keep
> you posted as my work progresses.
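
For readers following the task_kill item above, here is a minimal
sketch of what a wrapper around SYS_tkill could look like. It is not
taken from the pmachata/threads branch: the signature of task_kill is
an assumption, and current_tid is a hypothetical helper added only so
the sketch can be exercised on the calling thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for the syscall() declaration */
#endif
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send a signal to one task (thread) rather than to a whole process.
 * Assumed signature; the function on the branch may differ.
 * Returns 0 on success, or -1 with errno set, as syscall() does.
 */
int task_kill(pid_t tid, int sig)
{
	return (int)syscall(SYS_tkill, tid, sig);
}

/*
 * Hypothetical helper for this sketch only: the kernel task id of
 * the calling thread, fetched with the raw gettid system call.
 */
pid_t current_tid(void)
{
	return (pid_t)syscall(SYS_gettid);
}
```

Under these assumptions, stopping one task of a thread group would be
task_kill(tid, SIGSTOP), after which the tracer still has to wait for
the stop to be reported and sink it, as described in the quoted text.
Passing signal 0 only checks that the task exists and delivers nothing.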

Sounds great, I look forward to taking a look at the code when it is ready.
