Hi,

A year or so ago, we had a discussion about a signal notification problem with perfmon and self-monitoring multi-threaded sampling programs.

Perfmon notification uses signals. This is the only mechanism available for asynchronous notifications. For user-level sampling (no kernel buffer), you get a notification for each counter overflow. For kernel-buffer sampling, you get a notification whenever the buffer fills up (default format).

Typically with self-sampling you want the signal to be delivered to the thread that caused the overflow. This is not only convenient, it may be required, because a program may want to modify the thread's state when it gets a sample.

POSIX does not mandate that asynchronous signals be delivered to the thread from which they originate. The signal can be delivered to any thread within the process. Synchronous signals, e.g., SIGFPE or SIGTRAP, are delivered to the thread that caused the event.

Perfmon uses the standard fcntl() mechanism to request asynchronous notifications on a file descriptor:

    flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_ASYNC);
    fcntl(fd, F_SETOWN, getpid());

By default, the SIGIO signal is used. This can be overridden using the (non-standard) F_SETSIG command to fcntl(). SIGIO is an asynchronous signal.

The Linux kernel maintains two pending-signal queues:
  - one queue private to each thread
  - one queue shared by all threads of a process

Which queue is used depends on where the signal comes from. If you get a floating point exception, the signal is pended to the private queue. If you come in for an asynchronous notification on a file descriptor, the signal is pended to the shared queue. Note that changing the signal via F_SETSIG does not alter this behavior.

By definition, any thread can pull from the shared queue. So why does the signal seem to be delivered to the right thread with perfmon?

Once the kernel pends the signal, it needs to select a thread to wake up or signal. That thread will have a TIF flag set and will go pull the signal from the queue.
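
To make the two pending queues concrete, here is a minimal sketch that does not involve perfmon at all. It assumes Linux with POSIX threads, and the names probe_queues, pending_view, worker and gate are made up for this illustration. Both signals are kept blocked so they stay pending; a process-directed signal sent with kill() should then be visible from every thread through sigpending(), while a thread-directed signal sent with pthread_kill() should be visible only from the targeted thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical names throughout; only the libc/pthread calls are real. */
struct pending_view {
    bool main_sees_shared;    /* SIGUSR1 sent with kill(): process-directed */
    bool main_sees_private;   /* SIGUSR2 sent with pthread_kill() to worker */
    bool worker_sees_shared;
    bool worker_sees_private;
};

/* Held by the main thread until both signals have been sent. */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    struct pending_view *v = arg;
    sigset_t pend;

    /* Wait until the main thread has sent both signals. */
    pthread_mutex_lock(&gate);
    pthread_mutex_unlock(&gate);

    /* sigpending() reports the private and shared queues combined. */
    sigpending(&pend);
    v->worker_sees_shared  = sigismember(&pend, SIGUSR1) == 1;
    v->worker_sees_private = sigismember(&pend, SIGUSR2) == 1;
    return NULL;
}

/* Returns 0 on success and fills *out, -1 on setup failure. */
int probe_queues(struct pending_view *out)
{
    sigset_t set, old, pend;
    struct timespec zero = { 0, 0 };
    pthread_t tid;

    assert(out != NULL);

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigaddset(&set, SIGUSR2);
    /* Block both signals; the worker inherits this mask. */
    if (pthread_sigmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    pthread_mutex_lock(&gate);
    if (pthread_create(&tid, NULL, worker, out) != 0) {
        pthread_mutex_unlock(&gate);
        pthread_sigmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    kill(getpid(), SIGUSR1);     /* pended to the shared queue */
    pthread_kill(tid, SIGUSR2);  /* pended to the worker's private queue */
    pthread_mutex_unlock(&gate);

    sigpending(&pend);
    out->main_sees_shared  = sigismember(&pend, SIGUSR1) == 1;
    out->main_sees_private = sigismember(&pend, SIGUSR2) == 1;

    pthread_join(tid, NULL);

    /* Drain what is still pending so restoring the mask is harmless. */
    while (sigtimedwait(&set, NULL, &zero) > 0)
        ;
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return 0;
}
```

Calling probe_queues() from a program with no other threads should report the kill() signal as pending in both threads and the pthread_kill() signal as pending only in the worker. An O_ASYNC notification on a file descriptor behaves like the kill() case here.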
Signals are first pulled from the private queue, then from the shared queue, i.e., the private queue has higher priority (which is what you want).

If possible, the kernel first tries to use the thread in which the event occurred. If that is not possible, it iterates over the other threads. A thread is selected if:
  1 - it does not have the signal blocked
  2 - it is not exiting
  3 - it does not have ANY signal pending

Based on the criteria above, the reason it works most of the time for perfmon is that when you get the overflow notification, you do not have another perfmon-related signal pending. But you run into the problem if the monitored process is using signals. For instance, if the program is using SIGALRM, then SIGIO may be delivered to the wrong thread. If your program does not use any signals, then you may be okay (assuming libpthread does not use signals internally).

I do not have a good fix for this now but should have one by next week.

Hopefully this clarifies all questions about this problem.
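
The thread selection rules listed above can be sketched as a small user-space model. This is a toy illustration of the three criteria as described in this message, not kernel code: thread_model, eligible and pick_target are hypothetical names, and the real kernel logic has additional cases that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread state, reduced to the three criteria. */
struct thread_model {
    bool blocks_signal;  /* criterion 1: the signal is blocked          */
    bool exiting;        /* criterion 2: the thread is exiting          */
    bool has_pending;    /* criterion 3: ANY signal is already pending  */
};

/* A thread qualifies only if none of the three conditions holds. */
static bool eligible(const struct thread_model *t)
{
    return !t->blocks_signal && !t->exiting && !t->has_pending;
}

/*
 * Pick the thread to wake up for a signal on the shared queue.
 * 'origin' is the index of the thread in which the event occurred.
 * Returns the index of the chosen thread, or -1 if no thread
 * qualifies (in this model the signal then simply stays queued).
 */
int pick_target(const struct thread_model *threads, size_t n, size_t origin)
{
    assert(threads != NULL && origin < n);

    /* First choice: the thread that caused the event. */
    if (eligible(&threads[origin]))
        return (int)origin;

    /* Otherwise walk over the other threads, starting after origin. */
    for (size_t i = 1; i < n; i++) {
        size_t idx = (origin + i) % n;
        if (eligible(&threads[idx]))
            return (int)idx;
    }
    return -1;
}
```

With three otherwise idle threads, pick_target() returns the originating thread. If the originating thread already has a signal pending (say a SIGALRM), the model picks the next thread instead, which is the misdelivery described above.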
------------------------------------------------------------------------------
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel