Richard,

On Fri, May 04, 2007 at 01:21:17PM -0400, Richard C Bilson wrote:
> > From [EMAIL PROTECTED] Wed May 2 12:33:44 2007
> >
> > I spent some more time on this today. I am still under the impression
> > that there is a race between F_SETOWN/F_SETSIG and a counter overflow
> > with notification.
> >
> > Is there a way you could modify your program such that no thread
> > calls pfm_start() before all setups are done? Then I suspect the
> > problem will disappear.
>
> I hate to be the bearer of bad news, but I have done as you suggest and
> the problem remains. If you'd care to see my current test code it is at
> http://plg.uwaterloo.ca/~rcbilson/sigio.cc
>

You've already ruined my election week-end ;-<

Anyway, I have modified your program to instrument the bad condition.
In particular, I wanted to know which is wrong: the si_fd or the thread
receiving the SIGIO.

# sigio 3 (3 threads created)
created th=1082132832 i=0
created th=1090525536 i=1
created th=1098918240 i=2
[FIXED_CTRL(pmc2)=0xa0 pmi0=1 en0=0x0 pmi1=1 en1=0x2 pmi2=1 en2=0x0] UNHALTED_CORE_CYCLES
[FIXED_CTR1(pmd1)]
[GLOBAL_CTRL(pmc0)=0x200000000 en0=0 en1=0 fen0=0 fen1=1 fen2=0]
th=1082132832 id=0 fd=3
th=1082132832 fd=3 start tid=7968 pid=7967
[FIXED_CTRL(pmc2)=0xa0 pmi0=1 en0=0x0 pmi1=1 en1=0x2 pmi2=1 en2=0x0] UNHALTED_CORE_CYCLES
[FIXED_CTR1(pmd1)]
[GLOBAL_CTRL(pmc0)=0x200000000 en0=0 en1=0 fen0=0 fen1=1 fen2=0]
th=1098918240 id=2 fd=4
th=1098918240 fd=4 start tid=7970 pid=7967
[FIXED_CTRL(pmc2)=0xa0 pmi0=1 en0=0x0 pmi1=1 en1=0x2 pmi2=1 en2=0x0] UNHALTED_CORE_CYCLES
[FIXED_CTR1(pmd1)]
[GLOBAL_CTRL(pmc0)=0x200000000 en0=0 en1=0 fen0=0 fen1=1 fen2=0]
th=1090525536 id=1 fd=5
th=1090525536 fd=5 start tid=7969 pid=7967
Runtime error (UNIX pid:7967) si_fd=4 fd=3 th=1082132832

Here si_fd=4, which is the fd of a thread that has started monitoring.
Yet the thread that owns fd=3 has also started, so it is not obvious
which of the two is wrong. I would tend to think it's the thread. What
do you see in your setup?
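
For reference, here is a minimal sketch of the si_fd check on its own. It uses an ordinary pipe instead of a perfmon descriptor and runs in a single thread, both of which are assumptions made to keep it self-contained. The helper names arm_async_fd() and sigio_reported_fd() are made up for illustration and are not taken from sigio.cc. O_ASYNC is enabled last, after F_SETOWN and F_SETSIG, so that no notification can fire before ownership is set up.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes F_SETSIG in <fcntl.h> on glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from sigio.cc): route async notifications for
 * fd to this process as signal signo. A nonzero F_SETSIG value makes the
 * kernel attach a siginfo carrying si_fd. O_ASYNC is turned on last.
 */
static int arm_async_fd(int fd, int signo)
{
    int flags;

    if (fcntl(fd, F_SETOWN, getpid()) == -1)
        return -1;
    if (fcntl(fd, F_SETSIG, signo) == -1)
        return -1;
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_ASYNC) == -1 ? -1 : 0;
}

/*
 * Single-threaded check: arms the read end of a pipe, writes one byte to
 * the other end, and returns the si_fd reported with the resulting SIGIO,
 * or -1 on error or timeout. *armed_fd receives the descriptor that was
 * armed so the caller can compare the two.
 */
int sigio_reported_fd(int *armed_fd)
{
    int p[2];
    int result = -1;
    sigset_t set, old;
    siginfo_t info;
    struct timespec timeout = { 2, 0 };   /* give up after 2 seconds */

    assert(armed_fd != NULL);
    if (pipe(p) == -1)
        return -1;
    *armed_fd = p[0];

    /* Block SIGIO so it stays pending and can be fetched synchronously. */
    sigemptyset(&set);
    sigaddset(&set, SIGIO);
    if (sigprocmask(SIG_BLOCK, &set, &old) == -1)
        goto out;

    if (arm_async_fd(p[0], SIGIO) == 0 &&
        write(p[1], "x", 1) == 1 &&
        sigtimedwait(&set, &info, &timeout) == SIGIO)
        result = info.si_fd;              /* descriptor the kernel reported */

    sigprocmask(SIG_SETMASK, &old, NULL);
out:
    close(p[0]);
    close(p[1]);
    return result;
}
```

Comparing the return value of sigio_reported_fd(&fd) with fd is the single-threaded equivalent of the check that fails above. It says nothing about which thread receives the signal in the threaded case, which is the open question here.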

I looked at the kernel code (see fs/fcntl.c) and it is not clear what is
wrong. Somehow, it seems like the kernel picks the wrong thread. What
worries me is the following loop in send_sigio():

	do_each_pid_task(pid, type, p) {
		send_sigio_to_task(p, fown, fd, band);
	} while_each_pid_task(pid, type, p);

I don't quite understand what is going on with struct pid *, but this
could potentially send to multiple threads or to the wrong thread. More
investigation is needed. It may be that there is no way to target the
SIGIO to a particular thread for each descriptor.

--
-Stephane
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
