Jim Mauro has provided an excellent starting point. Keep in mind that kernel threads will show up as pid 0, so you may be seeing a kernel thread causing the activity.
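Since everything running in kernel context is lumped under pid 0, aggregating by PID alone cannot tell you which kernel code path is issuing the cross-calls. One way to drill down is to key the aggregation on the kernel stack instead. The following is only a minimal sketch, assuming the sysinfo provider's xcalls probe is available on your system; the aggregation name, the 10-second interval, and the top-10 cutoff are arbitrary choices of mine, not anything from the original thread.

```
#!/usr/sbin/dtrace -s

#pragma D option quiet

/*
 * Count cross-calls issued while running as pid 0 (kernel threads),
 * keyed by the kernel stack that led to the cross-call.
 */
sysinfo:::xcalls
/pid == 0/
{
        @stacks[stack()] = count();
}

/*
 * Every 10 seconds (arbitrary interval), keep only the 10 most
 * frequent stacks, print them, then clear the aggregation.
 */
tick-10sec
{
        trunc(@stacks, 10);
        printa(@stacks);
        trunc(@stacks);
}
```

Run it as root while the stall is happening. The stacks with the highest counts should point at the kernel subsystem responsible.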
Jim L

----------Original Message----------
From: Jim Leonard <trix...@oldskool.org>
Sent: Tue, September 22, 2009 11:31 AM
To: dtrace-discuss@opensolaris.org
Subject: [dtrace-discuss] How to drill down cause of cross-calls in the kernel? (output provided)

We have a 16-core x86 system that, at seemingly random intervals, will completely stop responding for several seconds. Running an mpstat 1 showed something horrifying:

CPU minf mjf    xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
  0    0   0 1004691  397  170   0    0    0    5   0     0   0 100  0   0
(rest of CPUs omitted)

That's over a million cross-calls a second. Seeing them on CPU0 made me nervous that they were kernel-related, so I wrote a DTrace script to print out xcalls per second aggregated by PID to see if a specific process was the culprit. Here's the output during another random system outage:

2009 Sep 22 12:51:49, load average: 5.90, 5.35, 5.39 xcalls: 637511

     PID XCALLCOUNT
    6164         15
    6165         15
   28339         26
       0     637455

PID 0 is "sched" (aka the kernel). At this point I'm completely stumped as to what could be causing this. Any hints or ideas?

--
This message posted from opensolaris.org
_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org