Hi David,

I have a theory. But before I present it, I would like a bit more
information.
- Output of "/usr/sbin/prtdiag"
- If it is a Sun Fire, the output of "cfgadm -av"

Thanks,
Sherry

On Wed, Aug 31, 2005 at 02:13:47PM -0700, David McDaniel wrote:
> Our application consists of multiple cooperating multithreaded processes.
> The application is both latency and throughput sensitive. Since it
> originated long ago, several artifacts are less than optimal, but that's
> the way it is for a while longer. Anyway, I digress. Most threads run in
> the TS class with boosted "nice" values so as to limit the possible
> interference from the occasional background task; the exceptions are a
> few very lightweight, infrequent, but urgent ones that run in the RT
> class. Additionally, hires_tick is set. As a result, the default
> rechoose_interval period is 3ms rather than the normal 30ms.
>
> The curious thing is that on a 12 cpu system I can observe that some cpus
> are much busier than others, and latency as observed via prstat -Lm is
> higher than expected on a lightly loaded system. I presume this is an
> artifact of threads queueing up for rechoose_interval on the last cpu
> they ran on instead of migrating. This seems to be borne out by the fact
> that I can use psrset to create a set containing one of the otherwise
> idle cpus, bind a process to it, then delete the processor set and see
> that the previously bound process appears to stick on the previously idle
> cpu. OK so far, but the other processes still seem to be contending for
> busy cpus, which is suboptimal for our application.
>
> Now comes the real puzzler, to me at least. I set rechoose_interval=0 in
> /etc/system, reboot, take it from the top. I thought this would result in
> the load being spread out over time as threads migrated to and then stuck
> to uncontended cpus, but that's not what I see.
> Here is an mpstat snapshot:
>
> CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
>   0   14   0  211 2976 1957 2245   84  375  156   0 20874  17   9  0  74
>   1    0   0  149   98    2 2612   89  583   73   0 19032  16  11  0  73
>   2   12   0  184   86    6 2523   76  589   76   0 17215  13   9  0  77
>   3    0   0   96  650  581 2387   64  530   85   0 13249  11   7  0  82
>   8   56   0   11    6    1  581    2  227   25   0  1401   2   2  0  97
>   9    0   0    6    4    1  550    0  111    8   0   398   1   1  0  98
>  10    5   0    8   28   25  546    0   44   14   0   165   0   1  0  99
>  11    0   0   16  390  388  219    0   23   18   0    75   0   1  0  99
>  16   52   0   13   10    7  223    1   22    5   0   212   0   1  0  99
>  17    0   0    5    4    1  322    0   34    5   0   525   0   1  0  99
>  18    0   0   15    5    1  319    1   86   11   0  1558   1   1  0  98
>  19    1   0   50    8    1  552    4  192   22   0  4406   4   2  0  94
>
> Any thoughts?
> This message posted from opensolaris.org
> _______________________________________________
> perf-discuss mailing list
> perf-discuss@opensolaris.org

--
[EMAIL PROTECTED], Solaris Kernel Development, http://blogs.sun.com/sherrym
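
The imbalance in the quoted mpstat snapshot can be put into numbers with a short script. The following is only a minimal sketch in Python, not part of the original exchange: the function names parse_mpstat and busy_by_cpu are hypothetical, "busy" is assumed to mean usr + sys, and the rows are copied from the snapshot quoted in this message.

```python
# Rows copied verbatim from the mpstat snapshot quoted in this message.
MPSTAT = """\
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 14 0 211 2976 1957 2245 84 375 156 0 20874 17 9 0 74
1 0 0 149 98 2 2612 89 583 73 0 19032 16 11 0 73
2 12 0 184 86 6 2523 76 589 76 0 17215 13 9 0 77
3 0 0 96 650 581 2387 64 530 85 0 13249 11 7 0 82
8 56 0 11 6 1 581 2 227 25 0 1401 2 2 0 97
9 0 0 6 4 1 550 0 111 8 0 398 1 1 0 98
10 5 0 8 28 25 546 0 44 14 0 165 0 1 0 99
11 0 0 16 390 388 219 0 23 18 0 75 0 1 0 99
16 52 0 13 10 7 223 1 22 5 0 212 0 1 0 99
17 0 0 5 4 1 322 0 34 5 0 525 0 1 0 99
18 0 0 15 5 1 319 1 86 11 0 1558 1 1 0 98
19 1 0 50 8 1 552 4 192 22 0 4406 4 2 0 94
"""


def parse_mpstat(text):
    """Turn whitespace-separated mpstat output into one dict per CPU row."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, map(int, line.split()))) for line in lines[1:]]


def busy_by_cpu(rows):
    """Map CPU id to busy percentage, taken here as usr + sys (an assumption)."""
    return {row["CPU"]: row["usr"] + row["sys"] for row in rows}


rows = parse_mpstat(MPSTAT)
busy = busy_by_cpu(rows)
migrations = {row["CPU"]: row["migr"] for row in rows}

# List CPUs from busiest to least busy, with their thread migration counts.
for cpu in sorted(busy, key=busy.get, reverse=True):
    print(f"cpu {cpu:2d}: {busy[cpu]:3d}% busy, {migrations[cpu]:4d} migrations")
```

Sorting by busy percentage makes it easy to see whether the load and the thread migrations cluster on the same few CPUs, which is the pattern described in the quoted message.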