Hi David,

I have a theory.  But before I present it, I would like a bit more
information.

    - Output of "/usr/sbin/prtdiag"
    - If it is a Sun Fire, the output of "cfgadm -av"

Thanks,
Sherry

On Wed, Aug 31, 2005 at 02:13:47PM -0700, David McDaniel wrote:
> Our application consists of multiple cooperating multithreaded processes. The 
> application is both latency and throughput sensitive. Since it originated 
> long ago, several artifacts are less than optimal, but that's the way it is
> for a while longer. Anyway, I digress. Most threads run in the TS class with
> boosted "nice" values so as to limit the possible interference from the
> occasional background task; the exceptions are a few very lightweight,
> infrequent, but urgent ones that run in the RT class. Additionally,
> hires_tick is set. As a result, the default rechoose_interval period is 3ms rather
> than the normal 30ms.
>   The curious thing is that on a 12 cpu system I can observe that some cpus 
> are much busier than others, and latency as observed via prstat -Lm is higher 
> than expected on a lightly loaded system. I presume this is an artifact of 
> threads queueing up for rechoose_interval on the last cpu they ran on instead 
> of migrating. This seems to be borne out by the fact that I can use psrset to
> create a set containing one of the otherwise idle cpus, bind a process to it, 
> then delete the processor set and see that the previously bound process 
> appears to stick on the previously idle cpu. OK, so far, but the other 
> processes still seem to be contending for busy cpus, which is suboptimal for
> our application.
>   Now comes the real puzzler, to me at least. I set rechoose_interval=0 in 
> /etc/system, reboot, take it from the top. I thought this would result in the
> load being spread out over time as threads migrated to and then stuck to
> uncontended cpus, but that's not what I see. Here is an mpstat snapshot:
> CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
>   0   14   0  211 2976 1957 2245   84  375  156   0 20874  17   9  0  74
>   1    0   0  149   98    2 2612   89  583   73   0 19032  16  11  0  73
>   2   12   0  184   86    6 2523   76  589   76   0 17215  13   9  0  77
>   3    0   0   96  650  581 2387   64  530   85   0 13249  11   7  0  82
>   8   56   0   11    6    1  581    2  227   25   0  1401   2   2  0  97
>   9    0   0    6    4    1  550    0  111    8   0   398   1   1  0  98
>  10    5   0    8   28   25  546    0   44   14   0   165   0   1  0  99
>  11    0   0   16  390  388  219    0   23   18   0    75   0   1  0  99
>  16   52   0   13   10    7  223    1   22    5   0   212   0   1  0  99
>  17    0   0    5    4    1  322    0   34    5   0   525   0   1  0  99
>  18    0   0   15    5    1  319    1   86   11   0  1558   1   1  0  98
>  19    1   0   50    8    1  552    4  192   22   0  4406   4   2  0  94
> 
>   Any thoughts?
> This message posted from opensolaris.org
> _______________________________________________
> perf-discuss mailing list
> perf-discuss@opensolaris.org

-- 
[EMAIL PROTECTED], Solaris Kernel Development, http://blogs.sun.com/sherrym