Zeljko Vrba wrote:
> On Fri, Apr 11, 2008 at 11:23:10AM -0700, David Lutz wrote:
>> Take a look at your cache miss rates as you cross the 2^11 boundary.
>> My guess is that you will see something start to go through the roof.
>>
> cputrack has too much overhead when having a bunch of LWPs. I did run
> cpustat though, in parallel with my experiment, with the following events
> on AMD64; the interval was 1 second:
>
> pic0=DC_miss,pic1=DC_dtlb_L1_miss_L2_miss,pic2=IC_itlb_L1_miss_L2_miss
>
> The number of data cache misses _does_ increase too, but what's worse is
> DTLB and ITLB misses. Both roughly double with the number of threads, but
> the number of ITLB misses saturates at ~470k/s, and this saturation happens
> at the transition between 2048 and 4096 threads.
>
> All threads are executing the same code, which is rather small -- so I see
> no reason for this linear increase in the number of ITLB misses with the
> number of threads. OK, more threads = more user<->kernel transitions. Does
> Solaris make use of the global bit in page directories/tables?
What's the size of the relevant TLBs? With text, stack and heap mappings
for all threads, this result isn't terribly surprising.

Solaris cannot use the global bit for user mappings, since the locations
of libraries, etc., aren't fixed. The kernel mappings are global if the
CPU supports that.

- Bart

--
Bart Smaalders                  Solaris Kernel Performance
[EMAIL PROTECTED]               http://blogs.sun.com/barts
"You will contribute more with mercurial than with thunderbird."
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org