>   Let me followup with a question. In this application, processes have
> not only their "own" memory, ie heap, stack program text and data, etc,
> but they also share a moderately large (~ 2-5GB today) amount of memory
> in the form of mmap'd files. From Sherry Moore's previous posts, I'm
> assuming that at startup time that would actually be all allocated in
> one board. Since I'm contemplating moving processes onto psrsets off
> that board, would it be plausible to assume that I might get slightly
> better net throughput if I could somehow spread that across all the
> boards? I know its speculation of the highest order, so maybe my real
> question is whether that's even worth testing.

As Jonathan said, the kernel will try to spread out the memory allocated for 
large shared memory segments for the reasons you mentioned. The kernel will 
interpret the 2-5GB mapping as shared if the MAP_SHARED flag was specified in 
the mmap() call. Because these are files you are mapping, though, it's possible
that the pages backing the mapping are already in the page cache (if they've 
been mapped before). For any new pages that need to be created to back the 
mapping, those will be allocated in a random fashion, but for those that 
already reside in the page cache, they are where they are. :)

You can use the extended pmap tool (on the NUMA observability page) to observe 
the page placement, and to see where the pages actually are. If they are 
suboptimally placed, you can use the pmadvise tool to migrate (spread out) 
the pages in the segment to see if this improves throughput. (On the other side 
of the coin, you could experiment with migrating all the pages to the home 
lgroup to see if the lower latency helps.) My guess is that for this segment, 
random placement actually is best.

>   In any case, I'd love to turn the knob you mention and I'll look on
> the performance community page and see what kind of trouble I can get
> into. If there are any particular items you think I should check out,
> guidance is welcome.

Great. You're asking such great questions I wonder if you are a shill? :)

Here's the knob I was thinking of:
lgrp_expand_proc_thresh (default: 65516, or 0xffec): Controls how loaded an 
lgroup must be before we'll consider homing a process's threads to another 
lgroup. Tune this lower to have it spread your process's threads out more.
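
For a persistent change, the usual route is an /etc/system entry (a sketch; the value below is a hypothetical example, not a recommendation, and takes effect after reboot):

```
* Lower the lgroup expansion threshold so threads spread out sooner.
* Default is 0xffec (65516); 0x8000 here is only an illustrative value.
set lgrp_expand_proc_thresh=0x8000
```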

It might also be interesting to use plgrp(1) to try homing your 
application's threads to the root lgroup. (The root is the set of resources at 
the system-wide level of locality.) Homing to the root effectively disables 
board-level affinity and takes away the threads' tendency to migrate 
to a particular board.

Thanks,
-Eric
This message posted from opensolaris.org
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
