On Wed, Mar 23, 2011 at 05:24:12PM -0400, Thor Lancelot Simon wrote:
> I have a new machine with 24 2Ghz Opteron cores. It has 32GB of RAM.
>
> Building with sources on a fast SSD ("preloaded" into the page
> cache before the build using tar > /dev/null) and obj, dest, and rel
> dirs on tmpfs, system builds are extraordinarily slow. The system
> takes about 20 minutes to build a netbsd-5 based source tree
> with -j24 -- about the same amount of time as an older 8-core Intel
> based system running netbsd-5 requires with -j8.
>
> All cores spend well over 50% time in 'sys', even when all or almost
> all are running cc1 processes. The kernel is amd64 -current GENERIC
> from about 1 week ago -- no DIAGNOSTIC, DEBUG, KMEMSTATS, LOCKDEBUG,
> etc.
>
> Does anyone have any idea what might be wrong here?

Try lockstat as suggested to see if something pathological is going on.
In addition to showing lock contention problems, it can often highlight
a code path that is being hit too frequently for some reason.

Have a look at the event counters and see if anything obviously ugly is
going on. Also look for evidence of context switch storms. lockstat
will show those for lock objects, but not for condition variables or
homegrown stuff based off sleep queues.

What sort of TLB shootdown rate does systat vmstat show? We have changes
forthcoming that should help with this during build.sh. Don't get me
wrong, the situation isn't particularly bad now on x86 for shootdowns,
but the forthcoming changes improve it quite a bit.

I have a suspicion that SSD could cause issues for us because the buffer
cache and other parts of the I/O system are not designed with
near-instantaneous request->response in mind, but that likely isn't at
play here. Do you have logging turned on for the SSD?

We have some algorithms in the scheduler and mutual exclusion code that
aren't designed for large numbers of cores, but I think they should be
OK with 24 CPUs. (While I'm rambling about this, I think the
SPINLOCK_BACKOFF stuff should have some sort of randomness to it, perhaps
based off curlwp and cpu_counter(); otherwise things could proceed in
lockstep. Again, that's probably not the issue here.)
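
To make the SPINLOCK_BACKOFF remark concrete, here is a rough userland sketch of a backoff with jitter. It is not the kernel macro and not a proposed patch. It assumes a backoff whose spin limit doubles on each failed attempt up to a cap. The struct, the function names, the constants and the xorshift generator are all made up for illustration. In the kernel the seed could presumably be derived from curlwp and cpu_counter(); here it is just a caller-supplied integer.

```c
#include <stdint.h>

/* Illustrative bounds only; not the values used by any real kernel. */
#define BACKOFF_MIN 4u
#define BACKOFF_MAX 128u

/* Hypothetical per-caller backoff state. */
struct backoff {
	uint32_t limit;	/* current upper bound on spin iterations */
	uint32_t rng;	/* xorshift32 state, must stay nonzero */
};

/* Seed stands in for something like a mix of curlwp and cpu_counter(). */
static void
backoff_init(struct backoff *b, uint32_t seed)
{
	b->limit = BACKOFF_MIN;
	b->rng = seed != 0 ? seed : 1u;	/* zero is a fixed point of xorshift */
}

/* Cheap pseudo-random step; quality does not matter much for jitter. */
static uint32_t
xorshift32(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Pick a spin count in [1, limit] for this round, then double the
 * limit until it reaches BACKOFF_MAX. Two CPUs with different seeds
 * will usually pick different counts, so they are less likely to
 * retry in lockstep.
 */
static uint32_t
backoff_next(struct backoff *b)
{
	uint32_t spins = 1u + xorshift32(&b->rng) % b->limit;

	if (b->limit < BACKOFF_MAX)
		b->limit += b->limit;
	return spins;
}

/* Busy-wait for the chosen number of iterations. */
static void
backoff_spin(struct backoff *b)
{
	volatile uint32_t i;

	for (i = backoff_next(b); i != 0; i--)
		continue;	/* real code would issue a CPU pause hint here */
}
```

A caller would run backoff_init() once before trying for the lock and backoff_spin() after each failed attempt. Whether the extra arithmetic pays for itself on a contended lock would need measuring.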