On Tue, Apr 22, 2008 at 10:25 PM, Aaron Knister <[EMAIL PROTECTED]> wrote:
> Did you ever find a resolution?
The core mapping change had to do with "noacpi" on my kernel command line
(ACPI, not APIC). It seems ACPI has a lot to do with core mapping (not just
power management). It also affected interrupt distribution/balancing
(/proc/interrupts was showing all timer interrupts handled by CPU0, for
example). ACPI had to both be enabled in the kernel config and not disabled
on the kernel command line.

This did not solve the 2x to 10x performance issue with 1.6.4.3, but I
don't have that problem with a manually patched RHEL 2.6.9-67.0.4 kernel in
1.6.4.2. My best guess is that I omitted the Quadrics patches from my
manual patching... maybe they have something to do with the slowdown. I
have a list of system calls that I believe are associated with the
slowdown... but in looking at CPU counters, the application takes no more
CPU time, the walltime just increases... as if the kernel is forgetting to
schedule the app. More on this at:

https://bugzilla.lustre.org/show_bug.cgi?id=15478

> And out of curiosity, how did you determine
> that the core to logical processor allocation had changed? I'm trying to
> figure it out in my own set up.

A quick glance at /proc/cpuinfo shows the difference. The "correct" case
looks like:

# cat /proc/cpuinfo | grep -e processor -e "core id"
processor       : 0
core id         : 0
processor       : 1
core id         : 2
processor       : 2
core id         : 4
processor       : 3
core id         : 6
processor       : 4
core id         : 1
processor       : 5
core id         : 3
processor       : 6
core id         : 5
processor       : 7
core id         : 7

The "incorrect" mapping shows "processor" == "core id" (as it does above
for CPUs 0 and 7... but for all processors).

I work w/ benchmark clusters (they are only used for benchmarking and
tuning applications), and many users immediately saw the differences in
codes they'd been benchmarking. Some folks run on fewer cores than are
available per node (e.g. to not share cache between MPI processes, or, in
some cases of multithreaded apps, because they do want to share cache), and
the optimal MPI CPU mapping for an 8-core system (at least for this
vendor's CPUs) puts logical cores 0 and 1 on different sockets, while 2 and
3 share sockets w/ 0 and 1 but sit in different L2 caches. With ACPI
disabled, the logical and physical mappings were the same. In those cases
where the MPI does process pinning, the apps were (mostly) okay... but
other apps don't specifically pin, and, where logical==physical, all four
processes were running on the same socket and their performance went down.
You could argue that apps should pin if needed, but also argue that it's
nice to have a CPU mapping that helps apps that don't pin.

Furthermore, others noticed that even w/ proper processor pinning, their
results using physical processors 0, 2, 4, 6 were worse than using
1, 3, 5, 7... this turned out again to be ACPI related: interrupts weren't
being balanced across the CPUs (look at the first line of "timer"
interrupts in /proc/interrupts and see if all go to CPU0... that imbalance
will affect performance on MPI apps that use all cores too).

Hope that helps.

Chris

> -Aaron
>
> On Apr 10, 2008, at 2:13 PM, Chris Worley wrote:
>
> > On Thu, Apr 10, 2008 at 12:05 PM, Johann Lombardi <[EMAIL PROTECTED]> wrote:
> > > Chris,
> > >
> > > On Sat, Apr 05, 2008 at 06:11:32PM -0600, Chris Worley wrote:
> > > > I was running RHEL's 2.6.9-67.0.4 kernel w/o Lustre patches, and the
> > >
> > > What is the CPU architecture? x86_64 or IA64?
> >
> > x86_64.
> > > > core to logical processor allocation was (as shown by /proc/cpuinfo):
> > > >
> > > >  ==============  ==============   Socket
> > > >  ======  ======  ======  ======   L2 cache domain
> > > >  0 4     1 5     2 6     3 7      logical processor
> > > >
> > > > After installing the Lustre version of the kernel, the allocation is:
> > > >
> > > >  ==============  ==============   Socket
> > > >  ======  ======  ======  ======   L2 cache domain
> > > >  0 1     2 3     4 5     6 7      logical processor
> > >
> > > Hard to believe that one of our patches could cause this.
> > > Have you compared the kernel config files?
> >
> > This is the default from RedHat vs. the default from
> > downloads.lustre.org. We didn't rebuild either from scratch.
> >
> > Chris
> >
> > > Cheers,
> > > Johann
>
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
> Aaron Knister
> Associate Systems Analyst
> Center for Ocean-Land-Atmosphere Studies
>
> (301) 595-7000
> [EMAIL PROTECTED]

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
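The /proc/cpuinfo check Chris describes is easy to script. Here's a minimal sketch (the `check_mapping` helper name and the awk logic are mine, not from the thread) that prints each processor-to-"core id" pair and warns when the mapping is the identity mapping the thread calls "incorrect":

```shell
#!/bin/sh
# Sketch only: pair up "processor"/"core id" lines from /proc/cpuinfo
# style input and warn when processor == core id for every entry,
# which this thread associates with ACPI being disabled.
check_mapping() {
    awk -F': *' '
        /^processor/ { proc = $2 }           # remember the logical CPU number
        /^core id/   {
            printf "processor %s -> core id %s\n", proc, $2
            if (proc != $2) mismatch = 1     # any mismatch => not identity
        }
        END { if (!mismatch) print "WARNING: processor == core id everywhere" }
    '
}

# Example with the first two entries of the "correct" mapping quoted above:
check_mapping <<'EOF'
processor   : 0
core id     : 0
processor   : 1
core id     : 2
EOF
# prints:
#   processor 0 -> core id 0
#   processor 1 -> core id 2
```

On a live node you'd run `check_mapping < /proc/cpuinfo`; a similar one-liner over /proc/interrupts (summing the "timer" row per CPU column) would catch the interrupt-balancing symptom.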
