Do you have a way to turn off one of the sockets on "c" (2 x E5540) and get the 
numbers with HT (8 processors) and without HT (4 processors)? It would also be 
interesting to see "c" with HT turned off.

Certainly it seems to me that idlehands needs to be fixed; your bit array 
"active.schedwait" is one way to do it.

In my experience of bringing up the Alliant FX/8 mini-supercomputer which had 8 
(mostly CPU) + 12 (mostly I/O) = 20 processors there were a bunch of details 
that had to be addressed as we went from 1 to 20 processors. There were even some 
issues with the system timing (user, sys, real) itself being messed up, but I 
can't remember the details.

I do remember one customer with a billing system complaining that their own 
customers, in high I/O environments, were getting charged for interrupts 
(included in sys at the time) that they didn't incur, which was true. I think 
we fixed that one by having an idle process per CPU and charging each 
interrupt to that processor's idle process.

I mention it because we got a lot of mileage out of the decision to give every 
processor an idle process. Our scheduler was set up to only run that process if 
there were no other processes available for that processor. When the idle 
process did run it did a few things and then called halt. There is some more to 
the story; if anyone is interested, let me know and I'll either post a 
follow-up or respond in private.
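
For anyone who wants the idea in code, here is a minimal sketch in C. It is 
not the Alliant code, and every name in it (Cpu, Proc, ready, pick, 
chargeintr, NCPU, NQ) is made up for illustration. It only shows the two 
points above: the scheduler hands back a processor's idle process only when 
that processor has nothing else runnable, and interrupt time is charged to 
the idle process of the processor that took the interrupt. The idle loop 
itself (a little housekeeping, then halt) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NCPU = 8, NQ = 16 };	/* made-up sizes for the sketch */

typedef struct Proc Proc;
struct Proc {
	int pid;
	uint64_t sys;		/* system ticks charged to this process */
};

typedef struct Cpu Cpu;
struct Cpu {
	Proc idle;		/* this processor's own idle process */
	Proc *runq[NQ];		/* FIFO ring of runnable processes */
	int head, n;		/* ring start and number of entries */
};

static Cpu cpus[NCPU];

/* Make p runnable on cpu; returns 0, or -1 if the queue is full. */
int
ready(int cpu, Proc *p)
{
	Cpu *c;

	assert(cpu >= 0 && cpu < NCPU);
	c = &cpus[cpu];
	if(c->n == NQ)
		return -1;
	c->runq[(c->head + c->n) % NQ] = p;
	c->n++;
	return 0;
}

/* Next process for cpu: the idle process only when nothing else is runnable. */
Proc*
pick(int cpu)
{
	Cpu *c;
	Proc *p;

	assert(cpu >= 0 && cpu < NCPU);
	c = &cpus[cpu];
	if(c->n == 0)
		return &c->idle;
	p = c->runq[c->head];
	c->head = (c->head + 1) % NQ;
	c->n--;
	return p;
}

/*
 * Charge interrupt time to the idle process of the cpu that took the
 * interrupt, not to whichever process happened to be running there.
 */
void
chargeintr(int cpu, uint64_t ticks)
{
	assert(cpu >= 0 && cpu < NCPU);
	cpus[cpu].idle.sys += ticks;
}
```

With one process made ready on processor 0, pick(0) returns that process 
first and the idle process after that, and any ticks passed to chargeintr(0, 
...) show up in cpus[0].idle.sys rather than in the user process's sys time.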

We used to have a saying at Alliant: "Data drives out speculation".

leb


At 7:26 PM -0400 6/18/10, erik quanstrom wrote:
>note the extreme system time on the 16 processor machine
>
>a      2 * Intel(R) Xeon(R) CPU            5120  @ 1.86GHz
>b      4 * Intel(R) Xeon(R) CPU           E5630  @ 2.53GHz
>c      16* Intel(R) Xeon(R) CPU           E5540  @ 2.53GHz

-- 
[email protected]

