> Do you have a way to turn off one of the sockets on "c" (2 x E5540) and get 
> the numbers with HT (8 processors) and without HT (4 processors)? It would 
> also be interesting to see "c" with HT turned off.

here's the progression
threads user    sys     real    %ilock
4       4.41u   1.83s   4.06r   0.
8       4.47u   2.37s   3.60r   2.0
12      4.49u   8.34s   4.40r   11.0
16      4.36u   13.16s  4.43r   14.7

here's a fun little calculation:
        16 threads * 4.43 s * 0.147 + 1.83s baseline
                = 10.41936 + 1.83 thread*s
                = 12.25s
it seems that increased ilock contention is a big factor
in the increase in system time.
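
the same estimate can be run over every row of the progression.
this is only a rough sketch: it assumes %ilock is the fraction of
real time each thread spends on ilocks, and it takes the 4-thread
system time as the baseline.  the names (ROWS, estimated_sys, report)
are made up for illustration.

```python
# rows from the progression: (threads, sys seconds, real seconds, %ilock)
ROWS = [
    (4, 1.83, 4.06, 0.0),
    (8, 2.37, 3.60, 2.0),
    (12, 8.34, 4.40, 11.0),
    (16, 13.16, 4.43, 14.7),
]

# assumption: the 4-thread run (no measured ilock time) is the baseline
BASELINE_SYS = ROWS[0][1]


def estimated_sys(threads, real, ilock_pct, baseline=BASELINE_SYS):
    """threads * real time * ilock fraction, plus baseline system time."""
    return threads * real * ilock_pct / 100.0 + baseline


def report(rows=ROWS):
    """one line per row comparing the estimate with measured system time."""
    lines = []
    for threads, sys_s, real_s, pct in rows:
        est = estimated_sys(threads, real_s, pct)
        lines.append(
            f"{threads:2d} threads: estimated {est:5.2f}s, measured {sys_s:5.2f}s"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

under those assumptions the estimate stays within about a second of
the measured system time at 12 and 16 threads.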

ilock accounting shows that most (>80%) long-held ilocks
(>8.5µs, ~21k cycles) are taken at /sys/src/libc/port/pool.c:1318.
this is no surprise.  technically, a long-held ilock is not
really a problem—until somebody else wants it.  but we
can be fairly certain that allocb/malloc is a contended code
path.

hopefully i'll be able to test a less-contended replacement for
allocb/freeb before i run out of time with this machine.

> Certainly it seems to me that idlehands needs to be fixed,
> your bit array "active.schedwait" is one way.

i'm not convinced that idlehands is anything but a power-waster.
performance-wise, it's nearly ideal.

- erik
