Hi Howard,

I should be able to get a hold of an 8-way Xeon system in January sometime.
I will be able to place the order for it on the 2nd.

Cheers,
Alex

On Dec 23, 2007 8:40 PM, Howard Chu <[EMAIL PROTECTED]> wrote:
> Has anyone got a dual or quad socket Intel Xeon based server for testing?
> I've been testing on two AMD systems, one quad socket dual core and one
> dual socket quad core. There are a lot of different ways to tune these
> systems...
>
> slapd currently uses a single listener thread and a pool of some number of
> worker threads. I've found that performance improves significantly when
> the listener thread is pinned to a single core, and no other threads are
> allowed to run there. I've also found that performance improves somewhat
> when all worker threads are pinned to specific cores, instead of being
> free to run on any of the remaining cores. This has made testing a bit
> more complicated than I expected.
>
> I originally was just pinning the entire process to a set number of cores
> (first 1, then 2, incrementing up to 8) to see how performance changed
> with additional cores. But due to the motherboard layout and the fact that
> the I/O bridges are directly attached to particular sockets, it makes a
> big difference exactly which cores you use.
>
> Another item I noticed is that while we scale perfectly linearly from 1
> core to 2 cores in a socket (with a dual-core processor), as we start
> spreading across multiple sockets the scaling tapers off drastically. That
> makes sense given the constraints of the Hypertransport connections
> between the sockets.
>
> On the quad-core system we scale pretty linearly from 1 to 4 cores (in one
> socket) but again the improvement tapers off drastically when the 2nd
> socket is added in.
>
> I don't have any Xeon systems to test on at the moment, but I'm curious to
> see how they do given that all CPUs should have equal access to the
> northbridge. (Of course, given that both memory and I/O traffic go over
> the bus, I'm not expecting any miracles...)
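
For anyone wanting to reproduce the pinning layout described above, here is a minimal sketch, assuming Linux. It is not how slapd itself assigns threads. The names plan_affinity and pin_current_thread are made up for illustration. The plan reserves one core for the listener and spreads workers round-robin over the rest. It ignores socket and I/O bridge topology, which the message above says matters a great deal.

```python
import os


def plan_affinity(cores, n_workers, listener_core=None):
    """Reserve one core for the listener; spread workers over the others.

    Hypothetical helper: returns (listener_core, [core for each worker]).
    """
    cores = sorted(cores)
    if len(cores) < 2:
        raise ValueError("need at least two cores to isolate the listener")
    if listener_core is None:
        listener_core = cores[0]
    if listener_core not in cores:
        raise ValueError("listener core is not in the available set")
    # No worker may land on the listener's core.
    worker_cores = [c for c in cores if c != listener_core]
    workers = [worker_cores[i % len(worker_cores)] for i in range(n_workers)]
    return listener_core, workers


def pin_current_thread(core):
    """Pin the calling thread to a single core; return False if unsupported.

    os.sched_setaffinity is only available on some platforms (Linux among
    them). On Linux, a pid of 0 refers to the calling thread.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {core})
    return True


# Example: 8 cores, 16 worker threads, listener alone on core 0.
listener, workers = plan_affinity(range(8), 16)
```

Each thread would call pin_current_thread() with its assigned core when it starts. A topology-aware version would pick the listener core on the socket nearest the NIC's I/O bridge, which this sketch does not attempt.
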
>
> The quad-core system I'm using is a Supermicro AS-2021M-UR+B; it's based
> on an Nvidia MCP55 chipset. The gigabit ethernet is integrated in this
> chipset. Using back-null we can drive this machine to over 54,000
> authentications/second, at which point 100% of a core is consumed by
> interrupt processing in the ethernet driver. The driver doesn't support
> interrupt coalescing, unfortunately. (By the way, that represents
> somewhere between 324,000pps and 432,000pps. While there's only 5 LDAP
> packets per transaction, some of the client machines choose to send
> separate TCP ACKs, while others don't, which makes the packet count
> somewhere between 5-8 packets per transaction. I hadn't taken those ACKs
> into account when I discussed these figures before. At these packet sizes
> (80-140 bytes), I think the network would be 100% saturated at around
> 900,000pps.)
>
> Interestingly, while 2 cores can get over 13,000 auths/second, and 4 cores
> can get around 25,000 auths/second (using back-hdb), with all 8 cores it's
> only peaking at 29,000 auths/second. This tells me it's better to run two
> separate slapds in a mirrormode configuration on this box (4 cores per
> process) than to run a single process across all of the cores. Then I'd
> expect to hit 50,000 auths/second total, pretty close to the limits of the
> ethernet device/driver.
> --
>   -- Howard Chu
>   Chief Architect, Symas Corp.  http://www.symas.com
>   Director, Highland Sun        http://highlandsun.com/hyc/
>   Chief Architect, OpenLDAP     http://www.openldap.org/project/
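
The figures quoted above can be checked with a few lines of arithmetic. This is only a sketch, and the function names are made up for illustration. Two things in it are assumptions, not statements from the message: the "over" and "around" throughput numbers are treated as exact, and the 80-140 byte packet sizes are treated as on-wire frame sizes with 20 extra bytes per frame for Ethernet preamble and inter-frame gap.

```python
def pps_range(auths_per_sec, min_pkts, max_pkts):
    """Packets per second for a given packets-per-transaction range."""
    return auths_per_sec * min_pkts, auths_per_sec * max_pkts


def per_core(throughput_by_cores):
    """Auths/second per core for each core count."""
    return {n: t / n for n, t in throughput_by_cores.items()}


def max_frames_per_sec(frame_bytes, link_bps=1_000_000_000, overhead_bytes=20):
    """Upper bound on frames/second on a link.

    Assumption: frame_bytes is the on-wire frame size, and overhead_bytes
    covers the 8-byte preamble plus 12-byte inter-frame gap.
    """
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


# The quoted 324,000-432,000pps range corresponds to 6 to 8 packets
# per transaction at 54,000 auths/second.
print(pps_range(54_000, 6, 8))   # (324000, 432000)
# With 5 packets per transaction the lower bound would be 270,000pps.
print(pps_range(54_000, 5, 8))   # (270000, 432000)

# back-hdb scaling: per-core throughput drops sharply at 8 cores.
print(per_core({2: 13_000, 4: 25_000, 8: 29_000}))
# {2: 6500.0, 4: 6250.0, 8: 3625.0}

# Two 4-core slapds in mirrormode, assuming each keeps the 4-core rate.
print(2 * 25_000)                # 50000

# Gigabit frame-rate bounds at the two ends of the packet size range.
print(max_frames_per_sec(140), max_frames_per_sec(80))
# 781250.0 1250000.0
```

Under those assumptions, the estimate of roughly 900,000pps for a saturated link falls inside the computed bounds, and the 8-core run delivers a bit more than half the per-core throughput of the 4-core run.
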
