Brian Beesley wrote:
>> Or, is it simpler than that? Am I perhaps bumping up against L2 cache
>> thrashing, something that might be common on Intel multi-core machines
>> that share a single L2 cache like this?
>>
>
> BTW what happens if you run two instances? The reason I ask is that it's
> possible that there is interference between access to main memory on two
> banks when you're running more than two instances. If this theory is right
> (as opposed to cache thrashing) then two instances would benchmark just a
> little bit (one or two percent) slower than one instance, whereas starting a
> third instance would cause the apparent performance in terms of iteration
> time to plummet.
>
Pretty good guess, Brian. Here are some results from testing based on your idea:

All exponents are close to 41507900 -- all four were requested within moments of each other as part of a new setup.

With just one core (core 1) cranking, I can get the iteration time as low as 0.050.

With two cores, one per die (cores 1 and 3 working, cores 0 and 2 idle), the time is about 0.063.

With two cores cranking on a single die (cores 2 and 3 working, cores 0 and 1 idle), the time is about 0.068 -- i.e. it doesn't (much) matter whether the cores are on the same die or not.

Adding a third core produces mixed results. With cores 1, 2, and 3 cranking and core 0 idle, core 1 maintains a steady 0.074 iteration time (which I could probably live with), but the two cores that share a die really contend with each other: the averages there are 0.094 for core 3 and 0.118 for core 2.

With all four cores cranking, the iteration times vary from 0.107 to 0.132 depending on the core -- roughly 1.6 to 2.1 times the timings with just two cores.

The 4 GB of RAM is 4 x 1 GB, occupying all four slots on the motherboard. Are you suggesting that if I were instead using only one bank of RAM (either dropping to 2 GB total, which I'm not willing to do, or switching to 2 x 2 GB on a single bank, which would be brutally expensive for decently overclockable RAM), this bottleneck might disappear?

Jeff

_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime
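For anyone who wants to compare the runs above by total throughput rather than per-core iteration time, here is a minimal sketch. It only does arithmetic on the timings quoted in the post. The names (`runs`, `throughput`, `throughput_bounds`, `relative_to`) are made up for illustration, and it assumes that in the two-core runs both active cores saw the quoted time, which the post does not state explicitly. For the four-core run only a range was given, so the sketch reports bounds instead of a single figure.

```python
# Per-core iteration times as quoted in the post (same units the client prints).
# Assumption: in the two-core runs, both active cores ran at the quoted time.
runs = {
    "1 core": [0.050],
    "2 cores, one per die": [0.063, 0.063],
    "2 cores, same die": [0.068, 0.068],
    "3 cores": [0.074, 0.094, 0.118],
}


def throughput(times):
    """Total iterations per time unit: the sum of 1/t over the active cores."""
    return sum(1.0 / t for t in times)


def throughput_bounds(n_cores, fastest, slowest):
    """Lower and upper bound on total throughput when only a range is known."""
    return n_cores / slowest, n_cores / fastest


def relative_to(runs, baseline="1 core"):
    """Throughput of each run as a multiple of the baseline run."""
    base = throughput(runs[baseline])
    return {name: throughput(times) / base for name, times in runs.items()}


if __name__ == "__main__":
    for name, ratio in relative_to(runs).items():
        total = throughput(runs[name])
        print(f"{name:22s} {total:6.1f} iter/unit  {ratio:4.2f}x one core")

    low, high = throughput_bounds(4, fastest=0.107, slowest=0.132)
    print(f"{'4 cores (bounds)':22s} {low:6.1f} to {high:.1f} iter/unit")
```

With these figures the totals for two and three cores come out at roughly 30 iterations per time unit against 20 for a single core, so the extra cores add little. That would be consistent with some shared bottleneck, though the arithmetic alone cannot say whether it is cache or memory.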
