A few more thoughts, and then I have exhausted my repertoire on the facts at hand ...
1) Make sure that you get the latest BIOS upgrade for the motherboard ... they keep tweaking voltage and timing parameters.
2) Do not manually set the RAM parameters ... set all memory speed/voltage to auto/default, then choose the SLI Memory setting for 0% overclock mode, which enables the motherboard to set speed/voltage from the EPP in the RAM chips.
3) Disable all of the "spread spectrum" timings.
4) Disable all of the CPU auto-speed selection options [i.e. disable "CPU Thermal Control" and "Intel SpeedStep"].
5) If your memory is still unstable, I would suspect northbridge overheating ... the eVGA board comes with an aux fan for the northbridge for use when watercooling the CPU.
6) Get the latest nForce drivers from the NVidia download site.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeff Woods
Sent: Monday, December 24, 2007 8:01 AM
To: The Great Internet Mersenne Prime Search list
Subject: Re: [Prime] Benchmark question re: multiple cores

Paul Charlton wrote:
> You never stated which motherboard chipset/northbridge is in your system. I
> have essentially an identical system (Nvidia 680i), but with CPU x12
> (3200MHz), air cooled, and get substantially better results. The Corsair
> Dominator I have runs well @ FSB 1066/linked, with 2.2v for memory banks.
> Unlinked memory speeds are measurably slower unless they hit a "sweet"
> timing ratio with the FSB.
>

I thought I had stated the chipset in a later Email, but the mobo is an eVGA 122-CK-NF68, an nVidia 680i:

http://www.evga.com/products/pdf/122-CK-NF68.pdf

I have been unable to achieve any stability running the RAM linked at 1066, even with no multiplier overclock. Any effort to set the RAM bus speed to "linked, 1:1" (i.e. 1066) resulted in system instability at worst, or SUMINP != SUMOUT errors at best.

> What will help a lot if you have not done it (the mailing list is silent on
> this issue) is to set "cpu affinity" for each of your P95 threads (can do
> from task manager with a right click)
>

This should not be necessary, given that I have already set CPU affinity directly from the "Advanced" -> "Affinity" menu option within each instance of P95. However, I just now altered the affinity settings in Task Manager, carefully matching each PID's affinity to its instance's internal "Advanced -> Affinity" settings. This did not help in the slightest.

Thanks for your efforts, in any case. I'm going to try migrating to the Intel X38 chipset to see if that helps.

Jeff

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
> Behalf Of Jeff Woods
> Sent: Friday, December 21, 2007 9:38 AM
> To: The Great Internet Mersenne Prime Search list
> Subject: [Prime] Benchmark question re: multiple cores
>
> Hello All,
>
> My second (and hopefully last) question after returning to the GIMPS
> fold. I've never unsub'd from the mailing list, though, and haven't seen
> this question come across....
>
> I have a Quad Core QX6700, which is my first multi-core system, and my
> first overclocked system. It has just finished up P-1 stage 2 factoring
> and begun LL tests on each of its cores.
>
> http://www.mersenne.org/bench.htm
>
> The benchmark page above says that a Core 2 Quad QX6700 should be
> running about 0.0569 per iteration on a 2560K FFT. I assume that's at
> default clocks.
>
> My system seems to be averaging less than half that speed, taking
> between 0.105 and 0.131 per iteration, depending on which core.... and
> I can't figure out why. What should take 24 days is going to take 60
> days at this rate.
>
> The following may be irrelevant, but I'm wondering....
>
> The native 2.66 GHz CPU is overclocked to 3.47 GHz, with Vista 64 (and
> Prime95's 64-bit version) so I can have full access to all of 4 GB of
> very fast RAM.
> The RAM is Corsair Dominator PC2-8500 (800 MHz native,
> 5-5-5-18-2T) and for extra memory bandwidth I've overclocked the RAM to
> 1000 MHz. The CPU overclocking is all done via multiplier (10X -> 13X)
> and voltage, and the RAM overclock is done by configuring the
> motherboard's RAM speed to be unlinked from the FSB, then manually
> increasing the RAM speed to its max stable level of 1000 MHz. I did not
> alter the FSB speed at all, since my RAID controller doesn't like FSB
> speed tweaks. Nor did I alter the memory timings, leaving them at the
> native 5-5-5-18-2T. The system is sufficiently cooled (liquid,
> sustaining 59C-63C at 100% load on all 4 cores) and passed overnight
> torture testing without throwing up errors.
>
> With the above tweaks, the system throws up 12500 3DMarks, and scores an
> overall 5.9 on Vista's Experience Rating (the best one can score). It
> was scoring 5.8 prior to my increase of the memory speed from 800 MHz to
> 1000 MHz, so my bump should have boosted memory bandwidth. Alas,
> matching 1:1 and running the memory at 1066 MHz (the native FSB speed of
> the system) is not stable. I can't quite push the memory that fast --
> 1000 MHz is where stability tops out. At 1066 MHz memory speed, I start
> getting rounding errors after 3-4 hours of torture testing.
>
> Now, here's what I'm wondering. Is it possible that the source of these
> slower benchmarks is that tiny discrepancy between the FSB speed and the
> RAM speed? Would there be timing delays in running the memory just
> SLIGHTLY slower than the FSB that P95 doesn't much like due to its use
> of RAM for the lookup tables?
>
> Or, is it simpler than that? Am I perhaps bumping up against L2 cache
> thrashing, something that might be common on Intel multi-core machines
> that share a single L2 cache like this? (Nehalem, where art thou?)
> Is the benchmark listed simply what one core would do if it had exclusive
> use of the L2 cache while other cores were idle, and the slower iteration
> times are to be expected when all four cores are each working on their
> respective exponents, contending for a single L2 cache?
>
> This last theory seems to be supported by the fact that I paused all but
> one core and let it iterate on that one core with near-exclusive use of
> the otherwise idle system, and the iteration times fell dramatically, to
> 0.050 sustained and "best time" from the Options->Benchmark menu of
> 0.048. That's more in line with what I'd expect, given the benchmark
> pages showing a stock clocked QX6700 cranking at 0.0569, combined with
> my overclock.
>
> So, to cut to the chase, the benchmarks seem to be geared to exclusive
> use of the L2 cache, but in the real world that's not how I'd imagine
> most GIMPS users run P95 on a multi-core system. If my hypothesis is
> correct, wouldn't it be better to post separate benchmarks for "one core
> in use" versus "all cores in use", so that people's expectations aren't
> skewed by a benchmark table that doesn't represent typical use?
>
> Any guidance or experience would be welcome.
>
> Thanks!
>
> Jeff Woods
> Reading, PA
>
_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime
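
For anyone wanting to check the arithmetic in the thread, here is a rough sketch in Python. It only reuses figures quoted above (the 10X and 13X multipliers, the 0.0569 benchmark time, and the 0.050 and 0.105-0.131 observed times). The function names are made up for illustration, the 266.67 MHz bus clock is an assumption (the quad-pumped "1066" FSB divided by four), and the scaling rule assumes iteration time falls in direct proportion to core clock, which ignores memory and cache limits and so is best read as an optimistic bound.

```python
# Figures below are the ones quoted in the thread; every name is illustrative.

BUS_CLOCK_MHZ = 266.67  # assumption: nominal bus clock, quad-pumped "1066" FSB / 4


def core_clock_mhz(multiplier: float, bus_clock_mhz: float = BUS_CLOCK_MHZ) -> float:
    """Core clock for a given CPU multiplier with the FSB left untouched."""
    return multiplier * bus_clock_mhz


def scaled_iteration_time(stock_time: float, stock_mult: float, oc_mult: float) -> float:
    """Idealized per-iteration time after a multiplier-only overclock.

    Assumes time scales inversely with core clock. Real FFT work also
    depends on memory and cache, so this is an optimistic estimate.
    """
    return stock_time * stock_mult / oc_mult


def slowdown(all_cores_time: float, single_core_time: float) -> float:
    """How many times slower one iteration is when all cores are busy."""
    return all_cores_time / single_core_time


if __name__ == "__main__":
    print(f"core clock at 13X: {core_clock_mhz(13):.0f} MHz")
    print(f"idealized time at 13X: {scaled_iteration_time(0.0569, 10, 13):.4f}")
    for observed in (0.105, 0.131):
        print(f"{observed} vs 0.050 single-core: {slowdown(observed, 0.050):.2f}x slower")
```

Under those assumptions the idealized single-core time comes out a little under the 0.048 to 0.050 actually observed, while the all-cores times are roughly 2.1 to 2.6 times the single-core time. That gap is the part the published benchmark does not account for, whatever its cause turns out to be.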
