http://d.puremagic.com/issues/show_bug.cgi?id=4725
David Simcha <[email protected]> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |[email protected] --- Comment #9 from David Simcha <[email protected]> 2011-05-15 09:33:10 PDT --- I'm going to have to agree with Bearophile on this one. Using multiple summation variables is both faster (because it breaks some dependency chains and uses instruction level parallelism better) and more accurate (because you effectively have more precision). Here's a test program: import std.stdio, std.datetime, std.random, std.range, std.traits; // Adapted from dstats.summary.sum(). auto ilpSum(T)(T data) if(isRandomAccessRange!T) { enum nILP = 8; // Empirically optimal on Athlon 64 X2. Unqual!(ElementType!T)[nILP] sum = 0; size_t i = 0; if(data.length > 2 * nILP) { for(; i + nILP < data.length; i += nILP) { foreach(j; 0..nILP) { sum[j] += data[i + j]; } } foreach(j; 1..nILP) { sum[0] += sum[j]; } } for(; i < data.length; i++) { sum[0] += data[i]; } return sum[0]; } auto naiveSum(T)(T data) if(isRandomAccessRange!T) { Unqual!(ElementType!T) ret = 0; foreach(elem; data) { ret += elem; } return ret; } void main() { auto nums = new double[10_000_000]; foreach(ref x; nums) x = uniform(0.0, 1.0); auto sw = StopWatch(AutoStart.yes); auto s = naiveSum(nums); immutable naiveTime = sw.peek.msecs; // Printing out s to make sure the compiler doesn't optimize away the // whole function. writefln("naive: Sum = %s, %s milliseconds.", s, naiveTime); sw.reset(); s = ilpSum(nums); immutable ilpTime = sw.peek.msecs; writefln("ilp: Sum = %s, %s milliseconds.", s, ilpTime); } Results: naive: Sum = 4.9988e+06, 51 milliseconds. ilp: Sum = 4.9988e+06, 33 milliseconds. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
