marlowsd:
> 2009/4/20 Dave Bayer <ba...@cpw.math.columbia.edu>:
> > I ran some longer trials, and noticed a further pattern I wish I could
> > explain:
> >
> > I'm comparing the enumeration of the roughly 69 billion atomic lattices on
> > six atoms, on my four core, 2.4 GHz Q6600 box running OS X, against an eight
> > core, 2 x 3.16 GHz Xeon X5460 box at my department running Linux. Note that
> > my processor now costs $200 (it's the venerable "Dodge Dart" of quad core
> > chips), while the pair of Xeon processors cost $2400. The Haskell code is
> > straightforward; it uses bit fields and reverse search, but it doesn't take
> > advantage of symmetry, so it must "touch" every lattice to complete the
> > enumeration. Its memory footprint is insignificant.
> >
> > Never mind 7 cores, Linux performs worse before it runs out of cores.
> > Comparing 1, 2, 3, 4 cores on each machine, look at "real" and "user" time
> > in minutes, and the ratio:
> >
> > Linux
> > 2 x 3.16 GHz Xeon X5460
> > cores       1      2      3      4
> > real    466.7  250.8  183.7  149.3
> > user    466.4  479.0  505.2  528.1
> > ratio    1.00   1.91   2.75   3.54
> >
> > OS X
> > 2.4 GHz Q6600
> > cores       1      2      3      4
> > real    676.9  359.4  246.7  191.4
> > user    673.4  673.7  675.9  674.8
> > ratio    0.99   1.87   2.74   3.53
> >
> > These ratios match up like physical constants, or at least invariants of my
> > Haskell implementation. However, the user time is constant on OS X, so these
> > ratios reflect the actual parallel speedup on OS X. The user time climbs
> > steadily on Linux, significantly diluting the parallel speedup on Linux.
> > Somehow, whatever is going wrong in the interaction between Haskell and
> > Linux is being captured in this increase in user time.
>
> We can't necessarily blame this on Linux: the two machines have
> different hardware. There could be cache effects at play, for
> example.
>
> Maybe you could try the new affinity options (+RTS -qa) and see if
> that makes any difference? That would reduce the scheduling
> effects due to the OS (although when the number of cores you use is
> less than the real number of cores in the machine, the OS is still
> free to move threads around. To get reliable numbers you should
> really disable some of the cores at boot-time).
>
Little bits of advice keep creeping out of Simon's head. Is it time
for a parallel performance wiki, where every question that becomes an FAQ
gets documented live?

    http://haskell.org/haskellwiki/Performance/Parallel

Maybe put details on the wiki so we can grow a large FAQ to capture this
"oral tradition".

-- Don
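
For anyone wanting to redo the arithmetic in Dave's tables, here is a minimal sketch. It assumes the "ratio" row is user time divided by real time, which is what the quoted figures work out to. The names (Run, utilisation, speedup, summary) are made up for this illustration and are not from Dave's program. It also prints the wall-clock speedup relative to the one-core run, which is the figure the rising user time on Linux eats into.

```haskell
module Main where

import Text.Printf (printf)

-- One timing sample: cores used, "real" and "user" time in minutes.
-- (Hypothetical type, for illustration only.)
data Run = Run { cores :: Int, realMin :: Double, userMin :: Double }

-- Figures copied from the tables quoted in this thread.
linux, osx :: [Run]
linux = [Run 1 466.7 466.4, Run 2 250.8 479.0, Run 3 183.7 505.2, Run 4 149.3 528.1]
osx   = [Run 1 676.9 673.4, Run 2 359.4 673.7, Run 3 246.7 675.9, Run 4 191.4 674.8]

-- The "ratio" row, assumed to be user time / real time.
utilisation :: Run -> Double
utilisation r = userMin r / realMin r

-- Wall-clock speedup of a run relative to a baseline run.
speedup :: Run -> Run -> Double
speedup base r = realMin base / realMin r

-- Round to two decimal places, as in the tables.
round2 :: Double -> Double
round2 x = fromIntegral (round (x * 100) :: Integer) / 100

-- (cores, user/real, speedup vs. the first run) for each run.
summary :: [Run] -> [(Int, Double, Double)]
summary []            = []
summary runs@(base:_) =
  [ (cores r, round2 (utilisation r), round2 (speedup base r)) | r <- runs ]

report :: String -> [Run] -> IO ()
report name runs = do
  putStrLn name
  mapM_ line (summary runs)
  where
    line :: (Int, Double, Double) -> IO ()
    line (n, u, s) = printf "  %d cores: user/real = %.2f, speedup = %.2f\n" n u s

main :: IO ()
main = report "Linux" linux >> report "OS X" osx
```

With the quoted figures, the four-core wall-clock speedup comes out at about 3.13 on the Linux box against 3.54 on the OS X box, even though the user/real ratios are nearly identical.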