Cc'ing Edmond since he's the one that pointed me to the Arnold paper.

Matthew Knepley <[email protected]> writes:

> On Wed, Aug 20, 2014 at 4:31 PM, Jed Brown <[email protected]> wrote:
>
>> Barry Smith <[email protected]> writes:
>> > My statement is based on experience, not models. I stand by it and
>> > yes it can be factors of that magnitude. Now one could argue 3
>> > times faster so what, but if you are doing this solve millions of
>> > times and it is the most time consuming part of the simulation (by
>> > far) then it adds up.
>>
>> Here is one comparison:
>>
>> http://dx.doi.org/10.1103/PhysRevE.88.063308
>>
>
> Can you understand the scaling plots in that thing?

The relative efficiency plots? That is a fantastic way to compare
parallel performance. See the explanation starting near the end of page
14.

Note that Bolten's thesis (PP3MG - http://d-nb.info/99408403X/34) uses
V-cycles instead of FMG (I don't see this explicitly stated in Arnold's
paper, but they don't mention FMG), so there's probably a modest integer
factor still on the table. Bolten's thesis also does not discuss coarse
grid process reduction, so the scaling might cut off prematurely.

> And they make no attempt at modeling. These kind of paper may help a
> small segment of people running that exact problem on a similar
> architecture, but they really do not help sort this out.

Yes, the results are very implementation-specific, in contrast to
modeling papers that usually lose the constants. It would have been
nice for them to have included models, but the only way to really get
the constants right is for the community to compete until there is no
fat left to trim. And even then we occasionally see breakthroughs.
