On May 28, 2010, at 10:57 AM, Igor wrote:
> As I understand it now, the problem is that the Trilinos solvers are not
> as efficient as PySparse, so when we switch on additional processors and
> at the same time start to use Trilinos, the overall performance gained
> from the additional processors is wiped out by the inefficiency of the
> Trilinos solvers :-(.

Accurate enough.

> As I understand it, this is on the side of the Trilinos community and
> the FiPy developers are not able to fix it (please correct me if I'm
> wrong), so we have to live with this for *some* time...

We are still doing some diagnostics to determine exactly where the problem lies, if for no other reason than to be able to file a proper report with the Trilinos developers. It remains possible (but increasingly unlikely) that there is some change we can make that will bring these libraries into parity.

Even without that, your own benchmarks show that for larger systems, running FiPy in parallel can be beneficial with as few as two cores compared to serial runs. It all depends, of course, on your problem, your available hardware, and how quickly you need an answer.

> In case it is still important, or as another data point for the
> statistics, I give my results for the "anisotropy test":

Thanks for these.

> How do you get the _prepareLinearSystem time and the _solve time
> separately?

You need to run a profiler on the code. The easiest way that I've found is to run kernprof.py on the test script and then runsnake on the generated .prof file:

http://packages.python.org/line_profiler/
http://www.vrplumber.com/programming/runsnakerun/

It's possible to use the built-in profile or cProfile modules directly with pstats, but they're more cumbersome for a quick check.

> I think, however, that we will return to the issue once more when we
> understand how the solver affects one's results...
When we set PySparse and Trilinos to both use the PCG solver without a preconditioner and to use the same metric (the b-vector) to gauge tolerance, we find that they do an identical number of iterations and produce identical residual vectors. I am thus not presently concerned about the results, just the time.

You earlier said you had cases where parallel produced different answers from serial. We would definitely like to see examples of that so that we can diagnose, because it has not been our experience.

> In small systems I didn't see any gain at all; performance just
> decreased.

I'm not particularly surprised by that.

> PS2:
> I didn't see this line (-D CMAKE_BUILD_TYPE:STRING=DEBUG \) in my
> build. What is the default one

From reading cmake/TrilinosCMakeQuickstart.txt, it's not obvious to me what the default is.

> or how can I check which type I have?

It may be reported in CMakeCache.txt. We're doing some tests on a Debian system that had Trilinos built with RELEASE and are still seeing Trilinos as substantially slower, so this is no silver bullet, but it definitely was a problem on my machine.
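PS: Since the built-in profile/cProfile route came up above, here is a minimal sketch of that approach (not the kernprof/runsnake workflow itself). The `solve_step` function is a made-up stand-in for whatever call you want to time, e.g. a FiPy solve; swap in your own code there:

```python
import cProfile
import io
import pstats

def solve_step():
    # hypothetical stand-in for the call being profiled
    # (e.g. a FiPy eq.solve() step)
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
solve_step()
profiler.disable()

# sort by cumulative time and print the top few entries
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

This prints a table of call counts and cumulative times, from which you can pick out entries like _prepareLinearSystem and _solve. runsnake just presents the same .prof data graphically.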
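PS3: On checking the build type from CMakeCache.txt: entries in that file have the form NAME:TYPE=VALUE, so a small helper like the following can pull out CMAKE_BUILD_TYPE. The `sample` text here is made up for illustration; point it at the CMakeCache.txt in your actual Trilinos build directory:

```python
def build_type(cache_text):
    """Return the CMAKE_BUILD_TYPE recorded in CMakeCache.txt contents."""
    for line in cache_text.splitlines():
        if line.startswith("CMAKE_BUILD_TYPE"):
            # cache entries look like NAME:TYPE=VALUE
            return line.split("=", 1)[1]
    return None  # not set in the cache

# made-up sample contents for illustration; in practice read your
# build tree's cache, e.g. open("CMakeCache.txt").read()
sample = "CMAKE_CXX_FLAGS:STRING=\nCMAKE_BUILD_TYPE:STRING=RELEASE\n"
print(build_type(sample))  # -> RELEASE
```

If it comes back empty or None, the build type was never set, which typically means no optimization flags were added.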
