On May 28, 2010, at 10:57 AM, Igor wrote:

> As I understand it now, the problem is that the Trilinos solvers are not
> as efficient as Pysparse, so when we switch in additional processors
> and at the same time start using Trilinos, the overall performance
> gained by the additional processors is smeared out by the inefficiency
> of the Trilinos solvers :-(.

Accurate enough.

> As I understand it, this is on the side of the Trilinos
> community and the FiPy developers are not able to fix it (please correct
> me if I'm wrong), so we have to live with this for *some* time...

We are still doing some diagnostics to determine exactly where the problem 
lies, if for no other reason than to file a proper report with the 
Trilinos developers. It remains possible (but increasingly unlikely) that there 
is some change we can make that will bring these libraries into parity. Even 
without that, your own benchmarks show that for larger systems, running FiPy in 
parallel can be of benefit with as few as two cores compared to serial runs. It 
of course all depends on your problem, your available hardware, and how quickly 
you need an answer. 

> In case it is still important or for another point in statistics I
> give my results for "anisotropy test":

Thanks for these.


> How do you get _prepareLinearSystem time and _solve time separately?

You need to run a profiler on the code. The easiest way that I've found is to 
run kernprof.py on the test script and then runsnake on the generated .prof 
file.

http://packages.python.org/line_profiler/
http://www.vrplumber.com/programming/runsnakerun/

It's possible to directly use the built-in profile or cProfile modules with 
pstats, but they're more cumbersome for a quick check.
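For a quick sense of what the built-in approach looks like, here is a minimal cProfile/pstats session. The two functions are placeholders standing in for the assembly and solve steps, not actual FiPy calls:

```python
import cProfile
import io
import pstats

def assemble():
    # placeholder for the matrix-building step (_prepareLinearSystem)
    return [i * i for i in range(50000)]

def solve():
    # placeholder for the solver step (_solve)
    return sum(range(50000))

profiler = cProfile.Profile()
profiler.enable()
assemble()
solve()
profiler.disable()

# dump per-function timings, sorted by cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats()
report = stream.getvalue()
print(report)
```

The report lists each function with its call count and cumulative time, which is how the assembly and solve times can be separated, but reading the flat table is more work than browsing the same data in runsnake.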

> I think, however, that
> we will return to the issue once more when we understand how the
> solver affects one's results...

When we set PySparse and Trilinos to both use the PCG solver without a 
preconditioner and to use the same metric (b-vector) to gauge tolerance, we 
find that they do an identical number of iterations and produce identical 
residual vectors. I am thus not presently concerned about the results, just the 
time. You earlier said you had cases where parallel produced different answers 
from serial. We would definitely like to see examples of that so that we can 
diagnose, because it has not been our experience.


> In small systems I didn't see any gain at all;
> performance just decreased.

I'm not particularly surprised by that.

> PS2:
> I didn't see this line (-D CMAKE_BUILD_TYPE:STRING=DEBUG \) in my
> build. What is the default one

From reading cmake/TrilinosCMakeQuickstart.txt, it's not obvious to me what 
the default is.

> or how can I check which type I have?

It may be reported in CMakeCache.txt.
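A sketch of what that check might look like; the cache file contents below are fabricated for illustration (a real CMakeCache.txt sits in the build directory and may or may not contain a CMAKE_BUILD_TYPE entry):

```python
from pathlib import Path

def cmake_build_type(cache_path):
    """Return the CMAKE_BUILD_TYPE value from a CMakeCache.txt, or None."""
    for line in Path(cache_path).read_text().splitlines():
        if line.startswith("CMAKE_BUILD_TYPE"):
            # cache entries have the form NAME:TYPE=VALUE
            return line.split("=", 1)[1].strip()
    return None

# demo with a fabricated cache snippet
Path("CMakeCache.txt").write_text("CMAKE_BUILD_TYPE:STRING=RELEASE\n")
print(cmake_build_type("CMakeCache.txt"))  # prints RELEASE
```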


We're doing some tests on a Debian system that had Trilinos built with RELEASE 
and are still seeing Trilinos as substantially slower, so RELEASE is no silver 
bullet, but the DEBUG build definitely was a problem on my machine.


