Hi,

Nice to see that you are persisting with fipy. I haven't really done
any direct comparisons of pysparse and trilinos. However, I would be
surprised if trilinos was worse in any given category, mainly because
it is widely used and well supported, but the proof is in the pudding.
I would make the following points:

  * A comparison should obviously be as like-for-like as possible:
same solver, same tolerance, and consistent values for any other
parameters that might be relevant. These settings are not all
available through the fipy solver class, so you may have to alter
them directly in the calls to the solver within the various solver
classes.

 * Pysparse's non-symmetric Krylov solvers don't work correctly (at
least that's what I've observed), so we generally only use it for LU
or PCG.

 * The tolerance may have different definitions. Certainly with
trilinos the stopping criterion can be chosen from roughly five
different definitions.
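To illustrate why the definition matters (this is a toy sketch in
plain Python, not the actual trilinos option names): the same residual
vector can satisfy one stopping criterion and fail another, depending
on what the residual norm is divided by.

```python
import math

def norm(v):
    """Euclidean norm of a plain-Python vector."""
    return math.sqrt(sum(x * x for x in v))

# Toy numbers: right-hand side b, initial residual r0, current residual r.
b = [100.0, 0.0, 0.0]
r0 = [10.0, 0.0, 0.0]
r = [0.5, 0.0, 0.0]
tol = 1e-2

# Two common normalizations of the same stopping test.
converged_vs_rhs = norm(r) / norm(b) < tol   # ||r|| / ||b||
converged_vs_r0 = norm(r) / norm(r0) < tol   # ||r|| / ||r0||

print(converged_vs_rhs, converged_vs_r0)
```

With these numbers the ||r||/||b|| test passes while the ||r||/||r0||
test fails, so two backends using "the same tolerance" can stop at
quite different points.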

 * Once you have the settings correct, any given solver should be
doing exactly the same number of iterations for both pysparse and
trilinos and giving the same answer to machine precision (I would
imagine). Then you can be sure the comparison is like for like.
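The "same answer to machine precision" check can be sketched without
any FiPy dependency (the solution values below are placeholders, and
the 10x-epsilon threshold is an arbitrary but reasonable choice):

```python
# Compare two solution vectors, e.g. one from pysparse and one from
# trilinos, extracted from the same problem. Placeholder values here.
x_pysparse = [1.0, 2.0, 3.0]
x_trilinos = [1.0, 2.0 + 1e-15, 3.0]

# Largest elementwise disagreement.
max_diff = max(abs(a - b) for a, b in zip(x_pysparse, x_trilinos))

# Double-precision machine epsilon; allow a small multiple of it.
machine_eps = 2.0 ** -52
agrees = max_diff <= 10 * machine_eps

print(max_diff, agrees)
```

If `agrees` is False after the solver settings have been matched, the
two backends are effectively solving different problems and the timing
comparison isn't meaningful yet.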

 * Use a profiler to see how long the solver step is taking
<http://docs.python.org/library/profile.html>.
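A minimal profiling sketch using the standard-library cProfile module
(the `fake_solver_step` function is a stand-in; you would profile your
actual sweep/solve call):

```python
import cProfile
import io
import pstats

def fake_solver_step():
    # Stand-in for one solver call; replace with your FiPy solve.
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
fake_solver_step()
profiler.disable()

# Print the five most expensive calls by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

This tells you directly what fraction of wall time is in the solver
versus the surrounding housekeeping, which is exactly the split needed
for the comparison above.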

 * Quite detailed analysis of the solver step is available in
trilinos, see the commented section here
<http://matforge.org/fipy/browser/trunk/fipy/solvers/trilinos/trilinosAztecOOSolver.py#L82>.
I don't think there is anything equivalent for pysparse.

 * I'd be very interested to see a side-by-side comparison for a
typical step showing the solver used, number of iterations in the
solver, time per iteration in the solver, difference in final answers,
and memory high-water mark (or memory used before and after the
iteration starts). Those are the kinds of numbers I think would be
helpful.
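Collecting two of those numbers, time per iteration and the memory
high-water mark, can be done with only the standard library on Unix
(the `resource` module; `solve_once` is again a placeholder for the
real solver call):

```python
import resource
import time

def solve_once():
    # Placeholder for a single solver call in the comparison loop.
    return sum(i * i for i in range(200000))

# ru_maxrss is the process's memory high-water mark (kilobytes on
# Linux); record it before and after the timed loop.
before_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

iterations = 10
start = time.time()
for _ in range(iterations):
    solve_once()
elapsed = time.time() - start

after_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
time_per_iteration = elapsed / iterations

print(time_per_iteration, before_rss, after_rss)
```

Running the same harness once per backend gives directly comparable
rows for the side-by-side table.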

Hope this helps.

On Tue, Sep 15, 2009 at 3:26 PM, jtg <[email protected]> wrote:
>
> Dear All,
>
> I apologize if this e-mail seems like resurrecting a dead horse, only
> to flog the poor beastie one more time.  We certainly would appreciate
> knowing if anyone has had more insight on this issue.
>
> Back in April, there were about five e-mails on the subject "Slow
> performance on Trilinos?", initiated by Sum Thai Wong.  The question
> dealt with the increased time required to run the FiPy test suite
> using the "--Trilinos" option, as compared with the default PySparse
> solvers.  The consensus among the responders was that this was a
> consequence of some extra overhead when invoking trilinos, on the
> many small tests comprising the sandbox.
> (See 
> http://search.gmane.org/?query=trilinos+performance&group=gmane.comp.python.fipy)
>
> We recognize that this supposition is entirely reasonable, especially
> in light of the fact that some right-thinking users do indeed experience
> a substantial performance improvement. Unfortunately we see a
> slowdown similar in magnitude -- around 30% -- to that reported by Sum
> Thai Wong in our attempt at using FiPy with trilinos to solve a set of
> phase field equations.  We are using a Debian release with Linux
> kernel 2.6.26 on a machine with four amd64 processors.  There has
> been NO attempt to parallelize the code; in fact, to eliminate any
> possible slowdown due to interaction with openMPI, we built trilinos
> in its serial incarnation, as demonstrated in our trilinos configure
> command
>
> ./configure \
> CXXFLAGS="-O3" CFLAGS="-O3" FFLAGS="-O5 -funroll-all-loops -malign-double" \
> --prefix=/.../TRILINOS_SERIAL \
> --cache-file=config.cache \
> --with-cxxflags=-fPIC --with-cflags=-fPIC --with-fflags=-fPIC \
> --with-gnumake \
> --with-python=/usr/bin/python \
> --enable-epetra --enable-aztecoo --enable-pytrilinos --enable-ml \
> --enable-ifpack --enable-amesos --enable-galeri
>
> Our FiPy runs consist of hundreds of thousands of iterations, and
> in my opinion the performance hit that we see can not be set down
> to the logistical or administrative details that may explain the additional
> time for the completion of the test suite scripts.
>
> A 4x speed-up, as others have reported, would bring a gleam to our
> weary eyes and a ray of sunshine to our dreary researches, since
> at present each datapoint takes a day to collect.  Thanks for any
> suggestions or advice... especially the useful kind that solves the problem.
>
> Regards,
> J. Gathright
>
>
> P.S. Just curious:  would it be difficult to measure the time the tests
> spend in their critical portions?  If the value reported by
> "setup.py test" is only elapsed clock time and non-solver housekeeping
> segments of the scripts can significantly influence the results, it may be
> helpful to have a figure of merit targeted toward the really crucial code
> blocks.
>
>
>



-- 
Daniel Wheeler
