I realized at least a bit of the problem. My Trilinos 10.2 build has

  -D CMAKE_BUILD_TYPE:STRING=DEBUG \

set (as do many of the example scripts that I cadged this from). Changing this 
to 

  -D CMAKE_BUILD_TYPE:STRING=RELEASE \

enables optimizations (and also produces consistent segfaults when 
built with MPI support). 

With a *serial* RELEASE build, I now get:

                                 total    _prepareLinearSystem    _solve (i)    precon
PySparse default (PCG/SSOR)       73.0         60.1                 8.9
Trilinos default (GMRES/DD)      143.8         57.2                83.7          27.0
Trilinos, PCG, no precon          97.5         55.9                37.7

With optimization, the Trilinos solves are about 30% faster (when I scale by the 
new and old _prepareLinearSystem and PySparse._solve times). PySparse still leaves 
Trilinos in the dust, but it's something. Now to figure out why I can't build it 
against MPI.
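
Roughly how I select the non-default Trilinos solvers, for the record; the class 
names below are from memory and may not match every FiPy/Trilinos version, so 
treat this as a sketch rather than the exact incantation:

    # Sketch: forcing a particular Trilinos solver/preconditioner in FiPy.
    # LinearPCGSolver, LinearGMRESSolver, and JacobiPreconditioner are assumed
    # names for the FiPy Trilinos wrappers; adjust to whatever your version exposes.
    from fipy.solvers.trilinos import LinearPCGSolver, LinearGMRESSolver
    from fipy.solvers.trilinos.preconditioners import JacobiPreconditioner

    pcgNoPrecon = LinearPCGSolver(precon=None)                      # CG, no preconditioner
    gmresJacobi = LinearGMRESSolver(precon=JacobiPreconditioner())  # GMRES + Jacobi

    phaseEq.solve(phase, dt=dt, solver=pcgNoPrecon)
    heatEq.solve(dT, dt=dt, solver=pcgNoPrecon)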





On May 26, 2010, at 5:21 PM, I wrote:

> 
> 
> On May 25, 2010, at 9:37 PM, Daniel Wheeler wrote:
> 
>> Maybe try without any preconditioner or with only Jacobi.
> 
> Here's what I get for a variety of configurations (2 runs each, showing 
> pretty good run-to-run consistency). 
> 
> The default solver and preconditioner definitely seem bad, at least for this 
> problem. Of this batch, Trilinos' PCG solver with no preconditioner at all is 
> preferred, but is still substantially slower than PySparse's PCG.
> 
>                                  total    _prepareLinearSystem    _solve (i)    precon
> PySparse default (PCG)            71.9         49.4                 7.3
> PySparse default                  61.6         50.2                 7.4
> Trilinos default (ii)            170.6         50.4               116.2          42.2
> Trilinos default                 168.6         50.3               114.7          42.0
> Trilinos, GMRES, no precon (iii) 117.2         50.7                62.4
> Trilinos, GMRES, no precon       116.6         50.2                62.7
> Trilinos, GMRES, Jacobi          123.9         51.0                69.1
> Trilinos, GMRES, Jacobi          120.2         50.2                66.0
> Trilinos, PCG, no precon         101.1         49.9                47.2
> Trilinos, PCG, no precon         101.9         51.2                46.9
> 
> 
> (iv)
> mpirun -np 2, GMRES, no precon    75.5         29.1                42.5
> mpirun -np 2, PCG, no precon      68.4         31.7                32.6
> 
> 
> 
> (i) includes preconditioning
> 
> (ii) GMRES, MultiLevelDDPreconditioner
> 
> (iii) obtained with 
> 
> solver = DefaultSolver(precon=None)
> 
> and
> 
>    phaseEq.solve(phase, dt=dt, solver=solver)
>    heatEq.solve(dT, dt=dt, solver=solver)
> 
> (iv) the profiling results for the parallel runs are dubious. They likely 
> represent only a single process, at best. Moreover, the .prof file was garbled 
> the second time I tried this, for both configurations. Still, they all seem 
> to scale consistently with the wall-clock times.
> 
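
Regarding the garbled .prof files in (iv): one way to keep the processes from 
clobbering each other's output would be to write a separate profile per process, 
keyed by PID. A minimal sketch (main() here is just a stand-in for whatever drives 
the phaseEq/heatEq sweep):

    import cProfile, os

    # One .prof file per process, keyed by PID, so concurrent MPI ranks
    # don't overwrite each other's profiling output.
    cProfile.run('main()', 'fipy_run.%d.prof' % os.getpid())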


