On May 24, 2010, at 11:49 PM, jtg wrote:

> Continuing with Dr. Guyer's anisotropy example, here are our times
> for his 3 mesh sizes (same 10 steps, but no viewer):
>
>     solver            100x100   500x500   1000x1000
>     ------------------------------------------------
>     pysparse            2.4        43       171 seconds
>     trilinos 1 proc     4.5       103       480
>     trilinos 2 proc     2.9        54       263
>     trilinos 4 proc     3.3        37       192
>
> Note the near equivalence of the trilinos-1 to pysparse ratio,
> between Guyer's result and the above, in the case of the 1000x1000
> mesh: 2.9 and 2.8. It is our results for both 2 and 4 processors
> working on the 500x500 mesh that seem "inconsistent" in this
> little table.
If anything is inconsistent, I suspect it's my results. I ran my tests on a little laptop with a GUI-intensive OS and continued to try to get other work done while the tests were running.

I've done some profiling in the perverse hope that we're just doing something silly when setting up our Trilinos solver, but I'm not seeing it. For 10 steps of a 500x500 problem without viewers, both the PySparse and Trilinos runs spend 53 s in Term._prepareLinearSystem(). That much is good; they should be doing virtually (if not precisely) the same thing.

The PySparse run spends another 7.3 s in pysparse.itsolvers.pcg(); lots of little things bring the total run time to 65 s. In contrast, the Trilinos run spends

    43 s in ComputePreconditioner()
    40 s in Iterate()
    17 s in InsertGlobalValues()
     7 s in FillComplete()

for a total run time of 175 s. At least 60% of the time is thus spent in atomic Trilinos calls that we have no control over. Even the matrix solve itself (Iterate()) is over twice as slow as PySparse's.

At this point, my only hope is that we're passing very poor parameters to the preconditioner, but we'll need to spend some time exploring that.
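For anyone who wants to reproduce this kind of breakdown, here is a minimal sketch of the profiling approach using Python's standard-library cProfile/pstats, sorted by cumulative time. The workload function is a hypothetical stand-in for the actual FiPy step loop (reproducing the real 500x500 run isn't self-contained); in practice you would profile the call that invokes Term._prepareLinearSystem() and the solver.

```python
import cProfile
import io
import pstats

def solve_step():
    """Hypothetical stand-in for one FiPy solve step."""
    total = 0
    for i in range(100000):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
solve_step()
profiler.disable()

# Dump the top entries by cumulative time into a string for inspection.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time is what surfaces the expensive leaf calls (ComputePreconditioner(), Iterate(), and so on) even when they are buried several frames below the time-stepping loop.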
