Dinesh,

There is another thing we can do for the SSOR-CG in parallel: write the code for Eisenstat's trick. This would save roughly 40% of the flops in the MatMult()/MatRelax() and require only one pass through the matrix entries in memory, instead of the current two.
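
For reference, here is a minimal sketch of the identity behind Eisenstat's trick, in plain Python on a small dense matrix. It is not the PETSc code, and the function names are made up for illustration. Write A = L + D + U and Dt = D/omega. Then A = (Dt + L) + (Dt + U) - K with K = (2/omega - 1) D, so the split-preconditioned operator (Dt + L)^{-1} A (Dt + U)^{-1} can be applied with two triangular solves and no separate multiply by A. The diagonal scaling of the full SSOR preconditioner and the matching transformation of the right-hand side for CG are left out here; the sketch only checks that the two ways of applying the operator agree.

```python
def upper_solve(A, dt, b):
    """Solve (Dt + U) z = b by back substitution (U = strict upper part of A)."""
    n = len(b)
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(A[i][j] * z[j] for j in range(i + 1, n))
        z[i] = s / dt[i]
    return z


def lower_solve(A, dt, b):
    """Solve (Dt + L) z = b by forward substitution (L = strict lower part of A)."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        s = b[i] - sum(A[i][j] * z[j] for j in range(i))
        z[i] = s / dt[i]
    return z


def matvec(A, x):
    """Dense matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def apply_naive(A, omega, x):
    """(Dt + L)^{-1} A (Dt + U)^{-1} x: upper solve, full multiply, lower solve."""
    dt = [A[i][i] / omega for i in range(len(x))]
    t = upper_solve(A, dt, x)
    return lower_solve(A, dt, matvec(A, t))


def apply_eisenstat(A, omega, x):
    """Same operator without the multiply by A.

    With t = (Dt + U)^{-1} x we have A t = (Dt + L) t + x - K t,
    hence (Dt + L)^{-1} A t = t + (Dt + L)^{-1} (x - K t).
    """
    n = len(x)
    dt = [A[i][i] / omega for i in range(n)]
    t = upper_solve(A, dt, x)                        # touches the upper triangle
    k = [(2.0 / omega - 1.0) * A[i][i] for i in range(n)]
    r = [x[i] - k[i] * t[i] for i in range(n)]
    z = lower_solve(A, dt, r)                        # touches the lower triangle
    return [t[i] + z[i] for i in range(n)]
```

In this form each triangle of the matrix is read once per application, which is where the single pass over the matrix entries comes from. The dense loops are only for checking the algebra; any real savings would come from doing the same thing on the sparse storage format.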

It is a bit of work, debugging, and testing: several good days. I don't know who has time.

Barry

On Sun, 5 Aug 2007, Dinesh Kaushik wrote:

> Barry,
>
> Thank you very much for the i-node version of MatRelax. I will test it out in
> a few days on Jaguar. Right now, I am busy writing the INCITE proposal for
> UNIC. This reminds me to ask you for a two page bio (in Word or pdf format).
> It can be very close to what you used for the SciDAC last year (see
> instructions at
> http://hpc.science.doe.gov/allocations/incite/instructions.do).
>
> I am able to run the full core mesh (33 groups) on up to 4096 processors with
> 57% efficiency (wrt 512 processors). The problem size (P3) per processor is
> quite small with 4096 subdomains. The higher orders will show better
> scalability but take too long to converge and getting thro' the queue has
> become very slow on Jaguar lately. We will discuss the Computational Readiness
> (scalability) section of the proposal before it gets submitted on Wednesday.
>
> Thanks,
>
> Dinesh
>
> Barry Smith wrote:
> > Dinesh,
> >
> > I have completed and pushed to petsc-dev an i-node version of
> > MatRelax(). Please use this version in all your future runs with UNIC. It
> > should be a bit faster and maybe save a few iterations. Please let me know
> > how it performs. If you see worse or failed convergence
> > please let me know immediately so it can be debugged and fixed.
> >
> > Barry
> >
> > This will have no effect on CFDShip since that code does not have i-nodes.
> >
> >
