Hi Jed,
On 5/1/2012 4:59 PM, Jed Brown wrote:
> On Thu, Jan 5, 2012 at 09:41, TAY wee-beng <zonexo at gmail.com> wrote:
>
> > I just did a -log_summary and attached the text file, running across
> > 8 and 16 processors. My most important concern is whether the load
> > is balanced across the processors.
> >
> > In the 16-processor case, the time ratios for many events are higher
> > than 1, reaching up to 6.8 for VecScatterEnd
>
> This takes about 1% of the run time and it's scaling well, so don't
> worry about it.
>
> > and 132.1 (?) for MatAssemblyBegin.
>
> This is about 2% of run time, but it's not scaling. Do you compute a
> lot of matrix entries on processes that don't own the rows?

I only compute rows which the processor owns. Could it be the memory
allocation? I'll check on that (see the preallocation sketch below).

> Most of your solve time is going into PCSetUp() and PCApply(), both of
> which are getting more expensive as you add processes. These are more
> than 10x the time spent in MatMult(), and MatMult() takes slightly
> less time on more processes, so the increase isn't entirely due to
> memory issues.
>
> What methods are you using?

What do you mean by methods? I am doing 3D CFD on a Cartesian grid,
using a fractional step method which solves the momentum and Poisson
equations. I construct the linear systems and insert the entries into
PETSc matrices/vectors, then solve them with BiCGStab and hypre's
BoomerAMG respectively (see the solver-setup sketch below). Why are
PCSetUp() and PCApply() using more time?

> > However, for the flops, the ratios are 1 and 1.1, so which is more
> > important to look at? Time or flops?
>
> If you would rather do a lot of flops than solve the problem in a
> reasonable amount of time, you might as well use dense methods. ;-)

Thanks again!
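
P.S. Regarding the memory-allocation question above: here is a minimal
sketch of assembly restricted to locally owned rows, with preallocation,
since repeated mallocs inside MatSetValues() are a common cause of an
expensive, non-scaling MatAssemblyBegin. It assumes an MPIAIJ matrix and
a 7-point stencil; the stencil widths and the BuildStencilRow helper are
hypothetical placeholders for the actual discretization.

#include <petscmat.h>

/* Hypothetical application routine: fills one row of the stencil and
   returns the number of nonzeros in that row. */
extern PetscInt BuildStencilRow(PetscInt row, PetscInt cols[], PetscScalar vals[]);

PetscErrorCode AssembleOwnedRows(MPI_Comm comm, PetscInt N, Mat *Aout)
{
  Mat            A;
  PetscInt       rstart, rend, i;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatCreate(comm, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
  ierr = MatSetType(A, MATMPIAIJ);CHKERRQ(ierr);
  /* Preallocate: at most 7 nonzeros per row in the diagonal block and 3
     in the off-diagonal block for a 7-point stencil -- adjust to the
     real discretization. Without preallocation, MatSetValues() mallocs
     repeatedly and assembly becomes expensive. */
  ierr = MatMPIAIJSetPreallocation(A, 7, NULL, 3, NULL);CHKERRQ(ierr);

  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {  /* only rows this process owns */
    PetscInt    cols[7];
    PetscScalar vals[7];
    PetscInt    n = BuildStencilRow(i, cols, vals);
    ierr = MatSetValues(A, 1, &i, n, cols, vals, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  *Aout = A;
  PetscFunctionReturn(0);
}

Running with -info prints the number of mallocs used during
MatSetValues(), which shows whether the preallocation is sufficient.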
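
P.P.S. And a minimal sketch of the Poisson solver setup described above,
assuming the PETSc 3.2-era API. Pairing BiCGStab with BoomerAMG as the
preconditioner in a single KSP is my assumption (in PETSc, hypre's AMG
is normally used as a PC); the function name and the assumption that A,
b, and x already exist are placeholders.

#include <petscksp.h>

PetscErrorCode SolvePoisson(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  /* SAME_NONZERO_PATTERN: the matrix structure is fixed on a Cartesian
     grid, so the preconditioner can reuse its symbolic setup. */
  ierr = KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPBCGS);CHKERRQ(ierr);       /* BiCGStab */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCHYPRE);CHKERRQ(ierr);
  ierr = PCHYPRESetType(pc, "boomeramg");CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);  /* enable -ksp_/-pc_ tuning */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

With KSPSetFromOptions() in place, BoomerAMG can also be tuned from the
command line, e.g. -pc_hypre_boomeramg_strong_threshold 0.5, which is
often suggested for 3D problems.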
