Yes. It's fairly flat, which I think is due to BoomerAMG: the iterate
counts come out to [8, 8, 9, 11, 12].
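For what it's worth, this is roughly how the solver is set up -- a minimal
sketch in current PETSc API terms, not my actual code (error checking
omitted; older PETSc also takes a MatStructure argument in KSPSetOperators):

    #include <petscksp.h>

    /* Sketch: BiCGSTAB + hypre BoomerAMG with rtol = 1e-12; report the
       iterate count after the solve.  A, b, x are assumed assembled. */
    PetscErrorCode SolveAndCount(Mat A, Vec b, Vec x)
    {
      KSP      ksp;
      PC       pc;
      PetscInt its;

      KSPCreate(PETSC_COMM_WORLD, &ksp);
      KSPSetOperators(ksp, A, A);     /* pre-3.5 PETSc adds a flag here */
      KSPSetType(ksp, KSPBCGS);       /* BiCGSTAB */
      KSPGetPC(ksp, &pc);
      PCSetType(pc, PCHYPRE);
      PCHYPRESetType(pc, "boomeramg");
      KSPSetTolerances(ksp, 1.0e-12, PETSC_DEFAULT, PETSC_DEFAULT,
                       PETSC_DEFAULT);
      KSPSetFromOptions(ksp);         /* honor -ksp_* / -pc_* overrides */
      KSPSolve(ksp, b, x);
      KSPGetIterationNumber(ksp, &its);
      PetscPrintf(PETSC_COMM_WORLD, "KSP iterations: %d\n", (int)its);
      KSPDestroy(&ksp);
      return 0;
    }

The same setup can be selected entirely from the command line with
-ksp_type bcgs -pc_type hypre -pc_hypre_type boomeramg -ksp_rtol 1e-12,
plus -ksp_converged_reason or -ksp_monitor to see the counts.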
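To put the timings quoted below in weak-scaling terms, here is a throwaway
efficiency check (numbers copied from my message below; efficiency is
T(1)/T(P), since the work per process is fixed):

    #include <stdio.h>

    /* Weak-scaling efficiency from the per-iteration KSPSolve timings
       quoted below in this thread. */
    int main(void)
    {
      const int    p[] = {1, 4, 16, 64, 256};
      const double t[] = {0.231, 0.238, 0.296, 0.451, 0.599}; /* s/it */

      for (int i = 0; i < 5; ++i)
        printf("P = %3d: %.3f s/it, efficiency = %2.0f%%\n",
               p[i], t[i], 100.0 * t[0] / t[i]);
      return 0;
    }

That works out to roughly 97%, 78%, 51%, and 39% efficiency at 4, 16, 64,
and 256 procs.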
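Also, following the -log_summary discussion below, I now wrap the solve in
its own logging stage so that the MatMult and VecMDot lines for the solve
are reported separately from setup -- a drop-in around the KSPSolve call
in the sketch above:

    /* Sketch: isolate KSPSolve in its own stage so -log_summary reports
       MatMult / VecMDot time and MF/s for the solve phase by itself. */
    PetscLogStage stage;

    PetscLogStageRegister("Solve", &stage);
    PetscLogStagePush(stage);
    KSPSolve(ksp, b, x);
    PetscLogStagePop();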
On Fri, May 18, 2012 at 5:06 PM, Matthew Knepley <knepley at gmail.com> wrote:

> On Fri, May 18, 2012 at 8:02 PM, Mohammad Mirzadeh <mirzadeh at gmail.com> wrote:
>
>> Yes, I'm looking at weak scalability right now. I'm using BiCGSTAB with
>> BoomerAMG (all default options except for rtol = 1e-12). I've not looked
>> into MF/s yet, but I'll surely do so to see whether I'm having any
>> problem there. So far, just timing the KSPSolve, I get [0.231, 0.238,
>> 0.296, 0.451, 0.599] seconds/KSP iteration for p = [1, 4, 16, 64, 256]
>> with almost 93K nodes (matrix rows) per proc. That is not bad, I guess,
>> but it still increased by a factor of about 2.6 for 256 procs. The
>> problem is, I don't know how good or bad this is. In fact, I'm not even
>> sure that is a valid question to ask, since it may be very problem
>> dependent.
>>
>> Something I just thought about: how crucial is the matrix structure for
>> KSP solvers? The nodes have bad numbering, and I do partitioning to get
>> a better one here.
>
> Did you look at the number of iterates?
>
>    Matt
>
>> On Fri, May 18, 2012 at 4:47 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>
>>> On Fri, May 18, 2012 at 7:43 PM, Mohammad Mirzadeh <mirzadeh at gmail.com> wrote:
>>>
>>>> I see; well, that's a fair point. So I have my timing results obtained
>>>> via -log_summary; what should I be looking at for MatMult? Should I be
>>>> looking at wall timings, or do I need to look at MFlops/s? I'm sorry,
>>>> but I'm not sure which measure I should use to determine scalability.
>>>
>>> Time is only meaningful in isolation if I know how big your matrix is,
>>> but you can obviously take the ratio to see how it is scaling. I am
>>> assuming you are looking at weak scalability, so it should remain
>>> constant. MF/s will let you know how the routine is performing
>>> independent of size, and thus is an easy way to see what is happening.
>>> It should scale like P, and when that drops off you have insufficient
>>> bandwidth. VecMDot is a good way to look at the latency of reductions
>>> (assuming you use GMRES). There is indeed no good guide to this. Barry
>>> should write one.
>>>
>>>    Matt
>>>
>>>> Also, is there any general advice one could give in terms of using the
>>>> resources, compiler flags (beyond -O3), etc.?
>>>>
>>>> Thanks,
>>>> Mohammad
>>>>
>>>> On Fri, May 18, 2012 at 4:18 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>>>
>>>>> On Fri, May 18, 2012 at 7:06 PM, Mohammad Mirzadeh <mirzadeh at gmail.com> wrote:
>>>>>
>>>>>> Hi guys,
>>>>>>
>>>>>> I'm trying to generate scalability plots for my code and to do
>>>>>> profiling and fine tuning. In doing so, I have noticed that some of
>>>>>> the factors affecting my results are rather subtle. For example, I
>>>>>> found the other day that using all of the cores on a single node is
>>>>>> somewhat (50-60%) slower than using only half of them, which I
>>>>>> suspect is due to memory bandwidth and/or other hardware-related
>>>>>> issues.
>>>>>>
>>>>>> So I thought I would ask whether there is any example in PETSc that
>>>>>> has been tested for scalability and documented. Basically, I want to
>>>>>> use such a test as a benchmark to compare my results against. My own
>>>>>> test code is currently a linear Poisson solver on an adaptive
>>>>>> quadtree grid, and it involves non-trivial geometry (well, basically
>>>>>> a circle for the boundary, but still not a simple box).
>>>>>
>>>>> Unfortunately, I do not even know what that means.
>>>>> We can't guarantee a certain level of performance, because it depends
>>>>> not only on the hardware but also on how you use it (as is evident in
>>>>> your case). In a perfect world, we would have an abstract model of
>>>>> the computation (available for MatMult) and of your machine (not
>>>>> available anywhere), and we would automatically work out the
>>>>> consequences and tell you what to expect. Instead, today we tell you
>>>>> to look at a few key indicators, like the MatMult event, to see what
>>>>> is going on. When MatMult stops scaling, you have run out of
>>>>> bandwidth.
>>>>>
>>>>>    Matt
>>>>>
>>>>>> Thanks,
>>>>>> Mohammad
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
