On Sat, Mar 15, 2014 at 4:01 AM, Karl Rupp <[email protected]> wrote:
> Hi William,
>
> I couldn't find anything really suspicious in the logs, so the lack of
> scalability may be due to hardware limitations. Did you run all MPI
> processes on the same machine? How many CPU sockets? If it is a
> single-socket machine, chances are good that you saturate the memory
> channels pretty well with one process already. With higher process counts
> the cache per process is reduced, thus reducing cache reuse. This is the
> only reasonable explanation why the execution time for VecMDot goes up
> from e.g. 7 seconds for one and two processes to about 24 for four and
> eight processes.
>
> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers
>
> I suggest you try to run the same code across multiple machines if
> possible; you should see better scalability there. Also, for benchmarking
> purposes, try replacing the ILU preconditioner with e.g. Jacobi. This
> should give you better scalability (provided that the solver still
> converges, of course...).

BJacobi/ASM would be the next thing to try, since it scales in terms of
communication but not in terms of iteration count. Eventually you will
want a nice multilevel solver for your problem.

   Matt

> Best regards,
> Karli
>
>
> On 03/14/2014 10:45 PM, William Coirier wrote:
>
>> I've written a parallel, finite-volume, transient thermal conduction
>> solver using PETSc primitives, and so far things have been going great.
>> Comparisons to theory for a simple problem (transient conduction in a
>> semi-infinite slab) look good, but I'm not getting very good parallel
>> scaling behavior with the KSP solver. Whether I use the default KSP/PC
>> or other sensible combinations, the time spent in KSPSolve does not seem
>> to scale well at all.
>>
>> I seem to have loaded up the problem well enough. The PETSc
>> logging/profiling has been really useful for reworking various code
>> segments, and right now the bottleneck is KSPSolve; I can't seem to
>> figure out how to get it to scale properly.
>>
>> I'm attaching output produced with -log_summary, -info, -ksp_view and
>> -pc_view all specified on the command line for 1, 2, 4 and 8 processes.
>>
>> If you guys have any suggestions, I'd definitely like to hear them! And
>> I apologize in advance if I've done something stupid. All the
>> documentation has been really helpful.
>>
>> Thanks in advance...
>>
>> Bill Coirier
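
The preconditioners discussed above can be compared from the command line
without recompiling, as long as the code calls KSPSetFromOptions(); a
minimal sketch of that call pattern follows at the end of this post. The
invocations below are only a sketch ("./solver" stands in for the actual
executable, and -ksp_type cg assumes the conduction operator is symmetric
positive definite):

  # Karli's suggestion: Jacobi, the cheapest point-wise preconditioner
  mpiexec -n 8 ./solver -ksp_type cg -pc_type jacobi -log_summary

  # Matt's suggestion: block Jacobi or additive Schwarz with ILU on each block
  mpiexec -n 8 ./solver -ksp_type cg -pc_type bjacobi -sub_pc_type ilu -log_summary
  mpiexec -n 8 ./solver -ksp_type cg -pc_type asm -sub_pc_type ilu -log_summary

  # One multilevel option: PETSc's algebraic multigrid
  mpiexec -n 8 ./solver -ksp_type cg -pc_type gamg -log_summary

Comparing the KSPSolve line of -log_summary across these runs (and across
process counts) should help separate preconditioner cost from the
memory-bandwidth effects mentioned above.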

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
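
A minimal, generic sketch (assumed code, not the solver discussed above) of
a KSP solve that honors the options shown earlier at runtime:

  #include <petscksp.h>

  /* Sketch only: assumes Mat A and Vecs b, x are assembled elsewhere.
     KSPSetFromOptions() makes -ksp_type, -pc_type, -sub_pc_type, etc.
     take effect without recompiling. */
  PetscErrorCode SolveWithRuntimeOptions(Mat A, Vec b, Vec x)
  {
    KSP            ksp;
    PetscErrorCode ierr;

    ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
    /* PETSc >= 3.5 signature; older releases take an extra MatStructure argument */
    ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
    ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
    return 0;
  }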
