On Fri, Feb 27, 2015 at 10:29 AM, Todd Gamblin <[email protected]> wrote:
> Barry:
>
> I remember that ALCF attempted to address this problem at one point or
> another with "tmpicc" compiler wrappers. As I remember the idea was that
> they stored the compiler's tmp files in some local storage on the login
> node. I think that was back when ANL's main machine was Intrepid, and I
> don't know where those compilers went on Mira. Do you remember this?
>
> In general I'm not sure that just moving the compiler temp files is going
> to cut it. I think you really want to do the build out of /tmp or some
> other filesystem. Spack does this automatically for its builds -- on LLNL
> machines I build much faster by just finding the local tmp space and using
> it for all the builds. Spack is also able to put the entire build out in
> tmp space, because you just tell it the software name, and it handles the
> details of where it is downloaded and expanded. It's not perfect, because
> it looks at $TMP, $TMPDIR, and some other LLNL-specific places.
>

We already do this in configure.

   Matt

> If it turns out that configuring NFS (or in ANL's case, I think it's GPFS)
> to be fast on a set of loaded login nodes is not feasible, it might be
> nice to have some kind of recommendations for build staging.
>
> -Todd
>
>
> On 2/27/15, 8:09 AM, "David E. Bernholdt" <[email protected]> wrote:
>
> >Barry, thanks, this is extremely helpful. I'll have the OLCF folks
> >contact Nathan if they need any further info or have other experiments
> >to try.
> >
> >On 02/27/2015 11:03 AM, Barry Smith wrote:
> >>
> >> Same text also in the attachment.
> >>
> >> Barry
> >>
> >> David,
> >>
> >> Nathan Collier has kindly run a test on Titan, Satish on Mira and
> >> Hopper, and Victor on Ranger with a basic optimized build of PETSc (all
> >> C code)
> >>
> >> Please find below some configure and make timings from the latest
> >> PETSc master.
> >>
> >> The Titan times for both configure and make are unacceptable.
> >> For total build time Titan is 3.5 times slower than Mira and Hopper and
> >> at least 10 times slower than laptops. The "time" results on Titan are
> >> disturbing
> >>
> >> configure
> >> real 14m32.169s (since the user + sys time is much less than real
> >>                  time, what is it waiting on?)
> >> user 1m51.527s
> >> sys  3m40.734s
> >>
> >> make
> >> real 15m56.004s
> >> user 8m8.971s
> >> sys  52m42.734s (why so much?)
> >>
> >> which I read as either the filesystem or the compiler system (location
> >> of the compilers, license server of the compilers, ...) is really badly
> >> configured.
> >>
> >> The Hopper configure time with the default
> >> TMPDIR=/scratch/scratchdirs/balay is unacceptable but if you actually
> >> use the real /tmp it becomes somewhat reasonable.
> >>
> >> Feel free to share this information with local experts,
> >>
> >>
> >> I suggest you view the below table in a fixed width font editor like
> >> Emacs or Vi so the columns line up.
> >>
> >>                        configure  make    Total   compilers  filesystem
> >>                        time       time
> >>
> >> Titan                  14m32s     15m56s  30m28s  Intel 14   /lustre/atlas1/geo103/proj-shared/
> >>                        41m38s     9m5s    50m43s             /ccs/home/ (no load on login node)
> >>                        13m                                   (no load on a different login node)
> >>
> >> Mira                   6m59s      1m49s   8m48s   IBM        /gpfs/mira-home/
> >>
> >> Hopper                 23m17s     1m45s   25m2s              /global/u2/b/balay/petsc.clone, default TMPDIR=/scratch/scratchdirs/balay
> >>                        6m17s      1m39s   7m57s              manually set TMPDIR=/tmp
> >>
> >> NSF Ranger UT Austin   5m10s      1m28s   6m38s              default, whatever it is
> >>
> >> Linux laptop           53s        1m13s   2m6s    Gnu        compile and compiler local
> >>
> >> Apple laptop           1m14s      54s     2m8s    clang      compile and compiler local
> >>
> >> Linux workstation      1m11s      22s     1m33s   Gnu        compile and compiler local
> >>                        1m37s      29s     2m6s    Gnu        compile directory local; compiler directory remote
> >>                        3m11s      25s     3m36s   Intel 13   compile directory local; compiler directory remote
> >>
> >> PETSc has about 1000 source files that need compiling
> >>
> >> The configure is essentially sequential, the make extremely parallel.
> >>
> >> During configure the source code is on the listed file system, all .o
> >> and executables are on /tmp
> >>
> >> During the make the source code and all .o are on the listed file system
> >>
> >>
> >>> On Feb 25, 2015, at 11:23 AM, David E. Bernholdt
> >>> <[email protected]> wrote:
> >>>
> >>> At the kick-off meetings, one of the general complaints I heard
> >>> expressed about the facilities was the slow build times compared to
> >>> personal systems.
> >>>
> >>> If you have this complaint and are an OLCF user, and are willing to
> >>> work with us a little to try to understand your experience in more
> >>> detail, please contact me (individually, not reply-all).
> >>>
> >>> This is a facility thing, not an IDEAS thing, so I can't speak for the
> >>> other facilities. But we've recently received some other similar
> >>> comments, and we're trying to dig into what's happening.
> >>>
> >>> Thanks
> >>> --
> >>> David E. Bernholdt | Email: [email protected]
> >>> Oak Ridge National Laboratory | Phone: +1 865-574-3147
> >>> http://www.csm.ornl.gov/~bernhold | Fax: +1 865-576-5491
> >>> _______________________________________________
> >>> Ideas-team mailing list
> >>> [email protected]
> >>> https://lists.mcs.anl.gov/mailman/listinfo/ideas-team
>

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
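
The "what is it waiting on?" question about the Titan `time` output quoted above can be made concrete with a little arithmetic. The Python sketch below is only an illustration and is not from the thread; the helper names `parse_duration` and `cpu_to_wall` are hypothetical. It converts the quoted real/user/sys figures to seconds and computes (user + sys) / real. For the essentially sequential configure, a ratio well below 1 suggests most of the wall time was spent off the CPU. For the parallel make, the share of sys in the total CPU time shows how much was charged to the kernel rather than to the compiler itself.

```python
import re


def parse_duration(text):
    """Convert a `time`-style duration such as '14m32.169s' or '22s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + float(seconds)


def cpu_to_wall(real, user, sys):
    """Return (user + sys) / real for one set of `time` figures."""
    return (parse_duration(user) + parse_duration(sys)) / parse_duration(real)


# Figures copied from the Titan results quoted in the thread.
configure_ratio = cpu_to_wall("14m32.169s", "1m51.527s", "3m40.734s")
make_ratio = cpu_to_wall("15m56.004s", "8m8.971s", "52m42.734s")

# Fraction of the make CPU time that was system (kernel) time.
make_sys = parse_duration("52m42.734s")
make_user = parse_duration("8m8.971s")
make_sys_share = make_sys / (make_sys + make_user)

print(f"configure CPU/wall: {configure_ratio:.2f}")
print(f"make CPU/wall:      {make_ratio:.2f}")
print(f"make sys share:     {make_sys_share:.2f}")
```

For these figures the configure ratio comes out near 0.38, so roughly 60% of the configure wall time is not accounted for by CPU work, and sys time is about 87% of the CPU time charged to the make. That is consistent with the reading in the thread that the filesystem or compiler setup, rather than the compilation itself, dominates.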

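
As a rough illustration of the build-staging idea discussed above (look at `$TMPDIR` and `$TMP`, then build out of that space), here is a minimal Python sketch. It is an assumption-laden example, not Spack's or PETSc configure's actual logic; the function name `pick_build_stage` and its parameters are hypothetical. Note the caveat from the Hopper numbers: an environment variable may itself point at a slow shared filesystem, so the result still needs a sanity check by a human or by site policy.

```python
import os
import tempfile


def pick_build_stage(candidates=("TMPDIR", "TMP"), fallback="/tmp", environ=None):
    """Return the first existing, writable directory named by the candidate
    environment variables, or `fallback` if none qualifies.

    Hypothetical helper: it does not check whether the directory is on a
    local disk, which is exactly the weakness noted in the thread.
    """
    env = os.environ if environ is None else environ
    for name in candidates:
        path = env.get(name)
        if path and os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return fallback


if __name__ == "__main__":
    stage_root = pick_build_stage()
    # Create a private scratch directory for this build under the chosen root.
    build_dir = tempfile.mkdtemp(prefix="build-stage-", dir=stage_root)
    print(f"staging build in {build_dir}")
```

Passing `environ` explicitly makes it easy to try the Hopper case by hand, for example comparing a run with `TMPDIR=/scratch/scratchdirs/balay` against one with `TMPDIR=/tmp`.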