I don't feel as strongly about this as Ralph does, but I do think the new behavior violates the "principle of least surprise".

-Paul

Ralph Castain wrote:
Been thinking about this more today, and I actually find this new "feature" disturbing. It bothers me that OMPI is now dictating that it will do a parallel build without my knowledge unless I specifically tell it not to. If it were technically possible, would we next force "make -j4"?? How would the developer community feel if the authors of "make" suddenly decided that it would run 4 parallel threads under the covers unless you specifically told it not to?

What bugs me here is that I now have to remember to set something in my environment to tell OMPI "you don't get to hog all my processors". Maybe others twiddle their thumbs and leave the computer alone while OMPI builds, or maybe they rarely build - but I build frequently, and I am always multi-tasking my time (running Word, Powerpoint, etc.). So having OMPI default to running a parallel build is more than a little annoying - frankly, it pisses me off.

I really feel that this "feature" should be implemented as an option passed to autogen instead of a hidden forced behavior. If someone wants to run a parallel build, then by all means let them ask for it (a la "make -j4"). But don't just -do- it.
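
For illustration only, the opt-in behavior proposed here could be sketched roughly as follows. This is not code from autogen.pl: the `--jobs` flag and the `resolve_jobs` helper are hypothetical names, and Python is used purely as a sketch language. The only point is the precedence: an explicit request wins, an existing AUTOMAKE_JOBS setting is respected, and otherwise the run stays serial.

```python
import argparse
from typing import List, Mapping


def resolve_jobs(argv: List[str], env: Mapping[str, str]) -> int:
    """Pick the degree of parallelism without ever choosing it silently."""
    parser = argparse.ArgumentParser()
    # Hypothetical option; the thread only says "an option passed to autogen".
    parser.add_argument("-j", "--jobs", type=int, default=None)
    args = parser.parse_args(argv)

    # 1. An explicit command-line request wins.
    if args.jobs is not None:
        return max(args.jobs, 1)
    # 2. Otherwise honor AUTOMAKE_JOBS if the user already set it.
    if env.get("AUTOMAKE_JOBS"):
        return max(int(env["AUTOMAKE_JOBS"]), 1)
    # 3. Otherwise stay serial: nobody asked for a parallel run.
    return 1
```

With this shape, someone who wants four workers asks for them explicitly (`--jobs 4` in this sketch), and everyone else gets the old serial behavior.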

Grrrr....
Ralph


On Fri, Sep 24, 2010 at 7:28 AM, Ralph Castain <r...@open-mpi.org> wrote:

    I hope you'll understand if I don't run that test while on the
    road...one battery yank per week is my limit :-)


    On Fri, Sep 24, 2010 at 4:40 AM, Jeff Squyres (jsquyres)
    <jsquy...@cisco.com> wrote:

        Also to clarify:

        - did autogen set am-jobs to 2 in your case?  (it should do
        that if lstopo is not found - it also limits itself to 4 at max)

        - in the same scenario, what happens if you manually set
        am-jobs to 1 and run autogen?  I.e., do you get the same
        heat/sluggishness?  I have experienced VMs causing this kind
        of behavior just because they are running - causing CPU and
        memory pressure.

        Sent from my PDA. No type good.

        On Sep 24, 2010, at 12:49 AM, "Ralph Castain"
        <r...@open-mpi.org> wrote:

        Sent to both for reference (see below)

        Just to clarify. It wasn't a deadlock situation, but rather
        that the machine was overloaded and running so hard that the
        response to keystrokes was multiple seconds. Thus, there was
        no way to shut it down from the keyboard or screen. Even a
        ctrl-c was just getting ignored for a very long time due to
        the overload.

        I was running vmware on my machine, and doing a heavy
        compile/build in it. On top of this, I had email, editor, and
        browsers running - and then kicked off a fresh build in a
        terminal window. With Jeff's default settings, this latter
        build thought it would be running alone on the machine, and
        promptly generated a number of threads equal to all the
        processors. Since they were already loaded, this drove the
        machine into the ground.

        My point is just that it is unwise to assume that the OMPI
        build can utilize all available processors. I'm sure it's
        fine for the MTT runs, especially on Jeff's machines as they
        are dedicated to that purpose - just not a good general
        assumption.


        HTH
        Ralph

        ====================================
        Output of "perl -V":

        Summary of my perl5 (revision 5 version 8 subversion 9)
        configuration:
          Platform:
            osname=darwin, osvers=10.2.0, archname=darwin-2level
            uname='darwin sjc-rcastain-87111.cisco.com 10.2.0 darwin kernel
        version 10.2.0: tue nov 3 10:37:10 pst 2009;
        root:xnu-1486.2.11~1release_i386 i386 '
            config_args='-des -D prefix=/opt/local -D
        scriptdir=/opt/local/bin -D cppflags=-I/opt/local/include -D
        ccflags=-O2 -arch x86_64 -D ldflags=-L/opt/local/lib -D
        vendorprefix=/opt/local -D man1ext=1pm -D man3ext=3pm -D
        cc=/usr/bin/gcc-4.2 -D ld=/usr/bin/gcc-4.2 -D
        man1dir=/opt/local/share/man/man1p -D
        man3dir=/opt/local/share/man/man3p -D
        siteman1dir=/opt/local/share/man/man1 -D
        siteman3dir=/opt/local/share/man/man3 -D
        vendorman1dir=/opt/local/share/man/man1 -D
        vendorman3dir=/opt/local/share/man/man3 -D
        inc_version_list=5.8.8 5.8.8/darwin-2level -U i_bind -U
        i_gdbm -U i_db'
            hint=recommended, useposix=true, d_sigaction=define
            usethreads=undef use5005threads=undef useithreads=undef
        usemultiplicity=undef
            useperlio=define d_sfio=undef uselargefiles=define
        usesocks=undef
            use64bitint=define use64bitall=define uselongdouble=undef
            usemymalloc=n, bincompat5005=undef
          Compiler:
            cc='/usr/bin/gcc-4.2', ccflags ='-O2 -arch x86_64
        -fno-common -DPERL_DARWIN -I/opt/local/include
        -no-cpp-precomp -fno-strict-aliasing -pipe
        -I/usr/local/include -I/opt/local/include',
            optimize='-O3',
            cppflags='-I/opt/local/include -no-cpp-precomp -O2 -arch
        x86_64 -fno-common -DPERL_DARWIN -I/opt/local/include
        -no-cpp-precomp -fno-strict-aliasing -pipe
        -I/usr/local/include -I/opt/local/include'
            ccversion='', gccversion='4.2.1 (Apple Inc. build 5646)
        (dot 1)', gccosandvers=''
            intsize=4, longsize=8, ptrsize=8, doublesize=8,
        byteorder=12345678
            d_longlong=define, longlongsize=8, d_longdbl=define,
        longdblsize=16
            ivtype='long', ivsize=8, nvtype='double', nvsize=8,
        Off_t='off_t', lseeksize=8
            alignbytes=8, prototype=define
          Linker and Libraries:
            ld='env MACOSX_DEPLOYMENT_TARGET=10.3 /usr/bin/gcc-4.2',
        ldflags ='-L/opt/local/lib -L/usr/local/lib'
            libpth=/usr/local/lib /opt/local/lib /usr/lib
            libs=-ldbm -ldl -lm -lutil -lc
            perllibs=-ldl -lm -lutil -lc
            libc=/usr/lib/libc.dylib, so=dylib, useshrplib=false,
        libperl=libperl.a
            gnulibc_version=''
          Dynamic Linking:
            dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef,
        ccdlflags=' '
            cccdlflags=' ', lddlflags='-L/opt/local/lib -bundle
        -undefined dynamic_lookup -L/usr/local/lib'


        Characteristics of this binary (from libperl):
          Compile-time options: PERL_MALLOC_WRAP USE_64_BIT_ALL USE_64_BIT_INT
                                USE_FAST_STDIO USE_LARGE_FILES USE_PERLIO
          Built under darwin
          Compiled at Feb 13 2010 13:19:33
          @INC:
            /opt/local/lib/perl5/site_perl/5.8.9/darwin-2level
            /opt/local/lib/perl5/site_perl/5.8.9
            /opt/local/lib/perl5/site_perl
            /opt/local/lib/perl5/vendor_perl/5.8.9/darwin-2level
            /opt/local/lib/perl5/vendor_perl/5.8.9
            /opt/local/lib/perl5/vendor_perl
            /opt/local/lib/perl5/5.8.9/darwin-2level
            /opt/local/lib/perl5/5.8.9
            .

        On Thu, Sep 23, 2010 at 10:26 PM, Ralf Wildenhues
        <ralf.wildenh...@gmx.de> wrote:

            Hello Ralph,

            wow, that's not good to hear.  I knew the perl ithreads implementation
            wasn't all that efficient, but causing a deadlock sounds like you have
            more trouble than just perl; at least I hope so.  For reference, can
            you send 'perl -V' output (if you like, to the bug-automake at gnu.org
            list).

            Thanks,
            Ralf

            * Ralph Castain wrote on Fri, Sep 24, 2010 at 03:12:16AM CEST:
            > I found one major negative to this change - it assumes that my build is
            > being done in exclusion of anything else on my computer. Unfortunately, this
            > is never true.
            >
            > So my laptop hemorrhaged itself into frozen silence, overheated to the point
            > of being burning hot, and had to have its battery yanked to stop the runaway
            > behavior. Not a really good thing.
            >
            > I would suggest you default this "heuristic" out, and let someone set it to
            > use multiple runs if-and-only-if they want it. Hate to cite the lowest
            > common denominator, but this was a very nasty surprise.
            >
            >
            >
            > On Wed, Sep 22, 2010 at 7:50 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
            >
            > > Some of you may be unaware that recent versions of automake can run in
            > > parallel.  That is, automake will run in parallel with a degree of (at most)
            > > $AUTOMAKE_JOBS.  This can speed up the execution time of autogen.pl quite
            > > a bit on some platforms.  On my cluster at cisco, here's a few quick timings
            > > of the entire autogen.pl process (of which, automake is the bottleneck):
            > >
            > > $AUTOMAKE_JOBS           Total wall time
            > > value                    of autogen.pl
            > > 8                        3:01.46
            > > 4                        2:55.57
            > > 2                        3:28.09
            > > 1                        4:38.44
            > >
            > > This is an older Xeon machine with 2 sockets, each with 2 cores.
            > >
            > > There's a nice performance jump from 1 to 2, and a smaller jump from 2 to
            > > 4.  4 and 8 are close enough to not matter.  YMMV.
            > >
            > > I just committed a heuristic to autogen.pl to setenv AUTOMAKE_JOBS if it
            > > is not already set (https://svn.open-mpi.org/trac/ompi/changeset/23788):
            > >
            > > - If lstopo is found in your $PATH, it runs it and counts how many PU's
            > > (processing units) you have.  It'll set AUTOMAKE_JOBS to that number, or a
            > > maximum of 4 (which is admittedly a further heuristic).
            > > - If lstopo is not found, it just sets AUTOMAKE_JOBS to 2.
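
As a rough illustration of the heuristic described in the two bullets above (not the actual autogen.pl code, which is Perl and is not shown in this thread), the decision could be sketched in Python as below. The function name, the `lstopo_output` parameter, and the way PUs are counted from lstopo's text output are all assumptions made for the sketch; the caller is expected to pass `None` when lstopo is not found in `$PATH`.

```python
import re
from typing import Mapping, Optional

MAX_JOBS = 4       # cap mentioned in the message
FALLBACK_JOBS = 2  # value used when lstopo is not found


def choose_automake_jobs(env: Mapping[str, str],
                         lstopo_output: Optional[str]) -> str:
    """Return the AUTOMAKE_JOBS value the described heuristic would pick."""
    # An existing setting is left alone; the heuristic only fills a gap.
    if "AUTOMAKE_JOBS" in env:
        return env["AUTOMAKE_JOBS"]
    # lstopo not available: fall back to 2.
    if lstopo_output is None:
        return str(FALLBACK_JOBS)
    # Assumption: each processing unit shows up as a "PU" token in the
    # text output. The real parsing in autogen.pl may differ.
    pus = len(re.findall(r"\bPU\b", lstopo_output))
    # Clamp to [1, MAX_JOBS]; the lower bound of 1 is an addition of this
    # sketch to cover output in which no PU is recognized.
    return str(min(max(pus, 1), MAX_JOBS))
```

Setting `AUTOMAKE_JOBS=1` in the environment before running autogen.pl is therefore enough to keep the run serial, which is the opt-out discussed elsewhere in this thread.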
            > >
            > > Enjoy.
            > >
            > > --
            > > Jeff Squyres
            > > jsquy...@cisco.com <mailto:jsquy...@cisco.com>
            > > For corporate legal information go to:
            > > http://www.cisco.com/web/about/doing_business/legal/cri/
            _______________________________________________
            devel mailing list
            de...@open-mpi.org
            http://www.open-mpi.org/mailman/listinfo.cgi/devel



--
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
