Very interesting. I don't know if it's the same problem, but I noted an issue quite a while ago where "make -jN all/install" would fail when traversing opal. My workaround was a script that runs "make all" in opal, then goes back to "make -jN" for orte/ompi.
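The script itself isn't included in this message. A minimal sketch of what such a workaround might look like, assuming it is called from the top of the build tree and only the "all" target is needed; build_tree, MAKE, and JOBS are names made up for illustration:

```sh
#!/bin/sh
# Hypothetical sketch of the workaround; not the original script.
# MAKE and JOBS are illustrative knobs that can be overridden.
MAKE=${MAKE:-make}
JOBS=${JOBS:-4}

# Build opal serially, then orte and ompi in parallel.
# Must be called from the top of the build tree.
build_tree() {
    # Serial pass: sidesteps the parallel failure seen while traversing opal.
    (cd opal && $MAKE all) || return 1
    # The remaining layers are built with -jN as usual.
    for dir in orte ompi; do
        (cd "$dir" && $MAKE -j "$JOBS" all) || return 1
    done
}
```

Calling build_tree after configure would stand in for a plain "make -jN all". It trades some parallelism in opal for a build that does not depend on job ordering there.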
Perhaps the fix proposed below would take care of that problem too. Thanks, Ralf!

On 6/3/08 3:53 PM, "Ralf Wildenhues" <ralf.wildenh...@gmx.de> wrote:

> Hi Jeff,
>
> * Jeff Squyres wrote on Tue, Jun 03, 2008 at 11:11:32PM CEST:
>> ERROR: Command returned a non-zero exist status
>> make -j 4 distcheck
> [...]
>> Making install in etc
>> make[3]: Entering directory `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_build/opal/etc'
>> make[4]: Entering directory `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_build/opal/etc'
>> test -z "/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_inst/etc" || /bin/mkdir -p "/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_inst/etc"
>> /usr/bin/install -c -m 644 ../../../opal/etc/openmpi-mca-params.conf /home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_inst/etc/openmpi-mca-params.conf
>> /usr/bin/install: cannot create regular file `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_inst/etc/openmpi-mca-params.conf': No such file or directory
>> make[4]: *** [install-data-local] Error 1
>> make[4]: *** Waiting for unfinished jobs....
>> make[4]: *** Waiting for unfinished jobs....
>> make[4]: Leaving directory `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_build/opal/etc'
>> make[3]: *** [install-am] Error 2
>> make[3]: Leaving directory `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_build/opal/etc'
>> make[2]: *** [install-recursive] Error 1
>
> Nice clue, thanks.
> This is a bug in opal/etc/Makefile.am:
>
> --- quote opal/etc/Makefile.am ---
> # This has to be here, even though it's empty, so that AM thinks that
> # "something" will happen here (details fuzzy, but we remember that this
> # *needs* to be here -- you have been warned).
>
> sysconf_DATA =
>
> # Steal a little trickery from a generated Makefile to only install
> # files if they do not already exist at the target.
>
> install-data-local:
> 	@ p="$(opal_config_files)"; \
> 	for file in $$p; do \
> 	  if test -f $(DESTDIR)$(sysconfdir)/$$file; then \
> 	    echo "******************************* WARNING ************************************"; \
> 	    echo "*** Not installing new $$file over existing file in:"; \
> 	    echo "*** $(DESTDIR)$(sysconfdir)/$$file"; \
> 	    echo "******************************* WARNING ************************************"; \
> 	  else \
> 	    if test -f "$$file"; then d=; else d="$(srcdir)/"; fi; \
> 	    f="`echo $$file | sed -e 's|^.*/||'`"; \
> 	    echo " $(INSTALL_DATA) $$d$$file $(DESTDIR)$(sysconfdir)/$$f"; \
> 	    $(INSTALL_DATA) $$d$$file $(DESTDIR)$(sysconfdir)/$$f; \
> 	  fi; \
> 	done
> --- snip ---
>
> To clarify the mysterious comment above, the "sysconf_DATA =" line
> causes automake to emit an undocumented target install-sysconfDATA which
> effectively runs something like
>   mkdir -p $(DESTDIR)$(sysconfdir)
>
> and then installs zero files there. The install-data-local rule is also
> updated as a dependency of 'install', just like install-sysconfDATA,
> however there exists no dependency relation between the two. Which
> means that with parallel make, they can be run concurrently, which I
> assume is what happened in your case; although the log shows them in the
> right order, it can still happen that mkdir wasn't done with its work
> before install-data-local accessed the directory.
>
> An easy fix is to use install-data-hook instead, which is documented to
> run after the normal install rules; or to generate the directory in the
> install-data-local rule itself, and drop the sysconf_DATA line.
>
> Proposed, untested patch below.
>
> I have not checked whether there are more instances of this in OMPI.
>
> Cheers,
> Ralf
>
>     Fix race condition in 'make install': let install-data-local
>     create $(sysconfdir), rather than an automake-generated rule
>     which may be run in parallel (with make -j).
>
> Index: opal/etc/Makefile.am
> ===================================================================
> --- opal/etc/Makefile.am (revision 17766)
> +++ opal/etc/Makefile.am (working copy)
> @@ -23,16 +23,11 @@
>
>  EXTRA_DIST = $(opal_config_files)
>
> -# This has to be here, even though it's empty, so that AM thinks that
> -# "something" will happen here (details fuzzy, but we remember that this
> -# *needs* to be here -- you have been warned).
> -
> -sysconf_DATA =
> -
>  # Steal a little trickery from a generated Makefile to only install
>  # files if they do not already exist at the target.
>
>  install-data-local:
> +	$(mkdir_p) $(DESTDIR)$(sysconfdir)
>  	@ p="$(opal_config_files)"; \
>  	for file in $$p; do \
>  	  if test -f $(DESTDIR)$(sysconfdir)/$$file; then \
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
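
The ordering problem Ralf describes can be reproduced outside of make. The following is only a rough shell sketch under stated assumptions: cp stands in for $(INSTALL_DATA), a delayed background mkdir stands in for the automake-generated rule finishing late, and the function names are invented for illustration.

```sh
#!/bin/sh
# Hypothetical illustration of the race; none of these names come from OMPI.

# install_without_mkdir SRC DIR: like the original install-data-local,
# this relies on some other job having created DIR already.
install_without_mkdir() {
    cp "$1" "$2/$(basename "$1")" 2>/dev/null
}

# install_with_mkdir SRC DIR: like the patched rule, this creates DIR
# itself, so it does not depend on any other job's timing.
install_with_mkdir() {
    mkdir -p "$2" && cp "$1" "$2/$(basename "$1")"
}

demo() {
    work=$(mktemp -d) || return 1
    src="$work/openmpi-mca-params.conf"
    printf '# placeholder\n' > "$src"

    # A delayed mkdir stands in for the directory-creating rule losing the race.
    ( sleep 1; mkdir -p "$work/racy/etc" ) &
    if install_without_mkdir "$src" "$work/racy/etc"; then
        echo "unordered install: ok"
    else
        echo "unordered install: failed"
    fi
    wait

    if install_with_mkdir "$src" "$work/fixed/etc"; then
        echo "self-contained install: ok"
    else
        echo "self-contained install: failed"
    fi
    rm -rf "$work"
}
```

Running demo should normally show the unordered copy failing and the self-contained one succeeding, although the first result depends on timing, which is exactly the point. Creating the directory inside the same rule removes the dependence on what other jobs happen to have finished.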