Re: Buildworld /rescue failures in 5.1
At 12:12 PM -0700 7/24/03, Tim Kientzle wrote:
> Garance A Drosihn wrote:
> > So indeed, that 'make depend' had not finished before
> > the 'make' for the object had started.
>
> There's another possibility here: suppose two copies of
> make are running simultaneously and both get to this
> sequence at about the same time:
>
> tar_make:
>         (cd $(tar_SRCDIR) && \
>         $(MAKE) $(BUILDOPTS) $(tar_OPTS) depend &&\
>         $(MAKE) $(BUILDOPTS) $(tar_OPTS) $(tar_OBJS))
>
> The first make to run this will start building dependencies.
> The second copy will see that ".depend" already exists
> (note that bsd.dep.mk builds .depend incrementally) and
> then go on to the next step.

I am still not exactly sure what is going on here, but it looks like Gordon has committed a change which has solved the problem I kept running into. It's a little tricky to figure out exactly what is going on, since the problem is so dependent upon the exact timing of events.

However, I would note that in at least some of my testing, the .depend file did *not* exist -- not at all -- in the directory that it needed to be in.

Still, it does sound like a good idea to make the creation of .depend an atomic operation. I might prefer to use the 'mktemp' command instead of adding a PID. Something along the lines of:

    DEPENDTMP=`mktemp ${DEPENDFILE}.X`

--
Garance Alistair Drosehn
Senior Systems Programmer
Rensselaer Polytechnic Institute
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
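The mktemp idea can be sketched as a small shell function. This is a minimal sketch under stated assumptions, not the change that was committed: `write_depend_atomically` is a hypothetical name, `printf` stands in for the real mkdep invocation, and the six-X template is an assumption (the message uses a single X, which GNU mktemp would reject as too short).

```shell
#!/bin/sh
# Sketch: write dependency output to a unique temporary file, then rename it
# into place, so a reader sees either no .depend or a complete one.
# write_depend_atomically is a hypothetical helper, not part of bsd.dep.mk.
write_depend_atomically() {
    dependfile=$1
    shift
    # Assumption: six X's instead of the single X in the message above.
    tmp=$(mktemp "${dependfile}.XXXXXX") || return 1
    # "$@" stands in for the real dependency generator.
    if "$@" > "$tmp"; then
        # A rename within one directory replaces the target in a single step.
        mv -f "$tmp" "$dependfile"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Throwaway directory so the sketch touches nothing real.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/depend-demo.XXXXXX") || exit 1
write_depend_atomically "$workdir/.depend" printf 'addext.o: addext.c config.h\n'
cat "$workdir/.depend"
```

One thing to watch with this approach: mktemp creates the file with mode 0600, so the renamed .depend keeps that mode unless the rule changes it afterwards.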
Re: Buildworld /rescue failures in 5.1
Garance A Drosihn wrote:
> Wed Jul 23 20:08:06 EDT 2003 Starting make depend in /usr/obj/usr/src/rescue/rescue/usr/src/gnu/usr.bin/gzip
> Wed Jul 23 20:08:07 EDT 2003 Finished make depend in /usr/obj/usr/src/rescue/rescue/usr/src/gnu/usr.bin/gzip
> Wed Jul 23 20:08:09 EDT 2003 Starting make depend in /usr/obj/usr/src/rescue/rescue/usr/src/gnu/usr.bin/tar
>
> So indeed, that 'make depend' had not finished before
> the 'make' for the object had started.

There's another possibility here: suppose two copies of make are running simultaneously and both get to this sequence at about the same time:

tar_make:
        (cd $(tar_SRCDIR) && \
        $(MAKE) $(BUILDOPTS) $(tar_OPTS) depend &&\
        $(MAKE) $(BUILDOPTS) $(tar_OPTS) $(tar_OBJS))

The first make to run this will start building dependencies. The second copy will see that ".depend" already exists (note that bsd.dep.mk builds .depend incrementally) and then go on to the next step. Depending on the exact timing, this could result in an attempt to build the object files with an incomplete '.depend' file.

I wonder if something like the attached patch (which causes .depend to be created atomically) affects things?

Tim

Index: share/mk/bsd.dep.mk
===
RCS file: /usr/cvs/FreeBSD-CVS/src/share/mk/bsd.dep.mk,v
retrieving revision 1.41
diff -u -r1.41 bsd.dep.mk
--- share/mk/bsd.dep.mk 3 Jul 2003 11:43:57 - 1.41
+++ share/mk/bsd.dep.mk 24 Jul 2003 19:08:11 -
@@ -112,25 +112,29 @@
 # Different types of sources are compiled with slightly different flags.
 # Split up the sources, and filter out headers and non-applicable flags.
 
+PID != /bin/sh -c 'echo '
+
 ${DEPENDFILE}: ${SRCS}
-	rm -f ${DEPENDFILE}
+	rm -f ${DEPENDFILE} ${DEPENDFILE}.${PID}
 .if ${SRCS:M*.[cS]} != ""
-	${MKDEPCMD} -f ${DEPENDFILE} -a ${MKDEP} \
+	${MKDEPCMD} -f ${DEPENDFILE}.${PID} -a ${MKDEP} \
 	    ${CFLAGS:M-nostdinc*} ${CFLAGS:M-[BID]*} \
 	    ${.ALLSRC:M*.[cS]}
 .endif
 .if ${SRCS:M*.cc} != "" || ${SRCS:M*.C} != "" || ${SRCS:M*.cpp} != "" || \
     ${SRCS:M*.cxx} != ""
-	${MKDEPCMD} -f ${DEPENDFILE} -a ${MKDEP} \
+	${MKDEPCMD} -f ${DEPENDFILE}.${PID} -a ${MKDEP} \
 	    ${CXXFLAGS:M-nostdinc*} ${CXXFLAGS:M-[BID]*} \
 	    ${.ALLSRC:M*.cc} ${.ALLSRC:M*.C} ${.ALLSRC:M*.cpp} ${.ALLSRC:M*.cxx}
 .endif
 .if ${SRCS:M*.m} != ""
-	${MKDEPCMD} -f ${DEPENDFILE} -a ${MKDEP} \
+	${MKDEPCMD} -f ${DEPENDFILE}.${PID} -a ${MKDEP} \
 	    ${OBJCFLAGS:M-nostdinc*} ${OBJCFLAGS:M-[BID]*} \
 	    ${OBJCFLAGS:M-Wno-import*} \
 	    ${.ALLSRC:M*.m}
 .endif
+	mv ${DEPENDFILE}.${PID} ${DEPENDFILE}
+	rm -f ${DEPENDFILE}.${PID}
 .if target(_EXTRADEPEND)
 _EXTRADEPEND: .USE
 ${DEPENDFILE}: _EXTRADEPEND
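The difference the patch is aiming for can be walked through in plain shell. This is a deterministic sketch with no real concurrency: the "peek" between the two writes stands in for a second make looking at the directory at an unlucky moment. Using `$$` for the suffix is an assumption, since the argument to `echo` in the archived patch text is missing, and the two `printf` calls stand in for successive `mkdep -a` appends.

```shell
#!/bin/sh
# Walk-through: what an observer can see midway through writing .depend,
# with direct appends versus a private temporary file plus a final rename.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/depend-demo.XXXXXX") || exit 1
depend=$workdir/.depend
# Assumption: this shell's PID plays the role of ${PID} in the patch.
tmp=$depend.$$

# 1. Append straight to .depend, one group of sources at a time.
printf 'a.o: a.c\n' >> "$depend"
lines_seen_midway=$(grep -c '' "$depend")   # a reader looking now sees a partial file
printf 'b.o: b.c\n' >> "$depend"

# 2. Append to the temporary file and rename only when everything is written.
rm -f "$depend" "$tmp"
printf 'a.o: a.c\n' >> "$tmp"
if [ -e "$depend" ]; then visible_midway=yes; else visible_midway=no; fi
printf 'b.o: b.c\n' >> "$tmp"
mv "$tmp" "$depend"
final_lines=$(grep -c '' "$depend")

echo "direct append: $lines_seen_midway line(s) visible midway"
echo "temp + rename: .depend visible midway? $visible_midway; final lines: $final_lines"
```

With the rename, a second make finds either no .depend at all or the finished one, so it cannot skip ahead on the strength of a half-written file.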
Re: Buildworld /rescue failures in 5.1
At 12:44 AM -0700 7/24/03, Gordon Tetlow wrote:
> On Wed, Jul 23, 2003 at 10:13:20PM -0400, Garance A Drosihn wrote:
> >
> > I was going to do some debugging of what 'make' is doing,
> > but it looks like crunchgen gets confused if make has any
> > kind of debugging flags turned on.
>
> I just committed 1.14 of src/rescue/rescue/Makefile that should
> fix the -j build with rescue. Please let me know if it doesn't
> work. Otherwise, I'm heading to bed. Night.

This has worked for me. I did a cvsup, and then did the same sequence of steps which had consistently failed. This time the buildworld completed with no errors.

Nice job, thanks!

--
Garance Alistair Drosehn
Re: Buildworld /rescue failures in 5.1
On Wed, Jul 23, 2003 at 10:13:20PM -0400, Garance A Drosihn wrote:
> At 8:14 PM -0400 7/23/03, Garance A Drosihn wrote:
> >
> >So indeed, that 'make depend' had not finished before
> >the 'make' for the object had started.
>
> I was going to do some debugging of what 'make' is doing, but
> it looks like crunchgen gets confused if make has any kind of
> debugging flags turned on. However, I have to get back to my
> real-job work before my manager clobbers me, so this is
> probably as far as I'm going to take this for now.

I just committed 1.14 of src/rescue/rescue/Makefile that should fix the -j build with rescue. Please let me know if it doesn't work. Otherwise, I'm heading to bed. Night.

-gordon
Re: Buildworld /rescue failures in 5.1
At 8:14 PM -0400 7/23/03, Garance A Drosihn wrote:
> So indeed, that 'make depend' had not finished before
> the 'make' for the object had started.

I was going to do some debugging of what 'make' is doing, but it looks like crunchgen gets confused if make has any kind of debugging flags turned on. However, I have to get back to my real-job work before my manager clobbers me, so this is probably as far as I'm going to take this for now.

--
Garance Alistair Drosehn
Re: Buildworld /rescue failures in 5.1
At 4:44 PM -0700 7/23/03, Gordon Tetlow wrote:
> I don't see how this construct cannot be parallel make safe.
> The && requires that the third line check the result of the
> second before continuing. It doesn't make sense.

Oops, my last reply got away from me before I was done...

Anyway, I added some 'echo's to /usr/share/mk/bsd.dep.mk:

beforedepend:
        echo "`date` Starting make depend in `pwd`" >> /tmp/buildrescue-log

and

afterdepend:
        echo "`date` Finished make depend in `pwd`" >> /tmp/buildrescue-log

The make again failed with:

make: don't know how to make /usr/obj/usr/src/rescue/rescue//usr/src/gnu/usr.bin/tar/addext.o. Stop

And the last lines in the log were:

Wed Jul 23 20:08:06 EDT 2003 Starting make depend in /usr/obj/usr/src/rescue/rescue/usr/src/gnu/usr.bin/gzip
Wed Jul 23 20:08:07 EDT 2003 Finished make depend in /usr/obj/usr/src/rescue/rescue/usr/src/gnu/usr.bin/gzip
Wed Jul 23 20:08:09 EDT 2003 Starting make depend in /usr/obj/usr/src/rescue/rescue/usr/src/gnu/usr.bin/tar

So indeed, that 'make depend' had not finished before the 'make' for the object had started.

--
Garance Alistair Drosehn
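Reading such a log by eye works for three lines; for a longer one, the unmatched "Starting" entries can be picked out mechanically. The following is a sketch only: the log lines are the three quoted in the message, the temporary file stands in for /tmp/buildrescue-log, and the awk program assumes directory names contain no spaces.

```shell
#!/bin/sh
# Sketch: report directories whose 'make depend' was logged as started
# but never logged as finished.
log=$(mktemp "${TMPDIR:-/tmp}/buildrescue-log.XXXXXX") || exit 1
cat > "$log" <<'EOF'
Wed Jul 23 20:08:06 EDT 2003 Starting make depend in /usr/obj/usr/src/rescue/rescue/usr/src/gnu/usr.bin/gzip
Wed Jul 23 20:08:07 EDT 2003 Finished make depend in /usr/obj/usr/src/rescue/rescue/usr/src/gnu/usr.bin/gzip
Wed Jul 23 20:08:09 EDT 2003 Starting make depend in /usr/obj/usr/src/rescue/rescue/usr/src/gnu/usr.bin/tar
EOF

# The last field of each line is the directory; remember it on "Starting"
# and forget it again on "Finished". Whatever remains never finished.
unfinished=$(awk '
    / Starting make depend in / { pending[$NF] = 1 }
    / Finished make depend in / { delete pending[$NF] }
    END { for (dir in pending) print dir }
' "$log")

echo "$unfinished"
rm -f "$log"
```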
Re: Buildworld /rescue failures in 5.1
At 4:44 PM -0700 7/23/03, Gordon Tetlow wrote:
> On Wed, Jul 23, 2003, Garance A Drosihn wrote:
> >
> > The .depend file is apparently created by
> >    /usr/obj/usr/src/rescue/rescue/rescue.mk
> > and that in turn says it is generated from rescue.conf
> > by crunchgen 0.2. The rescue.mk file includes the rule:
> >
> > tar_make:
> >         (cd $(tar_SRCDIR) && \
> >         $(MAKE) $(BUILDOPTS) $(tar_OPTS) depend &&\
> >         $(MAKE) $(BUILDOPTS) $(tar_OPTS) $(tar_OBJS))
> >
> > and my guess is that construct is not '-j' safe.
> >
> > I have no idea how to fix that, or even if I'm on the right
> > track, but perhaps the above will be useful to someone who
> > understands parallel makes more than I do...
>
> I don't see how this construct cannot be parallel make safe.
> The && requires that the third line check the result of the
> second before continuing. It doesn't make sense.

Yeah, I don't know how these pieces all come together (or don't come together, as the case may be). Nevertheless, it is true that make is apparently trying a 'make addext.o' before that .depend file exists. Perhaps this is a bug, or maybe I'm just barking up the wrong tree...

I'm going to try a few more tests, and see if I can make some sense out of this. Given that 'make buildworld' is going to effectively do:

    cd /usr/src/rescue
    make obj
    [...other stuff...]
    cd /usr/src/rescue
    make includes
    [...other stuff...]
    cd /usr/src/rescue
    make depend
    [...other stuff...]

it would be nice if *that* 'make depend' could result in all of these .depend files being created. That is clearly not the case at the moment.

--
Garance Alistair Drosehn
Re: Buildworld /rescue failures in 5.1
On Wed, Jul 23, 2003 at 07:41:18PM -0400, Garance A Drosihn wrote:
>
> So it is easy to imagine that this .depend file is crucial to
> successfully making addext.o.
>
> The .depend file is apparently created by
>    /usr/obj/usr/src/rescue/rescue/rescue.mk
>
> and that in turn says it is generated from rescue.conf
> by crunchgen 0.2. The rescue.mk file includes the rule:
>
> tar_make:
>         (cd $(tar_SRCDIR) && \
>         $(MAKE) $(BUILDOPTS) $(tar_OPTS) depend &&\
>         $(MAKE) $(BUILDOPTS) $(tar_OPTS) $(tar_OBJS))
>
> and my guess is that construct is not '-j' safe.
>
> I have no idea how to fix that, or even if I'm on the right
> track, but perhaps the above will be useful to someone who
> understands parallel makes more than I do...

I don't see how this construct cannot be parallel make safe. The && requires that the third line check the result of the second before continuing. It doesn't make sense.

-gordon
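The point about && is easy to confirm in isolation. In this sketch the `step` and `run_chain` helpers are hypothetical stand-ins for the three commands in the tar_make rule; each step records that it ran and then returns a chosen exit status. It shows only that ordering holds inside one shell invocation, and says nothing about two make processes entering the same rule at the same time.

```shell
#!/bin/sh
# step NAME STATUS: append NAME to $order, then exit with STATUS.
step() {
    order="${order:+$order }$1"
    return "$2"
}

# run_chain CD_STATUS DEPEND_STATUS OBJS_STATUS
# Mirrors "cd dir && make depend && make objs": each step runs only if the
# one before it succeeded.
run_chain() {
    order=""
    step cd "$1" && step depend "$2" && step objs "$3"
    chain_status=$?
}

run_chain 0 0 0
ok_order=$order
echo "all succeed:  ran [$order], status $chain_status"

run_chain 0 1 0
echo "depend fails: ran [$order], status $chain_status"
```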
Re: Buildworld /rescue failures in 5.1
At 6:41 PM -0400 7/23/03, Garance A Drosihn wrote:
> Where that error is:
>
> make: don't know how to make /usr/obj/usr/src/rescue/rescue//usr/src/sbin/dhclient/client/clparse.o. Stop
> *** Error code 2
> 1 error
> *** Error code 2
> 1 error

Well, that isn't always the error message, but it's always something similar to that. Now that it takes me only a few minutes to test things, I think I'm making some kind of headway. I redid the making of rescue with '-j2', and got the error:

make: don't know how to make /usr/obj/usr/src/rescue/rescue//usr/src/gnu/usr.bin/tar/addext.o. Stop

I then compared the files in the directory /usr/obj/usr/src/.../usr.bin/tar between the attempt which worked and the attempt which failed. The attempt which worked had a '.depend' file which did not exist in the attempt which failed. I.e., make is trying to build addext.o before the .depend file has shown up for anything in 'tar'. In the attempt which works, that .depend file includes:

addext.o: /usr/src/contrib/tar/lib/addext.c \
  /usr/src/gnu/usr.bin/tar/config.h /usr/include/paths.h \
  /usr/include/sys/cdefs.h /usr/include/limits.h \
  /usr/include/sys/limits.h /usr/include/machine/_limits.h \
  /usr/include/sys/syslimits.h /usr/include/sys/types.h \
  /usr/include/machine/endian.h /usr/include/sys/_types.h \
  /usr/include/machine/_types.h /usr/include/sys/select.h \
  /usr/include/sys/_sigset.h /usr/include/sys/_timeval.h \
  /usr/include/sys/timespec.h /usr/include/string.h \
  /usr/include/strings.h /usr/include/unistd.h /usr/include/sys/unistd.h \
  /usr/include/errno.h /usr/src/contrib/tar/lib/backupfile.h \
  /usr/src/contrib/tar/lib/dirname.h

So it is easy to imagine that this .depend file is crucial to successfully making addext.o.

The .depend file is apparently created by
   /usr/obj/usr/src/rescue/rescue/rescue.mk

and that in turn says it is generated from rescue.conf by crunchgen 0.2. The rescue.mk file includes the rule:

tar_make:
        (cd $(tar_SRCDIR) && \
        $(MAKE) $(BUILDOPTS) $(tar_OPTS) depend &&\
        $(MAKE) $(BUILDOPTS) $(tar_OPTS) $(tar_OBJS))

and my guess is that construct is not '-j' safe.

I have no idea how to fix that, or even if I'm on the right track, but perhaps the above will be useful to someone who understands parallel makes more than I do...

--
Garance Alistair Drosehn
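The comparison between a working and a failing tree comes down to asking which object directories lack a .depend file. A minimal sketch of that check follows; `list_missing_depend` is a hypothetical helper, and the throwaway tree with `gzip` and `tar` subdirectories only imitates the layout described in the message.

```shell
#!/bin/sh
# list_missing_depend ROOT: print the name of each immediate subdirectory
# of ROOT that contains no .depend file.
list_missing_depend() {
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue
        # $dir ends in "/", so "$dir.depend" names the file inside it.
        [ -e "$dir.depend" ] || basename "$dir"
    done
}

# Throwaway tree standing in for the per-program object directories.
objroot=$(mktemp -d "${TMPDIR:-/tmp}/rescue-obj.XXXXXX") || exit 1
mkdir "$objroot/gzip" "$objroot/tar"
: > "$objroot/gzip/.depend"     # gzip finished 'make depend'; tar did not

missing=$(list_missing_depend "$objroot")
echo "$missing"
```

Run against a real object tree right after a failed build, a check like this would show whether the failing program is always one without a .depend, as it was for tar here.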
Re: Buildworld /rescue failures in 5.1
I am not much of a makefile expert, but I have been trying various changes to see if I could fix the problem with building /rescue. On my system, a buildworld will always fail if I specify '-j'. It is time-consuming to try things, because it takes a while to do a whole buildworld.

Today it occurred to me that I could probably make things go much faster if I didn't do the whole buildworld. And indeed, it turns out that I get the same 'make' error if I use:

    rm -Rf /usr/obj/usr/src/*
    cd /usr/src/rescue
    make -j5 obj
    make -j5 includes
    make -j5 depend
    make -j5 all

Where that error is:

make: don't know how to make /usr/obj/usr/src/rescue/rescue//usr/src/sbin/dhclient/client/clparse.o. Stop
*** Error code 2
1 error
*** Error code 2
1 error

The nice thing about this is that it only takes three minutes to get to that error, instead of the seventy minutes that it takes when doing it as part of buildworld. And if I drop the '-j5', then the error does not come up.

This also suggests I could probably get away with:

    cd /usr/src
    make -j5 -DNO_RESCUE buildworld
    cd /usr/src/rescue
    make obj includes && make depend && make all

--
Garance Alistair Drosehn
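The short reproduction can be wrapped in a function so that the job count is the only thing that varies between runs. This is a sketch: `run`, `build_rescue` and the `DRY_RUN` variable are hypothetical names, and by default the commands are only printed, so nothing is built unless `DRY_RUN=0` is set. Cleaning the object tree and changing to /usr/src/rescue are left to the caller.

```shell
#!/bin/sh
# run CMD...: print the command when DRY_RUN is 1 (the default here),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_rescue JOBS: the four make steps from the message, in order,
# stopping at the first one that fails.
build_rescue() {
    for target in obj includes depend all; do
        run make "-j$1" "$target" || return
    done
}

plan=$(build_rescue 5)
echo "$plan"
```

On a real source tree, something like `cd /usr/src/rescue && DRY_RUN=0 build_rescue 5` would carry out the steps for real.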