Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool
On 12/27/11 16:13, Ron McDowell wrote:
> Doug Barton wrote:
>> The story so far ...
>>
>> sysinstall was removed from HEAD in October. I (and others) objected on the basis that at this time there is no replacement for the post-install configuration role that sysinstall played. More sysinstall components were then removed. Then the old version of libdialog (which sysinstall used) was removed. Thus at this point it's not possible to easily restore sysinstall.
>>
>> So my question is, how much do you care? Is lack of that functionality in HEAD something that we care about?
>>
>> Doug
>
> We have around 90 web servers running 8.2p5 right now [and yes, I did update the lot on Christmas Eve but that's a different story] and they will not be upgraded to 9.0 until/unless the post-install functionality that was lost by the removal of sysinstall is reintegrated in some way.
>
> I also complained about it and was told, in effect, "too bad". Everyone who commented said sysinstall caused more problems than it solved, although I've been using it for any system changes I needed that it was capable of doing for as long back as I can remember, and my first FreeBSD box was v2.2.
>
> I think removing any functionality that was in a previous release without providing an equal-or-better alternative is a bad idea, and that needs to be considered more carefully in the future.
>
> So this is not just a +1 vote, it's a +90.

Sysinstall is in 9 and will not be removed from the 9 branch. The installer used on the release media has changed, but as far as I understand, there is nothing stopping you from running sysinstall from an installer shell or using it for post-installation configuration.

Doug is only referring to the head branch (which will eventually, in ~18-24 months, become the 10 branch), so you should be able to have the best of both worlds with 9, i.e. try bsdinstall, fall back to sysinstall when you find bugs or missing features (don't forget to lodge bug reports for problems you find so that bsdinstall can be improved).
On the topic of Doug's actual question, I see minimal sense in resurrecting sysinstall in head now. I would suggest it be done much closer to (say, 6 months before) the 10.0 release cycle, if no suitable post-installation configuration tool has materialised.

In the meantime, cajole everyone who pops up saying "I really want post installation configuration support" to get involved with writing a bsdinstaller-like script (I think it should be completely separate to bsdinstaller, but perhaps use the same backend shell script functions/infrastructure) to do the job.

Cheers,
Lawrence
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool
I think such a tool should /not/ be a port, since I expect it would include a package browser in it. I think it's something that could really help new users get used to FreeBSD without having to trawl through man pages right at the start.

--
Bruce Cran
Sent from my iPad

On 27 Dec 2011, at 05:05, Doug Barton do...@freebsd.org wrote:
> On 12/26/2011 20:29, Xin LI wrote:
>> On Mon, Dec 26, 2011 at 3:36 PM, Doug Barton do...@freebsd.org wrote:
>>> The story so far ...
>>>
>>> sysinstall was removed from HEAD in October. I (and others) objected on the basis that at this time there is no replacement for the post-install configuration role that sysinstall played. More sysinstall components were then removed. Then the old version of libdialog (which sysinstall used) was removed. Thus at this point it's not possible to easily restore sysinstall.
>>>
>>> So my question is, how much do you care? Is lack of that functionality in HEAD something that we care about?
>>
>> Perhaps make it a port instead? I personally don't use sysinstall for post-install tasks at all, but it won't hurt to have such functionality.
>
> You're not the first person to suggest that, but I don't see how it's actually responsive to the problem. This issue only affects HEAD, so a port would not be generally useful. It would also be an enormous amount of work to make it into a port. It would be much easier to revert the necessary changes to bring back the old libdialog and sysinstall itself.
>
> Doug
>
> --
> Breadth of IT experience, and depth of knowledge in the DNS.
> Yours for the right price. :) http://SupersetSolutions.com/
Re: [rfc] removing/conditionalising WERROR= in Makefiles
On 2011-12-26 10:10:40 (+), Alexander Best arun...@freebsd.org wrote:
> i grep'ed through src/sys and found several places where WERROR= was set in order to get rid of the default -Werror setting. i tried to remove those WERROR= overrides from any Makefile, where doing so did not break tinderbox.
>
> in those cases, where it couldn't be completely removed, i added conditions to only set WERROR= for the particular architecture or compiler, where tinderbox did not succeed without the WERROR=.

Wouldn't it be better to set WARNS=x rather than WERROR=? WERROR= says "this code has bugs, it breaks tinderbox" whereas WARNS=x says "this code has the following kind of bugs which break tinderbox".

Possibly wrapped in an architecture-test where appropriate.

 - Philip

--
Philip Paeps
Senior Reality Engineer
Ministry of Information
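For anyone wanting to repeat the survey Alexander describes, the search could be sketched roughly as below. This is only an illustration, not the command he actually ran: the function name find_werror_overrides and the decision to match only empty WERROR= assignments are assumptions made here.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): list Makefiles under a
# source tree that clear WERROR with an empty assignment, i.e. a line
# consisting only of "WERROR=" plus optional whitespace.
find_werror_overrides() {
    # $1 is the top of the tree to search, e.g. a checkout of src/sys
    find "$1" -type f -name 'Makefile*' \
        -exec grep -lE '^[[:space:]]*WERROR[[:space:]]*=[[:space:]]*$' {} + | sort
}
```

Assuming the sources live in /usr/src, it would be called as `find_werror_overrides /usr/src/sys`. Each file it prints still has to be checked by hand, since an override wrapped in a compiler condition matches just like an unconditional one.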
Re: [rfc] removing/conditionalising WERROR= in Makefiles
On Tue Dec 27 11, Philip Paeps wrote:
> On 2011-12-26 10:10:40 (+), Alexander Best arun...@freebsd.org wrote:
>> i grep'ed through src/sys and found several places where WERROR= was set in order to get rid of the default -Werror setting. i tried to remove those WERROR= overrides from any Makefile, where doing so did not break tinderbox.
>>
>> in those cases, where it couldn't be completely removed, i added conditions to only set WERROR= for the particular architecture or compiler, where tinderbox did not succeed without the WERROR=.
>
> Wouldn't it be better to set WARNS=x rather than WERROR=? WERROR= says "this code has bugs, it breaks tinderbox" whereas WARNS=x says "this code has the following kind of bugs which break tinderbox".
>
> Possibly wrapped in an architecture-test where appropriate.

well there are a few issues here:

1) Jan Beich informed me via a private email that enclosing WERROR in arch specific conditions is a bad idea. if the code gets ported to a new architecture WERROR doesn't get set and so for every new architecture one has to evaluate again, whether WERROR needs to be set or not. so i'm inclined to agree with dim@ that we should not add architecture specific conditions -- however i think adding compiler specific conditions is a good idea.

2) the problem with setting WARNS= or specific -Wno-X or -Wno-error=X is that especially GCC doesn't have specific -WX flags for certain warnings. some warnings are implied by -Wall and cannot be turned off separately. so in order to get rid of these warnings (which are being handled as errors), we would need to disable -Wall. and i think setting WERROR= in order to handle all warnings for specific code as warnings rather than as errors is the better solution.

i've reworked the patch to only remove WERROR=, where it is not needed anymore for any supported arch, or where it can be wrapped in a compiler condition.

cheers.
alex

>  - Philip
>
> --
> Philip Paeps
> Senior Reality Engineer
> Ministry of Information

Index: sys/modules/xfs/Makefile
===================================================================
--- sys/modules/xfs/Makefile (revision 228911)
+++ sys/modules/xfs/Makefile (working copy)
@@ -6,8 +6,6 @@
 KMOD= xfs
-WERROR=
-
 SRCS = vnode_if.h \
  xfs_alloc.c \
  xfs_alloc_btree.c \
Index: sys/modules/sound/driver/maestro/Makefile
===================================================================
--- sys/modules/sound/driver/maestro/Makefile (revision 228911)
+++ sys/modules/sound/driver/maestro/Makefile (working copy)
@@ -5,6 +5,5 @@
 KMOD= snd_maestro
 SRCS= device_if.h bus_if.h pci_if.h
 SRCS+= maestro.c
-WERROR=
 .include <bsd.kmod.mk>
Index: sys/modules/aic7xxx/ahd/Makefile
===================================================================
--- sys/modules/aic7xxx/ahd/Makefile (revision 228911)
+++ sys/modules/aic7xxx/ahd/Makefile (working copy)
@@ -4,7 +4,6 @@
 .PATH: ${.CURDIR}/../../../dev/aic7xxx
 KMOD= ahd
-WERROR=
 GENSRCS= aic79xx_seq.h aic79xx_reg.h
 REG_PRINT_OPT= AHD_REG_PRETTY_PRINT=1
Index: sys/modules/agp/Makefile
===================================================================
--- sys/modules/agp/Makefile (revision 228911)
+++ sys/modules/agp/Makefile (working copy)
@@ -20,7 +20,6 @@
 SRCS+= device_if.h bus_if.h agp_if.h pci_if.h
 SRCS+= opt_agp.h opt_bus.h
 MFILES=kern/device_if.m kern/bus_if.m dev/agp/agp_if.m dev/pci/pci_if.m
-WERROR=
 EXPORT_SYMS= agp_find_device \
  agp_state \
Index: sys/modules/bios/smapi/Makefile
===================================================================
--- sys/modules/bios/smapi/Makefile (revision 228911)
+++ sys/modules/bios/smapi/Makefile (working copy)
@@ -6,7 +6,6 @@
 KMOD= smapi
 SRCS= smapi.c smapi_bios.S \
  bus_if.h device_if.h
-WERROR=
 .if ${CC:T:Mclang} == clang
 # XXX: clang integrated-as doesn't grok 16-bit assembly yet
 CFLAGS+= ${.IMPSRC:T:Msmapi_bios.S:C/^.+$/-no-integrated-as/}
Index: sys/modules/nve/Makefile
===================================================================
--- sys/modules/nve/Makefile (revision 228911)
+++ sys/modules/nve/Makefile (working copy)
@@ -7,7 +7,9 @@
  device_if.h bus_if.h pci_if.h miibus_if.h \
  os+%DIKED-nve.h
 OBJS+= nvenetlib.o
+.if ${CC:T:Mclang} == clang
 WERROR=
+.endif
 CLEANFILES+= nvenetlib.o os+%DIKED-nve.h
 nvenetlib.o: ${.CURDIR}/../../contrib/dev/nve/${MACHINE}/${.TARGET}.bz2.uu
Re: [rfc] removing/conditionalising WERROR= in Makefiles
On Tue, Dec 27, 2011 at 11:27:43AM +, Alexander Best wrote:
> well there are a few issues here:
>
> 1) Jan Beich informed me via a private email that enclosing WERROR in arch specific conditions is a bad idea. if the code gets ported to a new architecture WERROR doesn't get set and so for every new architecture one has to evaluate again, whether WERROR needs to be set or not. so i'm inclined to agree with dim@ that we should not add architecture specific conditions -- however i think adding compiler specific conditions is a good idea.
>
> 2) the problem with setting WARNS= or specific -Wno-X or -Wno-error=X is that especially GCC doesn't have specific -WX flags for certain warnings. some warnings are implied by -Wall and cannot be turned off separately. so in order to get rid of these warnings (which are being handled as errors), we would need to disable -Wall. and i think setting WERROR= in order to handle all warnings for specific code as warnings rather than as errors is the better solution.
>
> i've reworked the patch to only remove WERROR=, where it is not needed anymore for any supported arch, or where it can be wrapped in a compiler condition.

It seems to me that the removal of unnecessary WERROR= needed no discussion since day one, so why don't you go ahead and commit it.

I don't understand the comment on issue #1 above. There is a minuscule number (six, before your patch?) of Makefiles with WERROR= . If you make the assignment architecture-specific, the worst that can happen is that the variable is not cleared, and if the build breaks, all you need is to add the extra architecture in these few places.

cheers
luigi
Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool
Lawrence Stewart wrote:
> Sysinstall is in 9 and will not be removed from the 9 branch. The installer used on the release media has changed, but as far as I understand, there is nothing stopping you from running sysinstall from an installer shell or using it for post-installation configuration.

You're right. I stand corrected and am happy to see I'll be able to upgrade to 9.0 after -RELEASE.

> Doug is only referring to the head branch (which will eventually, in ~18-24 months, become the 10 branch), so you should be able to have the best of both worlds with 9, i.e. try bsdinstall, fall back to sysinstall when you find bugs or missing features (don't forget to lodge bug reports for problems you find so that bsdinstall can be improved).
>
> On the topic of Doug's actual question, I see minimal sense in resurrecting sysinstall in head now. I would suggest it be done much closer to (say, 6 months before) the 10.0 release cycle, if no suitable post-installation configuration tool has materialised.
>
> In the meantime, cajole everyone who pops up saying "I really want post installation configuration support" to get involved with writing a bsdinstaller-like script (I think it should be completely separate to bsdinstaller, but perhaps use the same backend shell script functions/infrastructure) to do the job.

I guess this is a good time for me to quit bitching, get off my butt, and contribute something back to a project I've been using daily for almost 20 years. Having done similar sysadm development work [way back] on Tandy Xenix, SCO Xenix/Unix, and Dell SVR4 Unix, this is an area where I actually might know enough to be useful. To that end, the first task I'm assigning myself is to poke around in bsdinstall/libdialog and see how they work.

As a related question, is there a good primer somewhere about how to use SVN? I'm using csup at present.

--
Ron McDowell
San Antonio TX
Re: re(4) driver dropping packets when reading NFS files
On Mon, Dec 26, 2011 at 10:55:50PM -0500, Rick Macklem wrote:
> Way back in Nov 2010, this thread was related to a problem I had, where an re(4) { 810xE PCIe 10/100baseTX, according to the driver } interface dropped received packets, resulting in a significant impact on NFS performance.
>
> Well, it turns out that a recent (post r224506) commit seems to have fixed the problem. It hasn't dropped any packets since I upgraded to a kernel with a r228281 version of if_re.c.
>
> So, good news. Thanks to those maintaining this driver, rick
> ps: If you have a need to know which commit fixed this, I can probably test variants to find out. Otherwise, I'm just happy that it's fixed. :-)

Glad to know the issue was fixed. The change made in r227593 or r227854 probably fixed it.
Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool
On Tue, Dec 27, 2011 at 11:50:54AM -0600, Ron McDowell wrote:
> As a related question, is there a good primer somewhere about how to use SVN? I'm using csup at present.

http://wiki.freebsd.org/SubversionPrimer

--
Steve
Re: dogfooding over in clusteradm land
On 14.12.11 14:20, Sean Bruno wrote:
> We're seeing what looks like a syncher/ufs resource starvation on 9.0 on the cvs2svn ports conversion box. I'm not sure what resource is tapped out. Effectively, I cannot access the directory under use and the converter application stalls out waiting for some resource that isn't clear. (Peter had posited kmem of some kind.)
>
> I've upped maxvnodes a bit on the host, turned off SUJ and mounted the f/s in question with async and noatime for performance reasons.
>
> Can someone hit me up with the cluebat? I can give you direct access to the box for debuginationing.

Just for the archives: this is fixed, or at least considerably improved, by r228838. The ports cvs2svn run went from panicking after about 22h to finishing after about 10h.

Thanks to Sean and Attilio for giving me access to test boxes.

Florian
Re: [rfc] removing/conditionalising WERROR= in Makefiles
On Tue Dec 27 11, Luigi Rizzo wrote:
> It seems to me that the removal of unnecessary WERROR= needed no discussion since day one, so why don't you go ahead and commit it.

anybody is free to commit this part, since i don't own a commit bit. ;)

> I don't understand the comment on issue #1 above. There is a minuscule number (six, before your patch?) of Makefiles with WERROR= . If you make the assignment architecture-specific, the worst that can happen is that the variable is not cleared, and if the build breaks, all you need is to add the extra architecture in these few places.

good point. basically the question with WERROR is: should it be a big hammer to disable turning warnings into errors for all archs, or do we want to set WERROR in a more specific manner, only where it's absolutely necessary.

cheers.
alex
Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool
On 12/27/2011 03:48, Lawrence Stewart wrote:
> On the topic of Doug's actual question, I see minimal sense in resurrecting sysinstall in head now. I would suggest it be done much closer to (say, 6 months before) the 10.0 release cycle, if no suitable post-installation configuration tool has materialised.

My concern about that approach is that 9.0 hasn't even been released yet and we've already seen changes that are going to make it hard to resurrect sysinstall if that's the decision we come to. Waiting another year or two would make it impossible.

Doug

--
You can observe a lot just by watching. -- Yogi Berra

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price. :) http://SupersetSolutions.com/
Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool
On Tue, 27 Dec 2011, Ron McDowell wrote:
> As a related question, is there a good primer somewhere about how to use SVN? I'm using csup at present.

- Install the subversion port
- Download the source. To get HEAD code:
    svn co svn://svn.freebsd.org/base/head/
  or to get 9-stable code:
    svn co svn://svn.freebsd.org/base/stable/9
  (If you want to check it out into a different directory, append the dir name, for example:
    svn co svn://svn.freebsd.org/base/head/ src)
- Make your changes :)
- To get a diff of your changes, you can just use
    svn diff

Gavin
Re: [rfc] removing/conditionalising WERROR= in Makefiles
On Dec 26, 2011, at 6:04 PM, Philip Paeps wrote:
> On 2011-12-26 10:10:40 (+), Alexander Best arun...@freebsd.org wrote:
>> i grep'ed through src/sys and found several places where WERROR= was set in order to get rid of the default -Werror setting. i tried to remove those WERROR= overrides from any Makefile, where doing so did not break tinderbox.
>>
>> in those cases, where it couldn't be completely removed, i added conditions to only set WERROR= for the particular architecture or compiler, where tinderbox did not succeed without the WERROR=.
>
> Wouldn't it be better to set WARNS=x rather than WERROR=? WERROR= says "this code has bugs, it breaks tinderbox" whereas WARNS=x says "this code has the following kind of bugs which break tinderbox".

Agreed...

> Possibly wrapped in an architecture-test where appropriate.

Not so much... When you make architecture-specific tests, experience has shown that we don't fix bugs and they languish for a long time. Many times, these warnings are real. Sadly, we've found no way yet to tag the ones that aren't real as safe to ignore...

Warner
SU+J systems do not fsck themselves
I've had multiple machines now (9.0-RC3, amd64, i386 and earlier 9-CURRENT on ppc) running SU+J that have had unexplained panics and crashes start happening relating to disk I/O. When I end up running a full fsck, it keeps turning out that the disk is dirty and corrupted, but no mechanism is in place with SU+J to detect and fix this. A bgfsck never happens, but a manual fsck in single-user does indeed fix the crashing and weird behavior. Others have tested their SU+J volumes and found them to have errors as well. This makes me super nervous.

Basically, the way SU+J seems to operate is this:

http://redundancy.redundancy.org/fscklog2

"Oh hey, I see you shut down uncleanly, let's check everything looks good, off you go, whee"

Until I actually go and fsck, when I get:

http://redundancy.redundancy.org/fscklog1

So, I understand that journalling doesn't replace the need for a potential fsck (though I never had this problem with gjournal), but without a way for the system to detect that a fsck is necessary, this seems pretty much a guaranteed recipe for data corruption, and seems to offer little to no benefit over plain SU+fsck, or even just mounting async.

So: is everyone else seeing this? Am I misunderstanding how SU+J should be used? How should the error resolution process really happen?

Thanks,
David
fetch reading one char at a time
I noted the following behaviour from fetch today. I am actually hunting another problem, so I'm just posting it here in case anyone recognises it and knows where to fix it...

       d
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
       s
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
       u
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
       c
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
       c
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
       e
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
       s
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
       s
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
       f
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
       u
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
       l
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
       .
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
       \r
 48885 fetch RET read 1
 48885 fetch CALL gettimeofday(0x7fffcda0,0)
 48885 fetch RET gettimeofday 0
 48885 fetch CALL read(0x3,0x7fffce0f,0x1)
 48885 fetch GIO fd 3 read 1 byte
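To get a feel for how much of a run is spent in this pattern, the trace could be summarised with something like the sketch below. It assumes the trace is kdump(1) text in the CALL/RET/GIO layout shown above; the function name count_single_byte_reads is made up for this illustration and is not part of fetch or kdump.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): count read(2) calls whose
# third argument (the requested length) is 0x1 in kdump-style text
# supplied on standard input.
count_single_byte_reads() {
    grep -cE 'CALL[[:space:]]+read\(0x[0-9a-f]+,0x[0-9a-f]+,0x1\)'
}
```

A saved trace would be fed in as `kdump | count_single_byte_reads`. A large count relative to the size of the response headers would support the impression that each byte costs a read plus a gettimeofday call.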
Re: SU+J systems do not fsck themselves
On Tue, Dec 27, 2011 at 1:53 PM, David Thiel l...@redundancy.redundancy.org wrote:
> I've had multiple machines now (9.0-RC3, amd64, i386 and earlier 9-CURRENT on ppc) running SU+J that have had unexplained panics and crashes start happening relating to disk I/O. When I end up running a full fsck, it keeps turning out that the disk is dirty and corrupted, but no mechanism is in place with SU+J to detect and fix this. A bgfsck never happens, but a manual fsck in single-user does indeed fix the crashing and weird behavior. Others have tested their SU+J volumes and found them to have errors as well. This makes me super nervous.
>
> Basically, the way SU+J seems to operate is this:
> http://redundancy.redundancy.org/fscklog2
> "Oh hey, I see you shut down uncleanly, let's check everything looks good, off you go, whee"
>
> Until I actually go and fsck, when I get:
> http://redundancy.redundancy.org/fscklog1
>
> So, I understand that journalling doesn't replace the need for a potential fsck (though I never had this problem with gjournal), but without a way for the system to detect that a fsck is necessary, this seems pretty much a guaranteed recipe for data corruption, and seems to offer little to no benefit over plain SU+fsck, or even just mounting async.
>
> So: is everyone else seeing this? Am I misunderstanding how SU+J should be used? How should the error resolution process really happen?

I'm not sure if your experiments are right here, the second log shows you're running it read-only, which is likely caused by running it on a live file system. What I would suggest doing is:

- Reset the system while it's running;
- Boot into single user mode;
- 'dd' the disk to an image;
- Boot the system normally and:
  - use mdconfig -a -t vnode -f on a copy of the image;
  - use journalled fsck;
  - use normal fsck to check if the journalled fsck did the right thing.

This would rule out possible after-mount introduced changes, etc.

I personally did not hit problems a few months ago but I didn't re-test recently.

Cheers,
--
Xin LI delp...@delphij.net https://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die
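The suggested test procedure can be written down as a dry-run checklist. The sketch below only builds and prints command lines; it executes nothing. The device node, image path and md device are hypothetical placeholders, the `dd` block size is an arbitrary choice, and the `fsck -p` step is taken from a tentative suggestion made later in this thread, so treat it as unconfirmed. The thread does not say how the comparison "normal fsck" should be invoked, so that step is left as a manual one.

```python
from typing import List, Optional, Tuple

# A step is a description plus an argv list, or None for a manual step.
Step = Tuple[str, Optional[List[str]]]


def build_plan(device: str, image: str, md_device: str) -> List[Step]:
    """Return the suggested steps in order; nothing is executed here."""
    return [
        ("reset the running system, then boot into single-user mode", None),
        # Hypothetical device and image paths; bs=1m is an assumed block size.
        ("copy the file system to an image",
         ["dd", f"if={device}", f"of={image}", "bs=1m"]),
        # mdconfig flags exactly as given in the message above.
        ("boot normally and attach a copy of the image",
         ["mdconfig", "-a", "-t", "vnode", "-f", image]),
        # Tentative: 'fsck -p' is only suggested with a 'perhaps' in a later reply.
        ("journalled fsck (invocation unconfirmed)",
         ["fsck", "-p", md_device]),
        # The thread gives no command line for this step.
        ("normal fsck to cross-check the journalled run", None),
    ]


if __name__ == "__main__":
    for description, argv in build_plan("/dev/ada0p2", "/var/tmp/fs.img", "/dev/md0"):
        print(description, "->", " ".join(argv) if argv else "(manual step)")
```

Working on an image rather than the live disk is the point of the exercise: both fsck runs then see the same on-disk state, with no changes introduced by mounting.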
Re: SU+J systems do not fsck themselves
On Tue, Dec 27, 2011 at 02:29:03PM -0800, Xin LI wrote:
> I'm not sure if your experiments are right here, the second log shows you're running it read-only, which is likely caused by running it on a live file system.

Yes, this most recent instance is me running it on a live FS, because I'm using that machine to type this right now. :) However, I've had the issues fixed in single-user on other systems and had the problems go away. At least for a bit.

> - use journalled fsck;
> - use normal fsck to check if the journalled fsck did the right thing.

When you say use journalled fsck, what's the proper way to initiate that? I don't see any journal-related options in the man page.
Re: SU+J systems do not fsck themselves
On 12/27/11 14:36, David Thiel wrote:
> On Tue, Dec 27, 2011 at 02:29:03PM -0800, Xin LI wrote:
> > I'm not sure if your experiments are right here, the second log shows you're running it read-only, which is likely caused by running it on a live file system.
>
> Yes, this most recent instance is me running it on a live FS, because I'm using that machine to type this right now. :) However, I've had the issues fixed in single-user on other systems and had the problems go away. At least for a bit.
>
> > - use journalled fsck;
> > - use normal fsck to check if the journalled fsck did the right thing.
>
> When you say use journalled fsck, what's the proper way to initiate that? I don't see any journal-related options in the man page.

fsck -p perhaps? IIRC fsck_ufs(8) would use the journal if it's available and up-to-date.

Cheers,
--
Xin LI delp...@delphij.net https://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die
Re: [rfc] removing/conditionalising WERROR= in Makefiles
On Tue Dec 27 11, Warner Losh wrote:
> On Dec 26, 2011, at 6:04 PM, Philip Paeps wrote:
> > On 2011-12-26 10:10:40 (+), Alexander Best arun...@freebsd.org wrote:
> > > i grep'ed through src/sys and found several places where WERROR= was set in order to get rid of the default -Werror setting. i tried to remove those WERROR= overrides from any Makefile, where doing so did not break tinderbox. in those cases, where it couldn't be completely removed, i added conditions to only set WERROR= for the particular architecture or compiler, where tinderbox did not succeed without the WERROR=.
> >
> > Wouldn't it be better to set WARNS=x rather than WERROR=? WERROR= says "this code has bugs, it breaks tinderbox" whereas WARNS=x says "this code has the following kind of bugs which break tinderbox".
>
> Agreed...

in this case it would have to be WARNS=1 then, because anything > 1 will enable -Wall, which is the warning that breaks sys/modules/ie.

cheers.
alex

> > Possibly wrapped in an architecture-test where appropriate.
>
> Not so much... When you make architecture-specific tests, experience has shown that we don't fix bugs and they languish for a long time. Many times, these warnings are real. Sadly, we've found no way to tag the ones that aren't real yet as safe to ignore...
>
> Warner
Re: [rfc] removing/conditionalising WERROR= in Makefiles
On 2011-12-27 02:04, Philip Paeps wrote:
> On 2011-12-26 10:10:40 (+), Alexander Best arun...@freebsd.org wrote:
> > i grep'ed through src/sys and found several places where WERROR= was set in order to get rid of the default -Werror setting. i tried to remove those WERROR= overrides from any Makefile, where doing so did not break tinderbox. in those cases, where it couldn't be completely removed, i added conditions to only set WERROR= for the particular architecture or compiler, where tinderbox did not succeed without the WERROR=.
>
> Wouldn't it be better to set WARNS=x rather than WERROR=? WERROR= says "this code has bugs, it breaks tinderbox" whereas WARNS=x says "this code has the following kind of bugs which break tinderbox".

In my opinion, WERROR= says: there are warnings in this code which cannot be fixed right now, for varying reasons, but we don't want to muffle them entirely, so somebody will eventually fix them in the future (or just delete the code, if it is unmaintained, or unmaintainable, like nve).

If you set WARNS to a low level, you can be sure nobody ever sees the warnings, and they will never be fixed. That may be appropriate in some cases, but not the ones I just added WERROR= to. Those are just crufty drivers that nobody wants to burn their fingers on. :)
Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool
On 12/28/11 06:29, Doug Barton wrote:
> On 12/27/2011 03:48, Lawrence Stewart wrote:
> > On the topic of Doug's actual question, I see minimal sense in resurrecting sysinstall in head now. I would suggest it be done much closer to (say, 6 months before) the 10.0 release cycle, if no suitable post-installation configuration tool has materialised.
>
> My concern about that approach is that 9.0 hasn't even been released yet and we've already seen changes that are going to make it hard to resurrect sysinstall if that's the decision we come to. Waiting another year or 2 would make it impossible.

Which changes are you referring to? I would have thought a reverse merge to undo the deletion of the sysinstall and old libdialog sources would be very minimal work. We'd also probably need a few extra build system changes to make sure old libdialog is perhaps statically compiled into sysinstall as it would be the only in-tree consumer, but that's not hard either. I may be lacking some imagination, but don't really see why it would become harder the longer we wait.

Cheers,
Lawrence
Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool
On 12/27/2011 18:32, Lawrence Stewart wrote:
> On 12/28/11 06:29, Doug Barton wrote:
> > On 12/27/2011 03:48, Lawrence Stewart wrote:
> > > On the topic of Doug's actual question, I see minimal sense in resurrecting sysinstall in head now. I would suggest it be done much closer to (say, 6 months before) the 10.0 release cycle, if no suitable post-installation configuration tool has materialised.
> >
> > My concern about that approach is that 9.0 hasn't even been released yet and we've already seen changes that are going to make it hard to resurrect sysinstall if that's the decision we come to. Waiting another year or 2 would make it impossible.
>
> Which changes are you referring to? I would have thought a reverse merge to undo the deletion of the sysinstall and old libdialog sources would be very minimal work.

Then I admire your mad skillz, because it sounds like a lot of work to me. :)

> We'd also probably need a few extra build system changes to make sure old libdialog is perhaps statically compiled into sysinstall as it would be the only in-tree consumer, but that's not hard either. I may be lacking some imagination, but don't really see why it would become harder the longer we wait.

My concern is that it's going to get worse as time goes along. Without sysinstall in the base people are going to feel free to make changes to things that sysinstall depends on (as they have already), and waiting a year or 2 to resurrect it will cause that problem to grow exponentially.

Doug

--
You can observe a lot just by watching. -- Yogi Berra

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price. :) http://SupersetSolutions.com/
Re: SU+J systems do not fsck themselves
On Tue, Dec 27, 2011 at 02:48:22PM -0800, Xin Li wrote:
> - use journalled fsck;
> - use normal fsck to check if the journalled fsck did the right thing.

Ok, here is the log of fsck with and without journal.
http://redundancy.redundancy.org/fscklog3

That was done the very next boot, after a clean shutdown. The errors from the previous live fsck aren't there (oddly), but there still are apparently some corrections made. The next fsck still complains, but doesn't give any salvage prompts. Here is jsa@'s, done on a live FS with SU+J:
http://redundancy.redundancy.org/fscklog4

I'm not actually looking to solve my particular problem per se. The issue is that almost everyone I've checked with that's running SU+J gets unref'd file and other errors when they check their filesystem (with the fs live). Unless I'm missing something, a running FS should never have those kinds of errors unless you deliberately disabled fsck. This leaves only a couple of options:

- SU+J and fsck do not work correctly together to fix corruption on boot, i.e. bgfsck isn't getting run when it should
- Stuff is getting completely screwed up after boot
- fsck is giving incorrect results
- I'm completely clueless about how SU+J is supposed to behave or be deployed

I'm pretty certain that the first is the issue here. It would be great if others could check their own SU+J filesystems so we could get a few more data points.
Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool
Hi,

Why not just list the things that sysinstall did that people like, and extract out / reimplement those bits? No one's going to complain if you write, say, a stand-alone package browser, or a stand-alone GUI upgrade tool, or a stand-alone configuration program, etc.

Adrian
Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool
On 12/27/2011 22:08, Adrian Chadd wrote:
> Hi,
>
> Why not just list the things that sysinstall did that people like, and extract out / reimplement those bits?

That sounds great. As soon as that's done, we can remove sysinstall from the base. Until those things exist, removing it is premature.

--
You can observe a lot just by watching. -- Yogi Berra

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price. :) http://SupersetSolutions.com/
Re: SU+J systems do not fsck themselves
On Dec 27, 2011, at 10:14 PM, David Thiel wrote:
> On Tue, Dec 27, 2011 at 02:48:22PM -0800, Xin Li wrote:
> > - use journalled fsck;
> > - use normal fsck to check if the journalled fsck did the right thing.
>
> Ok, here is the log of fsck with and without journal.
> http://redundancy.redundancy.org/fscklog3

The first run of fsck, using the journal, gives results that I would expect. The second run seems to imply that the fixes made on the first run didn't actually get written to disk. This is definitely an oddity. I see that you're using geli, maybe there's some strange side-effect there. No idea. Report as a bug, this is definitely undesired behavior.

> That was done the very next boot, after a clean shutdown. The errors from the previous live fsck aren't there (oddly), but there still are apparently some corrections made. The next fsck still complains, but doesn't give any salvage prompts. Here is jsa@'s, done on a live FS with SU+J:
> http://redundancy.redundancy.org/fscklog4

For the love of all that is good and holy, don't ever run fsck on a live filesystem. It's going to report these kinds of problems! It's normal; filesystem metadata updates stay cached in memory, and fsck bypasses that cache. Also, what you see in your log is a file that has been unlinked but held open. This is a common Unix idiom, and one that gets cleaned up by fsck on reboot, whether through the SUJ intent log processing or through a traditional fsck.

> I'm not actually looking to solve my particular problem per se. The issue is that almost everyone I've checked with that's running SU+J gets unref'd file and other errors when they check their filesystem (with the fs live). Unless I'm missing something, a running FS should never have those kinds of errors unless you deliberately disabled fsck.

Nope, you are completely incorrect here.

> This leaves only a couple options:
> - SU+J and fsck do not work correctly together to fix corruption on boot, i.e. bgfsck isn't getting run when it should

The point of SUJ is to eliminate the need for bgfsck. Effectively, they are exclusive ideas. It's possible that there are still problems with SUJ and how fsck processes and commits the journal entries. However, bgfsck has nothing to do with this, and I'd also like to know if your use of geli is complicating the problem.

> - Stuff is getting completely screwed up after boot

Possibly but unlikely

> - fsck is giving incorrect results

Very unlikely

> - I'm completely clueless about how SU+J is supposed to behave or be deployed

No comment =-)

> I'm pretty certain that the first is the issue here. It would be great if others could check their own SU+J filesystems so we could get a few more data points.

Indeed, more data is needed.

Scott
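The "unlinked but held open" idiom mentioned in this reply is easy to reproduce from userland. The following is a minimal sketch using only the Python standard library; the function name is made up for illustration. It creates a temporary file, removes its name, and shows that the data remains reachable through the open descriptor. On a local POSIX filesystem the link count reported for the descriptor is expected to drop to zero, which is the kind of unreferenced-but-allocated file a check of a live filesystem would flag.

```python
import os
import tempfile


def unlink_while_open():
    """Unlink a file while holding it open, then keep using the descriptor."""
    fd, path = tempfile.mkstemp()           # descriptor is opened read/write
    try:
        os.unlink(path)                      # remove the only directory entry
        name_exists = os.path.exists(path)   # the name is gone from the namespace
        os.write(fd, b"still here")          # the inode is still usable via fd
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 64)
        link_count = os.fstat(fd).st_nlink   # link count as seen by the kernel
    finally:
        os.close(fd)                         # last reference dropped; space is freed
    return name_exists, link_count, data


if __name__ == "__main__":
    print(unlink_while_open())
```

If the system crashes between the unlink and the close, the blocks stay allocated to an inode with no name, and something has to reclaim them at the next boot.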
Re: SU+J systems do not fsck themselves
On Tue, Dec 27, 2011 at 11:54:20PM -0700, Scott Long wrote:
> The first run of fsck, using the journal, gives results that I would expect. The second run seems to imply that the fixes made on the first run didn't actually get written to disk. This is definitely an oddity. I see that you're using geli, maybe there's some strange side-effect there. No idea. Report as a bug, this is definitely undesired behavior.

Not impossible, but I was seeing similar issues on two non-geli systems as well, i.e. tons of errors fixed when doing a single-user non-journalled fsck, but journalled fsck not fixing stuff. I'll try to replicate on a test machine, as I already lost data on the last (non-geli) machine this happened to.

> For the love of all that is good and holy, don't ever run fsck on a live filesystem. It's going to report these kinds of problems! It's normal; filesystem metadata updates stay cached in memory, and fsck bypasses that cache.

Ok. I expected fsck would be softupdate-aware in that way, but I understand it not doing so.

> > - SU+J and fsck do not work correctly together to fix corruption on boot, i.e. bgfsck isn't getting run when it should
>
> The point of SUJ is to eliminate the need for bgfsck. Effectively, they are exclusive ideas.

This is surprising to me. It is my impression that under Linux at least, ext3fs is checked against the journal, and gets a full e2fsck if it finds it's still dirty. Additionally, there's a periodic fsck after 180 days continuous runtime or x number of mounts (see tune2fs -i and -c). Is SU+J somehow implemented in such a way that this is unnecessary? What does it do that the ext3fs people have missed?
Re: SU+J systems do not fsck themselves
On Dec 28, 2011, at 12:34 AM, David Thiel wrote:
> On Tue, Dec 27, 2011 at 11:54:20PM -0700, Scott Long wrote:
> > The first run of fsck, using the journal, gives results that I would expect. The second run seems to imply that the fixes made on the first run didn't actually get written to disk. This is definitely an oddity. I see that you're using geli, maybe there's some strange side-effect there. No idea. Report as a bug, this is definitely undesired behavior.
>
> Not impossible, but I was seeing similar issues on two non-geli systems as well, i.e. tons of errors fixed when doing a single-user non-journalled fsck, but journalled fsck not fixing stuff. I'll try to replicate on a test machine, as I already lost data on the last (non-geli) machine this happened to.
>
> > For the love of all that is good and holy, don't ever run fsck on a live filesystem. It's going to report these kinds of problems! It's normal; filesystem metadata updates stay cached in memory, and fsck bypasses that cache.
>
> Ok. I expected fsck would be softupdate-aware in that way, but I understand it not doing so.
>
> > > - SU+J and fsck do not work correctly together to fix corruption on boot, i.e. bgfsck isn't getting run when it should
> >
> > The point of SUJ is to eliminate the need for bgfsck. Effectively, they are exclusive ideas.
>
> This is surprising to me. It is my impression that under Linux at least, ext3fs is checked against the journal, and gets a full e2fsck if it finds it's still dirty. Additionally, there's a periodic fsck after 180 days continuous runtime or x number of mounts (see tune2fs -i and -c). Is SU+J somehow implemented in such a way that this is unnecessary? What does it do that the ext3fs people have missed?

SUJ isn't like ext3 journaling, it doesn't do 100% metadata logging. Instead, it's an extension of softupdates. Softupdates (SU) is still responsible for ordering dependent writes to the disk to maintain consistency. What SU can't handle is the Unix/POSIX idiom of unlinking a file from the namespace but keeping its inode active through refcounts. When you have an unclean shutdown, you wind up with stale blocks allocated to orphaned inodes. The point of bgfsck was to scan the filesystem for these allocations and free them, just like fsck does, but to do it in the background so that the boot could continue. SUJ is basically just an intent log for this case; it tells fsck where to find these allocations so that fsck doesn't have to do the lengthy scan. FWIW, this problem is present in most any journaling implementation and is usually solved via the use of intent records in a journal, not unlike SUJ.

So, there's an assumption with SUJ+fsck that SU is keeping the filesystem consistent. Maybe that's a bad assumption, and I'm not trying to discredit your report. But the intention with SUJ is to eliminate the need for anything more than a cursory check of the superblocks and a processing of the SUJ intent log. If either of these fails then fsck reverts to a traditional scan. In the same vein, ext3 and most other traditional journaling filesystems assume that the journal is correct and is preserving consistency, and don't do anything more than a cursory data structure scan and journal replay as well, but then revert to a full scan if that fails (zfs seems to be an exception here, with there being no actual fsck available for it).

As for the 180 day forced scan on ext3, I have no public comment. SU has matured nicely over the last 10+ years, and I'm happy with the progress that SUJ has made in the last 2-3 years. If there are bugs, they need to be exposed and addressed ASAP.

Scott
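The recovery policy described in this message (cursory superblock check, then intent log processing, with a traditional scan as the fallback if either fails) can be sketched as a small decision function. This is a toy model of the control flow only, not fsck's actual code; the function and parameter names are hypothetical.

```python
from typing import Callable


def recover(superblock_ok: Callable[[], bool],
            replay_intent_log: Callable[[], bool],
            full_scan: Callable[[], None]) -> str:
    """Return which path was taken: 'journal' or 'full-scan'."""
    # Fast path: both the cursory check and the intent log replay must succeed.
    # The replay is not attempted at all if the superblock check fails.
    if superblock_ok() and replay_intent_log():
        return "journal"
    # Either step failed, so fall back to the lengthy traditional scan.
    full_scan()
    return "full-scan"


if __name__ == "__main__":
    print(recover(lambda: True, lambda: True, lambda: None))
    print(recover(lambda: True, lambda: False, lambda: None))
```

Note what the model leaves out, which is also the assumption the message calls out: the fast path trusts that soft updates kept everything else consistent, so damage outside what the intent log covers would go unnoticed unless one of the two checks happens to fail.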