Re: svn commit: r328013 - head/sbin/fsck_ffs
On Sat, Mar 10, 2018 at 05:01:40PM -0500, David Bright wrote:
> With regard to the fsck_ffs behavior being a regression because formerly the
> FS would be mounted successfully:
>
> That was not my experience. What I observed was that the “fsck -y” would give
> the “please re-run” message, exit with 0 status so the boot would continue,
> the subsequent mount would fail because the filesystem was not clean, and
> *then* the boot would stop and drop to single user.

I think my problem is specific to SU without journaling. The UFS code allows one to mount an unclean filesystem in that configuration since SU guarantees that on-disk metadata is consistent. A background fsck takes care of leaked inodes and data blocks. (FWIW, I'm not using journaling only because makefs(8) doesn't support the creation of SU+J filesystems.)

/dev/gpt/rootfs: FREE BLK COUNT(S) WRONG IN SUPERBLK (SALVAGED)
/dev/gpt/rootfs: SUMMARY INFORMATION BAD (SALVAGED)
/dev/gpt/rootfs: BLK(S) MISSING IN BIT MAPS (SALVAGED)
/dev/gpt/rootfs: 32664 files, 495447 used, 813272 free (176 frags, 203274 blocks, 0.0% fragmentation)

* PLEASE RERUN FSCK *
WARNING: /: reload pending error: blocks 192 files 3
Unknown error 16; help!
ERROR: ABORTING BOOT (sending SIGTERM to parent)!
Mar 10 12:47:50 init: /bin/sh on /etc/rc terminated abnormally, going to single user mode
Enter full pathname of shell or RETURN for /bin/sh:
# mount
/dev/gpt/rootfs on / (ufs, local, read-only)
devfs on /dev (devfs, local, multilabel)
# mount -u -o rw /
WARNING: / was not properly dismounted
# echo $?
0
# mount
/dev/gpt/rootfs on / (ufs, local, soft-updates)
devfs on /dev (devfs, local, multilabel)

___
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
Re: svn commit: r328013 - head/sbin/fsck_ffs
With regard to:

fsck_y_flags="-T ffs:-R -T ufs:-R" # Additional flags for fsck -y

I don’t know how, but I completely missed the -T option for fsck when I was investigating this issue. That would be very useful, although I wanted my solution to be applicable to file systems other than ffs/ufs.

With regard to the fsck_ffs behavior being a regression because formerly the FS would be mounted successfully:

That was not my experience. What I observed was that the “fsck -y” would give the “please re-run” message, exit with 0 status so the boot would continue, the subsequent mount would fail because the filesystem was not clean, and *then* the boot would stop and drop to single user.

-- 
David Bright
d...@freebsd.org
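For readers following along, the knob quoted above could sit in rc.conf roughly as sketched below. The fsck_y_flags value is the one quoted in this message and fsck_y_enable=YES is the setting mentioned elsewhere in the thread; whether these match the defaults on a given release is an assumption, so compare against /etc/defaults/rc.conf on the system in question.

```shell
# /etc/rc.conf fragment (a sketch; verify against /etc/defaults/rc.conf)

# Let rc.d/fsck fall back to "fsck -y" when the initial preen fails.
fsck_y_enable="YES"

# Extra flags for that fallback run: hand -R to the ffs/ufs checker
# so that fsck_ffs restarts itself when it decides another run is needed.
fsck_y_flags="-T ffs:-R -T ufs:-R"
```

The -T fstype:options form only affects the named file system types, so checkers for other file systems do not receive -R.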
Re: svn commit: r328013 - head/sbin/fsck_ffs
In message <201803101751.w2ahphph070...@pdx.rh.cn85.dnsmgr.net>, "Rodney W. Gri mes" writes: > > On Sat, Mar 10, 2018 at 10:26 AM, Ian Leporewrote: > > > > > On Sat, 2018-03-10 at 09:02 -0800, Rodney W. Grimes wrote: > > > > > > > > > > On Sat, 2018-03-10 at 08:44 -0800, Rodney W. Grimes wrote: > > > > > > > > > [...] > > > > > > > add "-T ffs:-R" to the initial fsck invocation in rc.d/fsck. > > > > > > Please do not do that, if fsck -p fails YOU may optionally > > > > > > wish to continue, or do retries, but please do not make this > > > > > > a hardcoded situation.??At most make it a controllable knob > > > > > > that defaults to the old behavior please. > > > > > > > > > > > > Thanks you, > > > > > This whole situation with fsck retries is just very strange. ?How > > > > > many other tools in the base system exhibit this behavior:? > > > > > > > > > > I didn't do everything you asked, even though I am completely > > > > > capable of doing so. ?If you'd like to actually do the thing > > > > > you asked for, please run this program again. > > > > > > > > > > If there is some reason why fsck should do less than a complete job > > > > > under some circumstances, isn't THAT the exceptional situation that > > > > > should need a special flag to make it happen? > > > > The job is "make sure my data is ok, keep my data at all costs, do > > > > not however do something that may damange my data". > > > > > > > > The job is NOT "do everything you can to bring the file system to > > > > a consistent state, even if you have to screw my data all up". > > > > > > > > > > I'm not sure why you think the -R flag is some sort of "ruin my data" > > > request. Maybe because all of this stuff is so scantily documented in > > > the manpage? > > > > > > -R Instruct fsck_ffs to restart itself if it encounters certain > > > errors that warrant another run. > > > > > > Who knows what "certain errors" means? 
> > > > > > > There are some classes of errors that fsck correct that it must recompute a > > large amount of state to make sure it is consistent. Rather than doing > > that, it exits with a message saying to re-run fsck to make sure that there > > aren't more errors that were hidden by the now-corrected errors from the > > past pass. > > > > > > > Looking at the code, it appears -R has no effect if you're in preen > > > mode. Hmmm, what's "preen mode" again? Don't bother looking in the > > > man page, you'll just find a bunch of mentions of the word preen that > > > say "see the -p flag" and then, surrealistically, when you look at the > > > -p flag it says "Preen file systems (see above)". Of course, what was > > > above was all the places that told you to see -p. > > > > > > > The man page could use some improvement. Preen mode means 'fix all the > > stupid inconsistencies that crop up that never result in data loss'. > > non-preen mode means to do that, and ask if you want to correct other > > errors that usually don't cause data loss, but might and some modicum of > > human intelligence is required to tell the two apart. Eg, I usually give up > > hitting 'y' after a dozen or so times in FSCK unless I have a specific > > reason to keep going. fsck -y has no such nuance. > > I do not believe that normal mode has any intellegnce to as if data > loss will or will not occur. It will gladly ask you if you want to > clear an inode that is the root of a rather large tree, and you end > up with either data loss, or a huge lost+found, sometimes even over > flowing the size of lost+found (though that may of been fixed in ufs2). > > It simply runs along and if it finds an error it asks if you want > to correct it or not. Y is not always the correct answer, but > most people are oblivious to what the questions imply with respect > to the file system, and hence answer Y. 
fsck does do thing in > a sequence that tries to make Y the correct answer, but as you > say human intelligence may do better. > > Some times if you had answered N at the right question you would not > of gotten all of the other 11 questions that lead you to giving up, > sometimes the N answer maybe 100's of Y's in, often to a clear > inode question. > > When I get a preen failure my usual next step is to run a logged > fsck -n to see what that says so I can evaluate the extent of fs > damage, especially if this is a critical file system containing > very valuable data. > > > Warner > > > > > > > So, I guess I'll just keep using fsck_y_enable=YES and relying on the > > > fact that by default that now includes the -R option. > > And if your running ufs2 with soft updates your in a > pretty safe place. I would not recommend doing this on ufs1 > or without soft updates enabled. > > One must try to remeber that fsck -p during /etc/rc processing can > run into many different file systems, some more resilent to running > things like fsck -R -y, some not. Having been in this situation with FreeBSD, Solaris, Linux,
Re: svn commit: r328013 - head/sbin/fsck_ffs
In message <1520702802.84937.126.ca...@freebsd.org>, Ian Lepore writes: > On Sat, 2018-03-10 at 09:02 -0800, Rodney W. Grimes wrote: > > > > > > On Sat, 2018-03-10 at 08:44 -0800, Rodney W. Grimes wrote: > > > > > [...] > > > > > add "-T ffs:-R" to the initial fsck invocation in rc.d/fsck. > > > > Please do not do that, if fsck -p fails YOU may optionally > > > > wish to continue, or do retries, but please do not make this > > > > a hardcoded situation.??At most make it a controllable knob > > > > that defaults to the old behavior please. > > > > > > > > Thanks you, > > > This whole situation with fsck retries is just very strange. ?How > > > many other tools in the base system exhibit this behavior:? > > > > > >     I didn't do everything you asked, even though I am completely > > >     capable of doing so. ?If you'd like to actually do the thing > > >   you asked for, please run this program again. > > > > > > If there is some reason why fsck should do less than a complete job > > > under some circumstances, isn't THAT the exceptional situation that > > > should need a special flag to make it happen? > > The job is "make sure my data is ok, keep my data at all costs, do > > not however do something that may damange my data". > > > > The job is NOT "do everything you can to bring the file system to > > a consistent state, even if you have to screw my data all up". > > > > I'm not sure why you think the -R flag is some sort of "ruin my data" > request.  Maybe because all of this stuff is so scantily documented in > the manpage? > > -R Instruct fsck_ffs to restart itself if it encounters certain  >  errors that warrant another run. > > Who knows what "certain errors" means?  > > Looking at the code, it appears -R has no effect if you're in preen > mode.  Hmmm, what's "preen mode" again?  
Don't bother looking in the
> man page, you'll just find a bunch of mentions of the word preen that
> say "see the -p flag" and then, surrealistically, when you look at the
> -p flag it says "Preen file systems (see above)". Of course, what was
> above was all the places that told you to see -p.
>
> So, I guess I'll just keep using fsck_y_enable=YES and relying on the
> fact that by default that now includes the -R option.

That's how I've set up my firewall/gateway. For it, I'm much more concerned with having it boot successfully than with data loss. The reason is that if I'm remote I want to be able to ssh back in, so I'm willing to take the risk.

Having said that, I maintain backup slices on an alternate disk in case the primary slice fails to boot. In that case data loss is tolerable in exchange for a better chance that I can ssh in remotely. (Of course there's no 100% guarantee if there's data loss, but it's better than the 0% I'd have if the gateway dropped into single-user mode from the get-go.)

With my other gear using UFS, I want a failing fsck to fall to single user, as I can get in using a console server to examine the damage and decide for myself.

Long story short, it depends.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: http://www.FreeBSD.org

The need of the many outweighs the greed of the few.
Re: svn commit: r328013 - head/sbin/fsck_ffs
> On Sat, Mar 10, 2018 at 10:26 AM, Ian Leporewrote: > > > On Sat, 2018-03-10 at 09:02 -0800, Rodney W. Grimes wrote: > > > > > > > > On Sat, 2018-03-10 at 08:44 -0800, Rodney W. Grimes wrote: > > > > > > > [...] > > > > > > add "-T ffs:-R" to the initial fsck invocation in rc.d/fsck. > > > > > Please do not do that, if fsck -p fails YOU may optionally > > > > > wish to continue, or do retries, but please do not make this > > > > > a hardcoded situation.??At most make it a controllable knob > > > > > that defaults to the old behavior please. > > > > > > > > > > Thanks you, > > > > This whole situation with fsck retries is just very strange. ?How > > > > many other tools in the base system exhibit this behavior:? > > > > > > > > I didn't do everything you asked, even though I am completely > > > > capable of doing so. ?If you'd like to actually do the thing > > > > you asked for, please run this program again. > > > > > > > > If there is some reason why fsck should do less than a complete job > > > > under some circumstances, isn't THAT the exceptional situation that > > > > should need a special flag to make it happen? > > > The job is "make sure my data is ok, keep my data at all costs, do > > > not however do something that may damange my data". > > > > > > The job is NOT "do everything you can to bring the file system to > > > a consistent state, even if you have to screw my data all up". > > > > > > > I'm not sure why you think the -R flag is some sort of "ruin my data" > > request. Maybe because all of this stuff is so scantily documented in > > the manpage? > > > > -R Instruct fsck_ffs to restart itself if it encounters certain > > errors that warrant another run. > > > > Who knows what "certain errors" means? > > > > There are some classes of errors that fsck correct that it must recompute a > large amount of state to make sure it is consistent. 
Rather than doing
> that, it exits with a message saying to re-run fsck to make sure that there
> aren't more errors that were hidden by the now-corrected errors from the
> past pass.
>
>
> > Looking at the code, it appears -R has no effect if you're in preen
> > mode. Hmmm, what's "preen mode" again? Don't bother looking in the
> > man page, you'll just find a bunch of mentions of the word preen that
> > say "see the -p flag" and then, surrealistically, when you look at the
> > -p flag it says "Preen file systems (see above)". Of course, what was
> > above was all the places that told you to see -p.
> >
>
> The man page could use some improvement. Preen mode means 'fix all the
> stupid inconsistencies that crop up that never result in data loss'.
> non-preen mode means to do that, and ask if you want to correct other
> errors that usually don't cause data loss, but might and some modicum of
> human intelligence is required to tell the two apart. Eg, I usually give up
> hitting 'y' after a dozen or so times in FSCK unless I have a specific
> reason to keep going. fsck -y has no such nuance.

I do not believe that normal mode has any intelligence to ask whether data loss will or will not occur. It will gladly ask you if you want to clear an inode that is the root of a rather large tree, and you end up with either data loss or a huge lost+found, sometimes even overflowing the size of lost+found (though that may have been fixed in ufs2).

It simply runs along, and if it finds an error it asks if you want to correct it or not. Y is not always the correct answer, but most people are oblivious to what the questions imply with respect to the file system, and hence answer Y. fsck does do things in a sequence that tries to make Y the correct answer, but as you say, human intelligence may do better.

Sometimes if you had answered N at the right question you would not have gotten all of the other 11 questions that led you to give up; sometimes the N answer may be hundreds of Y's in, often to a clear-inode question.

When I get a preen failure my usual next step is to run a logged fsck -n to see what that says so I can evaluate the extent of fs damage, especially if this is a critical file system containing very valuable data.

> Warner
>
>
> > So, I guess I'll just keep using fsck_y_enable=YES and relying on the
> > fact that by default that now includes the -R option.

And if you're running ufs2 with soft updates you're in a pretty safe place. I would not recommend doing this on ufs1 or without soft updates enabled.

One must try to remember that fsck -p during /etc/rc processing can run into many different file systems, some more resilient to running things like fsck -R -y, some not.

-- 
Rod Grimes
rgri...@freebsd.org
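The "logged fsck -n" step described in this message could be sketched as below. This is only an illustration: logged_check is a hypothetical helper, not something shipped with FreeBSD, and the log path in the example comment is an assumption (it must live on a file system that is writable at that point, which may not be true in single-user mode).

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): run a command, save
# everything it prints to a log file, and preserve its exit status.
logged_check() {
    log_file=$1
    shift
    # Capture stdout and stderr together in the log.
    "$@" >"$log_file" 2>&1
    check_status=$?
    # Show the captured report on the terminal as well.
    cat "$log_file"
    return "$check_status"
}

# Example (assumed log location; device name taken from the thread):
#   logged_check /var/tmp/fsck-n.log fsck -n /dev/gpt/rootfs
```

Because fsck -n answers "no" to every question, a run like this is read-only, and the saved log can be reviewed before deciding which repairs to allow.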
Re: svn commit: r328013 - head/sbin/fsck_ffs
On Sat, Mar 10, 2018 at 10:26 AM, Ian Leporewrote: > On Sat, 2018-03-10 at 09:02 -0800, Rodney W. Grimes wrote: > > > > > > On Sat, 2018-03-10 at 08:44 -0800, Rodney W. Grimes wrote: > > > > > [...] > > > > > add "-T ffs:-R" to the initial fsck invocation in rc.d/fsck. > > > > Please do not do that, if fsck -p fails YOU may optionally > > > > wish to continue, or do retries, but please do not make this > > > > a hardcoded situation.??At most make it a controllable knob > > > > that defaults to the old behavior please. > > > > > > > > Thanks you, > > > This whole situation with fsck retries is just very strange. ?How > > > many other tools in the base system exhibit this behavior:? > > > > > > I didn't do everything you asked, even though I am completely > > > capable of doing so. ?If you'd like to actually do the thing > > > you asked for, please run this program again. > > > > > > If there is some reason why fsck should do less than a complete job > > > under some circumstances, isn't THAT the exceptional situation that > > > should need a special flag to make it happen? > > The job is "make sure my data is ok, keep my data at all costs, do > > not however do something that may damange my data". > > > > The job is NOT "do everything you can to bring the file system to > > a consistent state, even if you have to screw my data all up". > > > > I'm not sure why you think the -R flag is some sort of "ruin my data" > request. Maybe because all of this stuff is so scantily documented in > the manpage? > > -R Instruct fsck_ffs to restart itself if it encounters certain > errors that warrant another run. > > Who knows what "certain errors" means? > There are some classes of errors that fsck correct that it must recompute a large amount of state to make sure it is consistent. Rather than doing that, it exits with a message saying to re-run fsck to make sure that there aren't more errors that were hidden by the now-corrected errors from the past pass. 
> Looking at the code, it appears -R has no effect if you're in preen
> mode. Hmmm, what's "preen mode" again? Don't bother looking in the
> man page, you'll just find a bunch of mentions of the word preen that
> say "see the -p flag" and then, surrealistically, when you look at the
> -p flag it says "Preen file systems (see above)". Of course, what was
> above was all the places that told you to see -p.
>

The man page could use some improvement. Preen mode means 'fix all the stupid inconsistencies that crop up that never result in data loss'. Non-preen mode means do that, and also ask whether you want to correct other errors that usually don't cause data loss but might, where some modicum of human intelligence is required to tell the two apart. E.g., I usually give up hitting 'y' after a dozen or so times in fsck unless I have a specific reason to keep going. fsck -y has no such nuance.

Warner

> So, I guess I'll just keep using fsck_y_enable=YES and relying on the
> fact that by default that now includes the -R option.
>
> -- Ian
Re: svn commit: r328013 - head/sbin/fsck_ffs
On Sat, 2018-03-10 at 09:02 -0800, Rodney W. Grimes wrote:
> >
> > On Sat, 2018-03-10 at 08:44 -0800, Rodney W. Grimes wrote:
> > > [...]
> > > > add "-T ffs:-R" to the initial fsck invocation in rc.d/fsck.
> > > Please do not do that, if fsck -p fails YOU may optionally
> > > wish to continue, or do retries, but please do not make this
> > > a hardcoded situation. At most make it a controllable knob
> > > that defaults to the old behavior please.
> > >
> > > Thanks you,
> > This whole situation with fsck retries is just very strange. How
> > many other tools in the base system exhibit this behavior:
> >
> >     I didn't do everything you asked, even though I am completely
> >     capable of doing so. If you'd like to actually do the thing
> >     you asked for, please run this program again.
> >
> > If there is some reason why fsck should do less than a complete job
> > under some circumstances, isn't THAT the exceptional situation that
> > should need a special flag to make it happen?
> The job is "make sure my data is ok, keep my data at all costs, do
> not however do something that may damange my data".
>
> The job is NOT "do everything you can to bring the file system to
> a consistent state, even if you have to screw my data all up".
>

I'm not sure why you think the -R flag is some sort of "ruin my data" request. Maybe because all of this stuff is so scantily documented in the manpage?

     -R      Instruct fsck_ffs to restart itself if it encounters certain
             errors that warrant another run.

Who knows what "certain errors" means?

Looking at the code, it appears -R has no effect if you're in preen mode. Hmmm, what's "preen mode" again? Don't bother looking in the man page, you'll just find a bunch of mentions of the word preen that say "see the -p flag" and then, surrealistically, when you look at the -p flag it says "Preen file systems (see above)". Of course, what was above was all the places that told you to see -p.

So, I guess I'll just keep using fsck_y_enable=YES and relying on the fact that by default that now includes the -R option.

-- Ian
Re: svn commit: r328013 - head/sbin/fsck_ffs
> On Sat, 2018-03-10 at 08:44 -0800, Rodney W. Grimes wrote: > > [ Charset UTF-8 unsupported, converting... ] > > > > > > On Fri, Mar 09, 2018 at 09:36:25PM -0500, David Bright wrote: > > > > > > > > On Mar 9, 2018, at 17:31, Ian Leporewrote: > > > > > > > > > > > > > > > On Fri, 2018-03-09 at 17:09 -0500, Mark Johnston wrote: > > > > > > > > > > > > > > > > > > etc/rc.d/fsck doesn't know how to interpret the new exit code and > > > > > > now > > > > > > just drops to a single-user shell when it is encountered. [?] > > > > > > > > > > > > Is there any reason etc/rc.d/fsck shouldn't automatically retry (up > > > > > > to > > > > This is, in fact, the reason that I made the change I did. I was trying > > > > to put in a retry loop to rc.d/fsck, but found that I couldn?t get it > > > > to work because fsck and fsck_ffs were not exiting with non-zero > > > > status. The drop to single user is not really due to the specific (new) > > > > error code of 16, it is due to the fact that fsck_ffs is now exiting > > > > with a non-zero status when it hasn?t completely cleaned the file > > > > system; > > > Sure, but that's a regression IMO: before, I believe we'd successfully > > > mount the FS even without retrying fsck, and continue booting. > > > > > > > > > > > /any/ non-zero status would cause the current rc.d/fsck script to go to > > > > single user. Prior to my change, fsck_ffs was exiting with a zero > > > > status even though it had not completely cleaned the filesystem and > > > > told the user to run it again. > > > > > > > > > > > > > > > > > > > fsck_ffs already has a -R flag to automatically retry, wouldn't that > > > > > be > > > > > a better mechanism for handling this new type of retry? > > > > That?s true; however, there is currently no way to pass that flag > > > > through the filesystem-agnostic fsck wrapper called from rc.d/fsck to > > > > the filesystem-specific fsck_ffs program that it calls. 
One could > > > > implement a similar flag on the fsck wrapper to be passed along to the > > > > filesystem-specific checker, but I think fsck_ffs is the only one that > > > > currently implements such a flag.? > > > As was pointed out by others, this isn't true. In my experience it's > > > fsck -p that is exiting with status 16. It thus seems like it would be > > > desirable to add "-T ffs:-R" to the initial fsck invocation in > > > rc.d/fsck. > > Please do not do that, if fsck -p fails YOU may optionally > > wish to continue, or do retries, but please do not make this > > a hardcoded situation.??At most make it a controllable knob > > that defaults to the old behavior please. > > > > Thanks you, > > This whole situation with fsck retries is just very strange. ?How many > other tools in the base system exhibit this behavior:? > > I didn't do everything you asked, even though I am completely > capable of doing so. ?If you'd like to actually do the thing you > asked for, please run this program again. > > If there is some reason why fsck should do less than a complete job > under some circumstances, isn't THAT the exceptional situation that > should need a special flag to make it happen? The job is "make sure my data is ok, keep my data at all costs, do not however do something that may damange my data". The job is NOT "do everything you can to bring the file system to a consistent state, even if you have to screw my data all up". -- Rod Grimes rgri...@freebsd.org ___ svn-src-head@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
Re: svn commit: r328013 - head/sbin/fsck_ffs
On Sat, 2018-03-10 at 08:44 -0800, Rodney W. Grimes wrote: > [ Charset UTF-8 unsupported, converting... ] > > > > On Fri, Mar 09, 2018 at 09:36:25PM -0500, David Bright wrote: > > > > > > On Mar 9, 2018, at 17:31, Ian Leporewrote: > > > > > > > > > > > > On Fri, 2018-03-09 at 17:09 -0500, Mark Johnston wrote: > > > > > > > > > > > > > > > etc/rc.d/fsck doesn't know how to interpret the new exit code and now > > > > > just drops to a single-user shell when it is encountered. [?] > > > > > > > > > > Is there any reason etc/rc.d/fsck shouldn't automatically retry (up to > > > This is, in fact, the reason that I made the change I did. I was trying > > > to put in a retry loop to rc.d/fsck, but found that I couldn?t get it to > > > work because fsck and fsck_ffs were not exiting with non-zero status. The > > > drop to single user is not really due to the specific (new) error code of > > > 16, it is due to the fact that fsck_ffs is now exiting with a non-zero > > > status when it hasn?t completely cleaned the file system; > > Sure, but that's a regression IMO: before, I believe we'd successfully > > mount the FS even without retrying fsck, and continue booting. > > > > > > > > /any/ non-zero status would cause the current rc.d/fsck script to go to > > > single user. Prior to my change, fsck_ffs was exiting with a zero status > > > even though it had not completely cleaned the filesystem and told the > > > user to run it again. > > > > > > > > > > > > > > > fsck_ffs already has a -R flag to automatically retry, wouldn't that be > > > > a better mechanism for handling this new type of retry? > > > That?s true; however, there is currently no way to pass that flag through > > > the filesystem-agnostic fsck wrapper called from rc.d/fsck to the > > > filesystem-specific fsck_ffs program that it calls. 
One could implement a > > > similar flag on the fsck wrapper to be passed along to the > > > filesystem-specific checker, but I think fsck_ffs is the only one that > > > currently implements such a flag. > > As was pointed out by others, this isn't true. In my experience it's > > fsck -p that is exiting with status 16. It thus seems like it would be > > desirable to add "-T ffs:-R" to the initial fsck invocation in > > rc.d/fsck. > Please do not do that, if fsck -p fails YOU may optionally > wish to continue, or do retries, but please do not make this > a hardcoded situation. At most make it a controllable knob > that defaults to the old behavior please. > > Thanks you, This whole situation with fsck retries is just very strange. How many other tools in the base system exhibit this behavior: I didn't do everything you asked, even though I am completely capable of doing so. If you'd like to actually do the thing you asked for, please run this program again. If there is some reason why fsck should do less than a complete job under some circumstances, isn't THAT the exceptional situation that should need a special flag to make it happen? -- Ian ___ svn-src-head@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
Re: svn commit: r328013 - head/sbin/fsck_ffs
> On Fri, Mar 09, 2018 at 09:36:25PM -0500, David Bright wrote:
> > On Mar 9, 2018, at 17:31, Ian Lepore wrote:
> > >
> > > On Fri, 2018-03-09 at 17:09 -0500, Mark Johnston wrote:
> > >>
> > >> etc/rc.d/fsck doesn't know how to interpret the new exit code and now
> > >> just drops to a single-user shell when it is encountered. […]
> > >>
> > >> Is there any reason etc/rc.d/fsck shouldn't automatically retry (up to
> >
> > This is, in fact, the reason that I made the change I did. I was trying to
> > put in a retry loop to rc.d/fsck, but found that I couldn’t get it to work
> > because fsck and fsck_ffs were not exiting with non-zero status. The drop
> > to single user is not really due to the specific (new) error code of 16, it
> > is due to the fact that fsck_ffs is now exiting with a non-zero status when
> > it hasn’t completely cleaned the file system;
>
> Sure, but that's a regression IMO: before, I believe we'd successfully
> mount the FS even without retrying fsck, and continue booting.
>
> > /any/ non-zero status would cause the current rc.d/fsck script to go to
> > single user. Prior to my change, fsck_ffs was exiting with a zero status
> > even though it had not completely cleaned the filesystem and told the user
> > to run it again.
> >
> > >
> > > fsck_ffs already has a -R flag to automatically retry, wouldn't that be
> > > a better mechanism for handling this new type of retry?
> >
> > That’s true; however, there is currently no way to pass that flag through
> > the filesystem-agnostic fsck wrapper called from rc.d/fsck to the
> > filesystem-specific fsck_ffs program that it calls. One could implement a
> > similar flag on the fsck wrapper to be passed along to the
> > filesystem-specific checker, but I think fsck_ffs is the only one that
> > currently implements such a flag.
>
> As was pointed out by others, this isn't true. In my experience it's
> fsck -p that is exiting with status 16. It thus seems like it would be
> desirable to add "-T ffs:-R" to the initial fsck invocation in
> rc.d/fsck.

Please do not do that. If fsck -p fails, YOU may optionally wish to continue or do retries, but please do not make this a hardcoded situation. At most, make it a controllable knob that defaults to the old behavior, please.

Thank you,

-- 
Rod Grimes
rgri...@freebsd.org
Re: svn commit: r328013 - head/sbin/fsck_ffs
On Fri, Mar 09, 2018 at 09:36:25PM -0500, David Bright wrote:
> On Mar 9, 2018, at 17:31, Ian Lepore wrote:
> >
> > On Fri, 2018-03-09 at 17:09 -0500, Mark Johnston wrote:
> >>
> >> etc/rc.d/fsck doesn't know how to interpret the new exit code and now
> >> just drops to a single-user shell when it is encountered. […]
> >>
> >> Is there any reason etc/rc.d/fsck shouldn't automatically retry (up to
>
> This is, in fact, the reason that I made the change I did. I was trying to
> put in a retry loop to rc.d/fsck, but found that I couldn’t get it to work
> because fsck and fsck_ffs were not exiting with non-zero status. The drop to
> single user is not really due to the specific (new) error code of 16, it is
> due to the fact that fsck_ffs is now exiting with a non-zero status when it
> hasn’t completely cleaned the file system;

Sure, but that's a regression IMO: before, I believe we'd successfully mount the FS even without retrying fsck, and continue booting.

> /any/ non-zero status would cause the current rc.d/fsck script to go to
> single user. Prior to my change, fsck_ffs was exiting with a zero status even
> though it had not completely cleaned the filesystem and told the user to run
> it again.
>
> >
> > fsck_ffs already has a -R flag to automatically retry, wouldn't that be
> > a better mechanism for handling this new type of retry?
>
> That’s true; however, there is currently no way to pass that flag through the
> filesystem-agnostic fsck wrapper called from rc.d/fsck to the
> filesystem-specific fsck_ffs program that it calls. One could implement a
> similar flag on the fsck wrapper to be passed along to the
> filesystem-specific checker, but I think fsck_ffs is the only one that
> currently implements such a flag.

As was pointed out by others, this isn't true. In my experience it's fsck -p that is exiting with status 16. It thus seems like it would be desirable to add "-T ffs:-R" to the initial fsck invocation in rc.d/fsck.
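The suggestion to pass "-T ffs:-R" to the initial preen could be sketched roughly as below. This is only an illustration, not the actual rc.d/fsck code: preen_cmdline is a hypothetical helper, its first argument stands in for the result of "checkyesno background_fsck", and whether -R changes anything in preen mode is not confirmed in this thread.

```shell
# Hypothetical helper (not part of etc/rc.d/fsck): build the command line
# for the initial preen, appending extra flags when any are given.
preen_cmdline()
{
	_bg=$1		# YES or NO; stands in for "checkyesno background_fsck"
	_extra=$2	# extra flags, e.g. "-T ffs:-R"; may be empty

	if [ "${_bg}" = "YES" ]; then
		echo "fsck -F -p${_extra:+ ${_extra}}"
	else
		echo "fsck -p${_extra:+ ${_extra}}"
	fi
}

# Print the foreground preen command with the suggested flags.
preen_cmdline NO "-T ffs:-R"
```

Building the string separately keeps the background and foreground cases in one place, so any extra flags would apply to both the first attempt and the retry after exit status 3.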
Re: svn commit: r328013 - head/sbin/fsck_ffs
On Fri, 2018-03-09 at 21:36 -0500, David Bright wrote:
> On Mar 9, 2018, at 17:31, Ian Lepore wrote:
> >
> >
> > On Fri, 2018-03-09 at 17:09 -0500, Mark Johnston wrote:
> > >
> > >
> > > etc/rc.d/fsck doesn't know how to interpret the new exit code and
> > > now
> > > just drops to a single-user shell when it is encountered. […]
> > >
> > > Is there any reason etc/rc.d/fsck shouldn't automatically retry
> > > (up to
> This is, in fact, the reason that I made the change I did. I was
> trying to put in a retry loop to rc.d/fsck, but found that I couldn’t
> get it to work because fsck and fsck_ffs were not exiting with
> non-zero status. The drop to single user is not really due to the
> specific (new) error code of 16, it is due to the fact that fsck_ffs
> is now exiting with a non-zero status when it hasn’t completely
> cleaned the file system; /any/ non-zero status would cause the
> current rc.d/fsck script to go to single user. Prior to my change,
> fsck_ffs was exiting with a zero status even though it had not
> completely cleaned the filesystem and told the user to run it again.
>
> >
> >
> > fsck_ffs already has a -R flag to automatically retry, wouldn't
> > that be
> > a better mechanism for handling this new type of retry?
> That’s true; however, there is currently no way to pass that flag
> through the filesystem-agnostic fsck wrapper called from rc.d/fsck to
> the filesystem-specific fsck_ffs program that it calls. One could
> implement a similar flag on the fsck wrapper to be passed along to
> the filesystem-specific checker, but I think fsck_ffs is the only one
> that currently implements such a flag.
>
>

fsck -T ffs:-R

-- Ian
Re: svn commit: r328013 - head/sbin/fsck_ffs
On 0309T2136, David Bright wrote:
> On Mar 9, 2018, at 17:31, Ian Lepore wrote:
> >
> > On Fri, 2018-03-09 at 17:09 -0500, Mark Johnston wrote:
> >>
> >> etc/rc.d/fsck doesn't know how to interpret the new exit code and now
> >> just drops to a single-user shell when it is encountered. […]
> >>
> >> Is there any reason etc/rc.d/fsck shouldn't automatically retry (up to
>
> This is, in fact, the reason that I made the change I did. I was trying to
> put in a retry loop to rc.d/fsck, but found that I couldn’t get it to work
> because fsck and fsck_ffs were not exiting with non-zero status. The drop to
> single user is not really due to the specific (new) error code of 16, it is
> due to the fact that fsck_ffs is now exiting with a non-zero status when it
> hasn’t completely cleaned the file system; /any/ non-zero status would cause
> the current rc.d/fsck script to go to single user. Prior to my change,
> fsck_ffs was exiting with a zero status even though it had not completely
> cleaned the filesystem and told the user to run it again.
>
> >
> > fsck_ffs already has a -R flag to automatically retry, wouldn't that be
> > a better mechanism for handling this new type of retry?
>
> That’s true; however, there is currently no way to pass that flag through the
> filesystem-agnostic fsck wrapper called from rc.d/fsck to the
> filesystem-specific fsck_ffs program that it calls. One could implement a
> similar flag on the fsck wrapper to be passed along to the
> filesystem-specific checker, but I think fsck_ffs is the only one that
> currently implements such a flag.

Sure there is. See /etc/defaults/rc.conf:

fsck_y_flags="-T ffs:-R -T ufs:-R" # Additional flags for fsck -y
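For illustration, a local override that enables the "fsck -y" fallback might look like the fragment below. It is a sketch that assumes the defaults quoted in this thread; going by the rc.d/fsck code shown elsewhere in the thread, fsck_y_flags is used only for the "fsck -y" pass that follows a failed preen, not for the initial "fsck -p".

```shell
# /etc/rc.conf (sketch): fall back to "fsck -y" when the initial preen fails.
fsck_y_enable="YES"
# Same value as the default in /etc/defaults/rc.conf; repeated here only to
# show where per-filesystem options such as -R would be adjusted.
fsck_y_flags="-T ffs:-R -T ufs:-R"
```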
Re: svn commit: r328013 - head/sbin/fsck_ffs
On 9 March 2018 at 18:36, David Bright wrote:
> On Mar 9, 2018, at 17:31, Ian Lepore wrote:
>>
>> On Fri, 2018-03-09 at 17:09 -0500, Mark Johnston wrote:
>>>
>>> etc/rc.d/fsck doesn't know how to interpret the new exit code and now
>>> just drops to a single-user shell when it is encountered. […]
>>>
>>> Is there any reason etc/rc.d/fsck shouldn't automatically retry (up to
>
> This is, in fact, the reason that I made the change I did. I was trying to
> put in a retry loop to rc.d/fsck, but found that I couldn’t get it to work
> because fsck and fsck_ffs were not exiting with non-zero status. The drop to
> single user is not really due to the specific (new) error code of 16, it is
> due to the fact that fsck_ffs is now exiting with a non-zero status when it
> hasn’t completely cleaned the file system; /any/ non-zero status would cause
> the current rc.d/fsck script to go to single user. Prior to my change,
> fsck_ffs was exiting with a zero status even though it had not completely
> cleaned the filesystem and told the user to run it again.
>
>>
>> fsck_ffs already has a -R flag to automatically retry, wouldn't that be
>> a better mechanism for handling this new type of retry?
>
> That’s true; however, there is currently no way to pass that flag through the
> filesystem-agnostic fsck wrapper called from rc.d/fsck to the
> filesystem-specific fsck_ffs program that it calls. One could implement a
> similar flag on the fsck wrapper to be passed along to the
> filesystem-specific checker, but I think fsck_ffs is the only one that
> currently implements such a flag.

Why does it need to be filesystem specific? Can't the retry happen in
the wrapper itself?

-- 
Eitan Adler
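A filesystem-agnostic retry of the kind asked about here could be sketched as follows. This is an illustration under assumptions, not code from fsck(8) or rc.d/fsck: retry_on_rerun is an invented name, and it assumes the checker signals "please rerun" with exit status 16, as discussed in this thread.

```shell
# Hypothetical sketch: rerun a command while it exits with status 16,
# giving up after a maximum number of attempts.
# Usage: retry_on_rerun MAX command [args...]
retry_on_rerun()
{
	_max=$1
	shift
	_try=1
	while :; do
		_err=0
		"$@" || _err=$?
		# Return on any status other than 16, or once attempts run out.
		if [ "${_err}" -ne 16 ] || [ "${_try}" -ge "${_max}" ]; then
			return "${_err}"
		fi
		_try=$((_try + 1))
	done
}
```

With this in place, a caller could write something like "retry_on_rerun 3 fsck -y" and still see the final exit status, so the existing handling of the other status codes would not need to change.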
Re: svn commit: r328013 - head/sbin/fsck_ffs
On Mar 9, 2018, at 17:31, Ian Lepore wrote:
>
> On Fri, 2018-03-09 at 17:09 -0500, Mark Johnston wrote:
>>
>> etc/rc.d/fsck doesn't know how to interpret the new exit code and now
>> just drops to a single-user shell when it is encountered. […]
>>
>> Is there any reason etc/rc.d/fsck shouldn't automatically retry (up to

This is, in fact, the reason that I made the change I did. I was trying to
put in a retry loop to rc.d/fsck, but found that I couldn’t get it to work
because fsck and fsck_ffs were not exiting with non-zero status. The drop to
single user is not really due to the specific (new) error code of 16, it is
due to the fact that fsck_ffs is now exiting with a non-zero status when it
hasn’t completely cleaned the file system; /any/ non-zero status would cause
the current rc.d/fsck script to go to single user. Prior to my change,
fsck_ffs was exiting with a zero status even though it had not completely
cleaned the filesystem and told the user to run it again.

>
> fsck_ffs already has a -R flag to automatically retry, wouldn't that be
> a better mechanism for handling this new type of retry?

That’s true; however, there is currently no way to pass that flag through the
filesystem-agnostic fsck wrapper called from rc.d/fsck to the
filesystem-specific fsck_ffs program that it calls. One could implement a
similar flag on the fsck wrapper to be passed along to the
filesystem-specific checker, but I think fsck_ffs is the only one that
currently implements such a flag.

-- 
David Bright
d...@freebsd.org
Re: svn commit: r328013 - head/sbin/fsck_ffs
On Fri, 2018-03-09 at 17:09 -0500, Mark Johnston wrote:
> On Mon, Jan 15, 2018 at 07:25:11PM +, David Bright wrote:
> >
> > Author: dab
> > Date: Mon Jan 15 19:25:11 2018
> > New Revision: 328013
> > URL: https://svnweb.freebsd.org/changeset/base/328013
> >
> > Log:
> >   Exit fsck_ffs with non-zero status when file system is not repaired.
> >
> > [...]
> etc/rc.d/fsck doesn't know how to interpret the new exit code and now
> just drops to a single-user shell when it is encountered. This is
> happening to me semi-regularly when my test systems crash, especially
> when I test kernel panic handling. :)
>
> Is there any reason etc/rc.d/fsck shouldn't automatically retry (up to
> some configurable number of retries) when the new error code is seen?
> The patch below seems to do the trick for me:
>

fsck_ffs already has a -R flag to automatically retry, wouldn't that be
a better mechanism for handling this new type of retry?

-- Ian
Re: svn commit: r328013 - head/sbin/fsck_ffs
On Mon, Jan 15, 2018 at 07:25:11PM +, David Bright wrote:
> Author: dab
> Date: Mon Jan 15 19:25:11 2018
> New Revision: 328013
> URL: https://svnweb.freebsd.org/changeset/base/328013
>
> Log:
>   Exit fsck_ffs with non-zero status when file system is not repaired.
>
>   When the fsck_ffs program cannot fully repair a file system, it will
>   output the message PLEASE RERUN FSCK. However, it does not exit with a
>   non-zero status in this case (contradicting the man page claim that it
>   "exits with 0 on success, and >0 if an error occurs." The fsck
>   rc-script (when running "fsck -y") tests the status from fsck (which
>   passes along the exit status from fsck_ffs) and issues a "stop_boot"
>   if the status fails. However, this is not effective since fsck_ffs can
>   return zero even on (some) errors. Effectively, it is left to a later
>   step in the boot process when the file systems are mounted to detect
>   the still-unclean file system and stop the boot.
>
>   This change modifies fsck_ffs so that when it cannot fully repair the
>   file system and issues the PLEASE RERUN FSCK message it also exits
>   with a non-zero status.
>
>   While here, the fsck_ffs man page has also been updated to document
>   the failing exit status codes used by fsck_ffs. Previously, only exit
>   status 7 was documented. Some of these exit statuses are tested for in
>   the fsck rc-script, so they are clearly depended upon and deserve
>   documentation.

etc/rc.d/fsck doesn't know how to interpret the new exit code and now
just drops to a single-user shell when it is encountered. This is
happening to me semi-regularly when my test systems crash, especially
when I test kernel panic handling. :)

Is there any reason etc/rc.d/fsck shouldn't automatically retry (up to
some configurable number of retries) when the new error code is seen?
The patch below seems to do the trick for me:

diff --git a/etc/defaults/rc.conf b/etc/defaults/rc.conf
index 584e842bba2c..63d2fcc0be8d 100644
--- a/etc/defaults/rc.conf
+++ b/etc/defaults/rc.conf
@@ -95,6 +95,7 @@ root_rw_mount="YES" # Set to NO to inhibit remounting root read-write.
 root_hold_delay="30" # Time to wait for root mount hold release.
 fsck_y_enable="NO" # Set to YES to do fsck -y if the initial preen fails.
 fsck_y_flags="-T ffs:-R -T ufs:-R" # Additional flags for fsck -y
+fsck_retries="3" # Number of times to retry fsck before giving up.
 background_fsck="YES" # Attempt to run fsck in the background where possible.
 background_fsck_delay="60" # Time to wait (seconds) before starting the fsck.
 growfs_enable="NO" # Set to YES to attempt to grow the root filesystem on boot
diff --git a/etc/rc.d/fsck b/etc/rc.d/fsck
index bd3122a20110..708d92228e3d 100755
--- a/etc/rc.d/fsck
+++ b/etc/rc.d/fsck
@@ -14,8 +14,82 @@ desc="Run file system checks"
 start_cmd="fsck_start"
 stop_cmd=":"
 
+_fsck_run()
+{
+	local err
+
+	if checkyesno background_fsck; then
+		fsck -F -p
+	else
+		fsck -p
+	fi
+
+	err=$?
+	if [ ${err} -eq 3 ]; then
+		echo "Warning! Some of the devices might not be" \
+		    "available; retrying"
+		root_hold_wait
+		check_startmsgs && echo "Restarting file system checks:"
+		if checkyesno background_fsck; then
+			fsck -F -p
+		else
+			fsck -p
+		fi
+		err=$?
+	fi
+
+	case ${err} in
+	0)
+		;;
+	2)
+		stop_boot
+		;;
+	4)
+		echo "Rebooting..."
+		reboot
+		echo "Reboot failed; help!"
+		stop_boot
+		;;
+	8)
+		if checkyesno fsck_y_enable; then
+			echo "File system preen failed, trying fsck -y ${fsck_y_flags}"
+			fsck -y ${fsck_y_flags}
+			case $? in
+			0)
+				;;
+			*)
+				echo "Automatic file system check failed; help!"
+				stop_boot
+				;;
+			esac
+		else
+			echo "Automatic file system check failed; help!"
+			stop_boot
+		fi
+		;;
+	12)
+		echo "Boot interrupted."
+		stop_boot
+		;;
+	16)
+		echo "File system check retry requested."
+		;;
+	130)
+		stop_boot
+		;;
+	*)
+		echo "Unknown error ${err}; help!"
+		stop_boot
+		;;
+	esac
+
+	return $err
+}
+
 fsck_start()
 {
+	local err tries
+
 	if [ "$autoboot" = no ]; then
 		echo "Fast boot: skipping disk checks."
 	elif