Re: SU+J systems do not fsck themselves
On Wed, Dec 28, 2011 at 12:57:31AM -0700, Scott Long wrote:
> So, there's an assumption with SUJ+fsck that SU is keeping the
> filesystem consistent. Maybe that's a bad assumption, and I'm not
> trying to discredit your report. But the intention with SUJ is to
> eliminate the need for anything more than a cursory check of the
> superblocks and a processing of the SUJ intent log. If either of
> these fails then fsck reverts to a traditional scan.
>
> In the same vein, ext3 and most other traditional journaling
> filesystems assume that the journal is correct and is preserving
> consistency, and don't do anything more than a cursory data
> structure scan and journal replay as well, but then revert to a full
> scan if that fails (zfs seems to be an exception here, with there
> being no actual fsck available for it). As for the 180 day forced
> scan on ext3, I have no public comment.
>
> SU has matured nicely over the last 10+ years, and I'm happy with
> the progress that SUJ has made in the last 2-3 years. If there are
> bugs, they need to be exposed and addressed ASAP.

That clears things up somewhat - thank you for taking the time to
explain all that. I've got results from two other users (Cc'd) with a
fsck in single user mode using the journal and not using it. One has
geli, one does not, and both were with clean shutdown/boot (correct me
if I'm wrong, guys). Any thoughts?

= Machine 1, with journal: =

Script started on Thu Dec 29 11:26:29 2011
fsck /
** /dev/ada0.eli
USE JOURNAL? [yn] y
** SU+J Recovering /dev/ada0.eli
** Reading 33554432 byte journal from inode 4.
RECOVER? [yn] y
** Building recovery table.
** Resolving unreferenced inode list.
** Processing journal entries.
WRITE CHANGES? [yn] y
** 108 journal records in 49152 bytes for 7.03% utilization
** Freed 9 inodes (0 dirs) 0 blocks, and 1 frags.
* FILE SYSTEM MARKED CLEAN *
Script done on Thu Dec 29 11:26:39 2011

= Machine 1, without journal: =

Script started on Thu Dec 29 11:26:49 2011
fsck /
** /dev/ada0.eli
USE JOURNAL? [yn] n
** Skipping journal, falling through to full fsck
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
INCORRECT BLOCK COUNT I=251177 (8 should be 0)
CORRECT? [yn] y
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
220435 files, 3945055 used, 3666151 free (17503 frags, 456081 blocks, 0.2% fragmentation)
* FILE SYSTEM IS CLEAN *
* FILE SYSTEM WAS MODIFIED *
Script done on Thu Dec 29 11:27:08 2011

= Machine 2, with journal: =

** /dev/ada0s1a
USE JOURNAL? yes
** SU+J Recovering /dev/ada0s1a
** Reading 33554432 byte journal from inode 4.
RECOVER? yes
** Building recovery table.
** Resolving unreferenced inode list.
** Processing journal entries.
WRITE CHANGES? yes
** 131 journal records in 11776 bytes for 35.60% utilization
** Freed 0 inodes (0 dirs) 0 blocks, and 0 frags.
* FILE SYSTEM MARKED CLEAN *

= Machine 2, without journal: =

** /dev/ada0s1a
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? [yn]
SUMMARY INFORMATION BAD
SALVAGE? [yn]
BLK(S) MISSING IN BIT MAPS
SALVAGE? [yn]
670213 files, 19118534 used, 54535063 free (158431 frags, 6797079 blocks, 0.2% fragmentation)
* FILE SYSTEM MARKED CLEAN *
* FILE SYSTEM WAS MODIFIED *

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
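For anyone comparing their own logs against the ones above, the summary lines can be sanity-checked with a little arithmetic. The sketch below is only an illustration: the 32-byte journal record size and the 8 fragments per block are assumptions inferred from the numbers in these four logs, not values taken from the fsck source, and the function names are made up for the example.

```python
import re

# Both constants are assumptions inferred from the logs in this message.
JREC_SIZE = 32        # bytes per journal record (assumed)
FRAGS_PER_BLOCK = 8   # fragments per full block (assumed)


def journal_utilization(line, rec_size=JREC_SIZE):
    """Return (computed %, reported %) for an SU+J journal summary line."""
    m = re.search(
        r"(\d+) journal records in (\d+) bytes for ([\d.]+)% utilization", line
    )
    records, nbytes = int(m.group(1)), int(m.group(2))
    reported = float(m.group(3))
    computed = 100.0 * records * rec_size / nbytes
    return round(computed, 2), reported


def free_count_consistent(line, frags_per_block=FRAGS_PER_BLOCK):
    """Check that 'N free' equals loose frags plus whole free blocks."""
    m = re.search(r"(\d+) free \((\d+) frags, (\d+) blocks", line)
    free, frags, blocks = (int(g) for g in m.groups())
    return free == frags + blocks * frags_per_block


journal_lines = [
    "** 108 journal records in 49152 bytes for 7.03% utilization",
    "** 131 journal records in 11776 bytes for 35.60% utilization",
]
summary_lines = [
    "220435 files, 3945055 used, 3666151 free "
    "(17503 frags, 456081 blocks, 0.2% fragmentation)",
    "670213 files, 19118534 used, 54535063 free "
    "(158431 frags, 6797079 blocks, 0.2% fragmentation)",
]

for text in journal_lines:
    print(journal_utilization(text))
for text in summary_lines:
    print(free_count_consistent(text))
```

Under those assumptions the reported utilization figures and free counts on both machines are internally consistent, so the summaries themselves do not point at a counting problem. The discrepancies of interest are the corrections the full fsck makes after a journalled run.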
Re: SU+J systems do not fsck themselves
On Thu, Dec 29, 2011 at 03:02:14PM -0800, David Thiel wrote:
> = Machine 1, with journal: =
>
> Script started on Thu Dec 29 11:26:29 2011
> fsck /
> ** /dev/ada0.eli

Correction - machine 1 had an unclean shutdown. Will get additional
logs soon.
SU+J systems do not fsck themselves
I've had multiple machines now (9.0-RC3, amd64, i386 and earlier
9-CURRENT on ppc) running SU+J that have had unexplained panics and
crashes start happening relating to disk I/O. When I end up running a
full fsck, it keeps turning out that the disk is dirty and corrupted,
but no mechanism is in place with SU+J to detect and fix this. A
bgfsck never happens, but a manual fsck in single-user does indeed fix
the crashing and weird behavior. Others have tested their SU+J volumes
and found them to have errors as well. This makes me super nervous.

Basically, the way SU+J seems to operate is this:

http://redundancy.redundancy.org/fscklog2

Oh hey, I see you shut down uncleanly, let's check everything looks
good, off you go, whee

Until I actually go and fsck, when I get:

http://redundancy.redundancy.org/fscklog1

So, I understand that journalling doesn't replace the need for a
potential fsck (though I never had this problem with gjournal), but
without a way for the system to detect that a fsck is necessary, this
seems pretty much a guaranteed recipe for data corruption, and seems
to offer little to no benefit over plain SU+fsck, or even just
mounting async.

So: is everyone else seeing this? Am I misunderstanding how SU+J
should be used? How should the error resolution process really happen?

Thanks,
David
Re: SU+J systems do not fsck themselves
On Tue, Dec 27, 2011 at 02:29:03PM -0800, Xin LI wrote:
> I'm not sure if your experiments are right here, the second log
> shows you're running it read-only, which is likely caused by running
> it on live file system.

Yes, this most recent instance is me running it on a live FS, because
I'm using that machine to type this right now. :) However, I've had
the issues fixed in single-user on other systems and had the problems
go away. At least for a bit.

> - use journalled fsck;
> - use normal fsck to check if the journalled fsck did the right
>   thing.

When you say use journalled fsck, what's the proper way to initiate
that? I don't see any journal-related options in the man page.
Re: SU+J systems do not fsck themselves
On Tue, Dec 27, 2011 at 02:48:22PM -0800, Xin Li wrote:
> - use journalled fsck;
> - use normal fsck to check if the journalled fsck did the right
>   thing.

Ok, here is the log of fsck with and without journal.

http://redundancy.redundancy.org/fscklog3

That was done the very next boot, after a clean shutdown. The errors
from the previous live fsck aren't there (oddly), but there still are
apparently some corrections made. The next fsck still complains, but
doesn't give any salvage prompts.

Here is jsa@'s, done on a live FS with SU+J:

http://redundancy.redundancy.org/fscklog4

I'm not actually looking to solve my particular problem per se. The
issue is that almost everyone I've checked with that's running SU+J
gets unref'd file and other errors when they check their filesystem
(with the fs live). Unless I'm missing something, a running FS should
never have those kinds of errors unless you deliberately disabled
fsck. This leaves only a couple options:

- SU+J and fsck do not work correctly together to fix corruption on
  boot, i.e. bgfsck isn't getting run when it should
- Stuff is getting completely screwed up after boot
- fsck is giving incorrect results
- I'm completely clueless about how SU+J is supposed to behave or be
  deployed

I'm pretty certain that the first is the issue here. It would be great
if others could check their own SU+J filesystems so we could get a few
more data points.
Re: SU+J systems do not fsck themselves
On Tue, Dec 27, 2011 at 11:54:20PM -0700, Scott Long wrote:
> The first run of fsck, using the journal, gives results that I would
> expect. The second run seems to imply that the fixes made on the
> first run didn't actually get written to disk. This is definitely an
> oddity. I see that you're using geli, maybe there's some strange
> side-effect there. No idea. Report as a bug, this is definitely
> undesired behavior.

Not impossible, but I was seeing similar issues on two non-geli
systems as well, i.e. tons of errors fixed when doing a single-user
non-journalled fsck, but journalled fsck not fixing stuff. I'll try to
replicate on a test machine, as I already lost data on the last
(non-geli) machine this happened to.

> For the love that is all good and holy, don't ever run fsck on a
> live filesystem. It's going to report these kinds of problems! It's
> normal; filesystem metadata updates stay cached in memory, and fsck
> bypasses that cache.

Ok. I expected fsck would be softupdate-aware in that way, but I
understand it not doing so.

> > - SU+J and fsck do not work correctly together to fix corruption
> >   on boot, i.e. bgfsck isn't getting run when it should
>
> The point of SUJ is to eliminate the need for bgfsck. Effectively,
> they are exclusive ideas.

This is surprising to me. It is my impression that under Linux at
least, ext3fs is checked against the journal, and gets a full e2fsck
if it finds it's still dirty. Additionally, there's a periodic fsck
after 180 days continuous runtime or x number of mounts (see tune2fs
-i and -c). Is SU+J somehow implemented in such a way that this is
unnecessary? What does it do that the ext3fs people have missed?
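The ext3-style policy described in the message above (journal replay first, a full check if that is not enough, plus a forced full check after a mount count or a time interval) can be written down as a small decision function. This is a hedged sketch of the policy as worded in the thread only; it is not code from e2fsck or fsck_ffs, and every name in it (`choose_check`, `journal_ok`, and so on) is hypothetical. Treating a zero limit as "disabled" is an assumption modelled on how the tune2fs -c and -i settings are commonly documented.

```python
from datetime import datetime, timedelta


def choose_check(journal_ok, mounts_since_check, max_mounts,
                 last_full_check, interval, now):
    """Return "journal" or "full" for a boot-time check (illustrative only)."""
    # Fallback: if journal replay failed or left the filesystem dirty,
    # do a traditional full scan.
    if not journal_ok:
        return "full"
    # Periodic check by mount count (0 disables, by assumption).
    if max_mounts > 0 and mounts_since_check >= max_mounts:
        return "full"
    # Periodic check by elapsed time (zero interval disables, by assumption).
    if interval > timedelta(0) and now - last_full_check >= interval:
        return "full"
    # Otherwise trust the journal.
    return "journal"


now = datetime(2011, 12, 28)
print(choose_check(True, 3, 30, datetime(2011, 12, 1), timedelta(days=180), now))
print(choose_check(True, 3, 30, datetime(2011, 1, 1), timedelta(days=180), now))
print(choose_check(False, 0, 30, now, timedelta(days=180), now))
```

With both periodic limits disabled, the function reduces to the first branch alone, which is the behaviour attributed to SU+J in this thread: a full scan only when the journalled recovery itself fails.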
Re: TR : IPFilter
On Sun, Feb 09, 2003 at 07:42:42PM +0100, Coercitas Temet'Nosce wrote:
> Hello all,
>
> I was just wondering something regarding IPFilter and new FreeBSD
> 5.0
>
> First, I was looking for IPF related functions in new Kernel
> building, didn't found them anywhere. maybe I did something wrong
> but not likely. Is it now a non kernel related application ?

The kernel options have moved. Options that aren't platform specific
are in /usr/src/sys/conf/NOTES, and the IPFILTER options are there.

> Btw, I was looking for some docs on the FreeBSD website and didn't
> found anything interesting, only firewall that FreeBSD seems to
> support nowadays is the old IPFW, which is quite obsolete now imo.
> Why are documentation pages not dealing with IPF at all ? is there
> any reason ?

There's no real need for them. Just compile the kernel with the
appropriate options and there's plenty of docs on IPF that can tell
you the rest.

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message
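One way to find the relevant entries in /usr/src/sys/conf/NOTES is to scan it for `options` lines with a given prefix. The sketch below is an illustration only. The helper name `kernel_options` is hypothetical, and the sample text is written from memory to show the shape of such lines; it is not a verbatim excerpt of NOTES, so the option names should be checked against the real file.

```python
import re

# Illustrative sample only; NOT a verbatim excerpt of sys/conf/NOTES.
SAMPLE_NOTES = """\
options   IPFILTER                #ipfilter support
options   IPFILTER_LOG            #ipfilter logging
options   IPFILTER_DEFAULT_BLOCK  #block all packets by default
options   IPFIREWALL              #firewall
"""


def kernel_options(notes_text, prefix):
    """List option names in kernel-config text that start with prefix."""
    found = []
    for line in notes_text.splitlines():
        m = re.match(r"\s*options\s+(\w+)", line)
        if m and m.group(1).startswith(prefix):
            found.append(m.group(1))
    return found


print(kernel_options(SAMPLE_NOTES, "IPFILTER"))
```

Against a real source tree, the same function could be fed the file contents, for example `kernel_options(open("/usr/src/sys/conf/NOTES").read(), "IPFILTER")`, and the matching lines copied into the kernel configuration.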
new wi driver problems
A couple things regarding this new wireless driver - the wepkey option
to ifconfig no longer seems to work; I get a SIOCS80211: Invalid
argument. Secondly and more importantly, even when the wepkey is set
via wicontrol, I can't seem to get any connectivity at all anymore.

ifconfig

wi0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet6 fe80::202:2dff:fe0c:ec4b%wi0 prefixlen 64 scopeid 0x3
        inet 10.0.0.2 netmask 0xff00 broadcast 10.0.0.255
        ether 00:02:2d:0c:ec:4b
        media: IEEE 802.11 Wireless Ethernet autoselect (DS/2Mbps)
        status: associated
        ssid myssid 1:myssid
        stationname FreeBSD WaveLAN/IEEE node
        channel 7 authmode OPEN powersavemode OFF powersavesleep 100
        wepmode MIXED weptxkey 1
        wepkey 1:128-bit

dmesg:

wi0: WaveLAN/IEEE at port 0x100-0x13f irq 11 function 0 config 1 on pccard0
wi0: 802.11 address: 00:02:2d:0c:ec:4b
wi0: using Lucent Technologies, WaveLAN/IEEE
wi0: Lucent Firmware: Station (7.52.1)
wi0: supported rates: 1Mbps 2Mbps 5.5Mbps 11Mbps

uname:

FreeBSD sartre.redundancy.org 5.0-CURRENT FreeBSD 5.0-CURRENT #5: Fri Jan 17 12:15:30 PST 2003 root@:/usr/obj/user/src/sys/SARTRE i386

But I'm unable to ping my gateway, a -STABLE box with the same card. I
did recompile with device wlan, and tried the generic kernel as well.
Disabling WEP has no effect. Could someone give me a pointer as to how
to debug this?

Thanks,
David
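As a small aid to reading the ifconfig output above, the hexadecimal `flags=8843` value can be decoded back into flag names. The bit values below are the interface flag constants as I recall them from BSD's net/if.h, so treat the table as an assumption to verify against the header. `decode_flags` is a hypothetical helper and not part of ifconfig.

```python
# Interface flag bits, as recalled from BSD net/if.h (assumed; verify locally).
IFF_FLAGS = {
    0x0001: "UP",
    0x0002: "BROADCAST",
    0x0004: "DEBUG",
    0x0008: "LOOPBACK",
    0x0010: "POINTOPOINT",
    0x0020: "SMART",
    0x0040: "RUNNING",
    0x0080: "NOARP",
    0x0100: "PROMISC",
    0x0200: "ALLMULTI",
    0x0400: "OACTIVE",
    0x0800: "SIMPLEX",
    0x1000: "LINK0",
    0x2000: "LINK1",
    0x4000: "LINK2",
    0x8000: "MULTICAST",
}


def decode_flags(hex_text):
    """Turn the hex digits after 'flags=' into a list of flag names."""
    value = int(hex_text, 16)
    return [name for bit, name in sorted(IFF_FLAGS.items()) if value & bit]


print(decode_flags("8843"))
```

Under that table, 8843 decodes to exactly the five names listed in the output, so the interface is at least marked up and running. That is consistent with the problem sitting at the 802.11 or WEP layer and not with the interface being administratively down.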