On Tue, 19 Jul 2005, Theodore Ts'o wrote:
On Tue, Jul 19, 2005 at 02:46:42AM -0400, Ariel wrote:
Package: e2fsprogs
Version: 1.37-2
Severity: important
This was running on a system with a ro / fs changed via remount to rw. (So /etc/mtab was not real, since nothing rewrote it afterward.) I don't know if this is relevant - since I did the exact same thing in the same session to another fs, and that worked fine, created a .journal file.
This was the root cause. /etc/mtab has to be real, because tune2fs needs to know whether or not the device is mounted. Actually, the library will check /proc/mounts first, so presumably you didn't have /proc mounted either. If you don't have /proc mounted, but you are mounting disks and then trying to use tune2fs to add journals, you're just asking for trouble.
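As a rough illustration of the kind of lookup being discussed (this is not the e2fsprogs code), a table in the /proc/mounts format can be scanned for a device name. The helper name `is_listed_as_mounted` and the sample table contents are assumptions made up for this sketch.

```python
import os
import tempfile

def is_listed_as_mounted(device, table_path="/proc/mounts"):
    """Return True if `device` appears as a source in a mounts-style table.

    Hypothetical helper for illustration only. Names are compared
    literally, so aliases (symlinks, a loop device backed by an image
    file, and so on) are not recognised.
    """
    with open(table_path) as table:
        for line in table:
            fields = line.split()
            # Line format: device mountpoint fstype options dump pass
            if fields and fields[0] == device:
                return True
    return False

# Demo against a throwaway table so the example does not depend on the host.
with tempfile.NamedTemporaryFile("w", suffix=".mounts", delete=False) as f:
    f.write("/dev/md0 / ext3 rw 0 0\n/dev/md1 /boot ext2 rw 0 0\n")
    sample = f.name

print(is_listed_as_mounted("/dev/md1", sample))  # True
print(is_listed_as_mounted("/dev/md2", sample))  # False
os.unlink(sample)
```

A check like this is only as good as the table it reads, which is why a stale /etc/mtab or a missing /proc matters in this discussion.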
I did have /proc mounted! And it worked fine in the same session on another fs. Also, can't the tools tell if a fs is mounted by looking in the superblock (i.e. it's dirty?)
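For reference, the superblock fields that bear on the "is it dirty?" question can be read with a few lines of code. This is a minimal sketch, assuming the standard ext2/ext3 on-disk layout (primary superblock at byte 1024, little-endian fields); the function names and the synthetic image are invented for the example. Note that these flags are only hints: a cleared "valid" bit or a set recovery flag can equally mean the filesystem was not cleanly unmounted after a crash, so they do not prove that it is mounted right now.

```python
import io
import struct

SUPERBLOCK_OFFSET = 1024      # primary superblock starts 1 KiB into the device
EXT2_MAGIC = 0xEF53
EXT2_VALID_FS = 0x0001        # s_state bit: marked cleanly unmounted
COMPAT_HAS_JOURNAL = 0x0004   # s_feature_compat bit
INCOMPAT_RECOVER = 0x0004     # s_feature_incompat bit: journal needs recovery

def parse_superblock_hints(dev):
    """Read a few superblock fields from a seekable binary file object."""
    dev.seek(SUPERBLOCK_OFFSET)
    sb = dev.read(1024)
    if len(sb) < 228:
        raise ValueError("short read; no superblock here")
    magic, state = struct.unpack_from("<HH", sb, 56)     # s_magic, s_state
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2/ext3 superblock")
    compat, incompat = struct.unpack_from("<II", sb, 92)
    (journal_inum,) = struct.unpack_from("<I", sb, 224)  # s_journal_inum
    return {
        "marked_clean": bool(state & EXT2_VALID_FS),
        "has_journal": bool(compat & COMPAT_HAS_JOURNAL),
        "needs_recovery": bool(incompat & INCOMPAT_RECOVER),
        "journal_inum": journal_inum,
    }

def fake_image(state, compat, incompat, journal_inum):
    """Build a synthetic image holding only the fields parsed above."""
    sb = bytearray(1024)
    struct.pack_into("<HH", sb, 56, EXT2_MAGIC, state)
    struct.pack_into("<II", sb, 92, compat, incompat)
    struct.pack_into("<I", sb, 224, journal_inum)
    return io.BytesIO(bytes(SUPERBLOCK_OFFSET) + bytes(sb))

# Inode 18 is the .journal inode number that appears in the fsck output.
demo = parse_superblock_hints(fake_image(EXT2_VALID_FS, COMPAT_HAS_JOURNAL, 0, 18))
print(demo)
```

On a real image the same function would be called as `parse_superblock_hints(open("d1", "rb"))`.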
And, BTW it created a .journal file (as you can see in the fsck) so it must have known it was mounted.
It sounds like you were trying to add a journal very early in the process, in single-user mode.
That's correct. I set SULOGIN in /etc/default/rcS and that gets me a shell where I do this sort of task.
Why were you doing this and having /dev/md1 mounted in the first place?
I was experimenting a bit with journaling - I was trying to create a filesystem with a .journal file, rather than a hidden inode.
I am aware I could have done this safer/easier - I just wanted to report the bug because I wasn't doing anything not supported. (All I did was mount the fs, and -j it, nothing exotic, except that /etc/mtab was probably weird - which maybe counts as exotic, but /proc/mounts was there.)
We can make this better under Linux 2.6, since with 2.6 there is a way to detect that a device is busy. Even if /etc/mtab is bogus, we can detect that the device is busy and cause tune2fs to abort in that case.
This was under 2.6.9. It was a raid device, could that have confused the mount detection?
a: why did tune2fs mess up
See above. This is a little bit better in e2fsprogs 1.38 if you are running on a Linux 2.6 kernel; under 2.6 we can directly detect whether or not the filesystem is mounted.
b: you'll notice I had to run fsck twice to get a correct fs
I can't replicate this. If the first fsck cleared .journal because it was deleted (i_links_count == 0), it should have complained much earlier if the superblock had the journal inode set to the inode number of .journal.
The only way I can imagine this happening is if the filesystem was mounted when you ran e2fsck on it. Was that the case? Normally e2fsck will complain loudly, but you were running with a bogus /etc/mtab file without /proc mounted.....
It was not mounted - I rebooted at this point, and it doesn't mount /boot by default, and I certainly didn't mount it.
Is there any combination of inode parameters that could cause this? (i.e. fsck needs to be run twice, never mind how the fs got there?) It looks to my uneducated eye that the only real error is that i_links_count was 0 instead of 1; change just that and all the other errors would have gone away.
I tested this using debugfs to change the link count. And I got an even stranger result: fsck.ext3 finds no errors, but I can't mount it!
# mount -o loop d /mnt
mount: wrong fs type, bad option, bad superblock on /dev/loop1,
or too many mounted file systems
Here is exactly what I did to create this unmountable fs:
# cp /dev/md1 d
# tune2fs -O ^has_journal d
tune2fs 1.37 (21-Mar-2005)
# mount -o loop d /mnt
# losetup -a
/dev/loop1: [fd01]:81924 (d)
# tune2fs -j /dev/loop1
tune2fs 1.37 (21-Mar-2005)
Creating journal inode: done
This filesystem will be automatically checked every 21 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
# umount d
# cp d d1
# debugfs -w d1
debugfs 1.37 (21-Mar-2005)
debugfs: mi .journal
Mode [0100600]
User ID [0]
Group ID [0]
Size [1048576]
Creation time [1121808381]
Modification time [1121808381]
Access time [1121808381]
Deletion time [0]
Link count [1] 0
Block count [2058]
File flags [0x50]
Generation [0x69e95ea8]
File acl [0]
High 32bits of size [0]
Fragment address [0]
Fragment number [0]
Fragment size [0]
Direct Block #0 [5292]
Direct Block #1 [5293]
Direct Block #2 [5294]
Direct Block #3 [5295]
Direct Block #4 [5296]
Direct Block #5 [6437]
Direct Block #6 [6438]
Direct Block #7 [6439]
Direct Block #8 [6440]
Direct Block #9 [6441]
Direct Block #10 [6442]
Direct Block #11 [6443]
Indirect Block [6444]
Double Indirect Block [6701]
Triple Indirect Block [0]
debugfs: quit
# fsck.ext3 -f d1
e2fsck 1.37 (21-Mar-2005)
Backing up journal inode block information.
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 18 has zero dtime. Fix<y>? yes
Pass 2: Checking directory structure
Entry '.journal' in / (2) has deleted/unused inode 18. Clear<y>? yes
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(5292--5296) -(6437--7215) -(7237--7481)
Fix<y>? yes
Free blocks count wrong for group #0 (2495, counted=3524).
Fix<y>? yes
Free blocks count wrong (8986, counted=10015).
Fix<y>? yes
Inode bitmap differences: -18
Fix<y>? yes
Free inodes count wrong for group #0 (4, counted=5).
Fix<y>? yes
Free inodes count wrong (76, counted=77).
Fix<y>? yes
/boot: ***** FILE SYSTEM WAS MODIFIED *****
/boot: 43/120 files (18.6% non-contiguous), 13089/23104 blocks
# fsck.ext3 -f d1
e2fsck 1.37 (21-Mar-2005)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/boot: 43/120 files (18.6% non-contiguous), 13089/23104 blocks
# mount -o loop d1 /mnt
mount: wrong fs type, bad option, bad superblock on /dev/loop1,
or too many mounted file systems
(could this be the IDE device where you in fact use
ide-scsi so that sr0 or sda or so is needed?)
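As a cross-check of the numbers in the transcript above, the blocks freed by the first fsck.ext3 run can be compared with the footprint of the .journal inode shown by debugfs. This is only arithmetic on figures copied from the output; the 1 KiB block size is inferred from the block layout (256 data blocks sit between the indirect block at 6444 and the double indirect block at 6701), and the debugfs "Block count" is assumed to be in 512-byte units.

```python
# Figures copied from the debugfs and fsck.ext3 output above.
freed_ranges = [(5292, 5296), (6437, 7215), (7237, 7481)]
free_before, free_after = 8986, 10015
journal_size = 1048576          # "Size" of .journal in bytes
i_blocks_sectors = 2058         # "Block count", assumed 512-byte units
block_size = 1024               # inferred: 256 four-byte pointers per block

# Blocks released in pass 5 (ranges are inclusive).
freed = sum(hi - lo + 1 for lo, hi in freed_ranges)

# Expected footprint of a 1 MiB file with 12 direct pointers.
ptrs_per_block = block_size // 4
data_blocks = journal_size // block_size
under_dind = data_blocks - 12 - ptrs_per_block
leaf_indirect = -(-under_dind // ptrs_per_block)    # ceiling division
footprint = data_blocks + 1 + 1 + leaf_indirect     # + indirect + double indirect

print(freed)                                  # 1029
print(free_after - free_before)               # 1029
print(i_blocks_sectors * 512 // block_size)   # 1029
print(footprint)                              # 1029
```

All four figures agree, which is consistent with the first pass having released exactly the blocks that belonged to .journal.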
I ran e2image -r d - | bzip2 > d.bz2, did the same for d1, and have attached both. (10KB each.)
I also tried it with a new empty filesystem I created, and that worked
properly:
Superblock has a bad ext3 journal (inode 12). Clear<y>? yes
c: the kernel should probably have reacted a little more sensibly to the error - i.e. don't send zillions of identical messages - kick out the fs, maybe return an error to the process trying to read, but not tons of useless messages that effectively froze the machine. (I checked and it was mounted errors=continue, since that's the default.)
Because of (a), tune2fs wrote directly to a mounted filesystem, and that results in enough filesystem corruption that the kernel was going to start complaining pretty loudly. We'd need to see what the messages were, but in any case, that's not an e2fsprogs issue....
Sorry, I should report this to the kernel guys. But I still am quite sure tune2fs knew that the fs was mounted.
-Ariel

