On 01/26/2014 11:35 PM, Bernhard Voelker wrote: > On 01/26/2014 12:28 PM, Pádraig Brady wrote: >> On 01/25/2014 11:55 PM, Bernhard Voelker wrote: >>> However, I remember some other corner cases with eclipsed file >>> systems in the Fedora bug tracker. I think we're quite close >>> to solve them all this time (hopefully). >>> The idea was to trust the order of mount entries returned by >>> the kernel, i.e. in the loop over the mount entries, if the >>> mount point is the same one as a previous one, then we should >>> process the one mounted later. >>> >>> E.g. the situation where 2 file systems are mounted on the >>> same mount point: >>> >>> $ findmnt | grep loop >>> └─/mnt /dev/loop0 ext4 >>> rw,relatime,data=ordered >>> └─/mnt/dir /dev/loop1 ext4 >>> rw,relatime,data=ordered >>> └─/mnt/dir /dev/loop2 ext4 >>> rw,relatime,data=ordered >>> >>> df - the new one with your patch - still shows the wrong device: >>> >>> $ src/df | grep loop >>> /dev/loop0 122835 1551 112110 2% /mnt >>> /dev/loop1 122835 1550 112111 2% /mnt/dir >>> >>> It should say /dev/loop2 here. BTW the numbers are correct. > > BTW: the fstype is wrong, too (which can only be seen with -T or --output, > and if it differs, of course). > >> Right, that could be handled easy enough. >> loop1 is not accessible above and so should be hidden. >> But consider a bind mount resulting in something like: >> >>> └─/mnt /dev/loop0 ext4 >>> rw,relatime,data=ordered >>> └─/mnt/dir /dev/loop1 ext4 >>> rw,relatime,data=ordered >>> └─/some/place/else /dev/loop1 ext4 >>> rw,relatime,data=ordered >>> └─/mnt/dir /dev/loop2 ext4 >>> rw,relatime,data=ordered >> >> If we did a linear scan through that, we'd lose the /some/place/else >> due to it being a longer mount dir, and then also the original loop1 >> as we took /dev/loop2 for /mnt/dir. >> Seems like when discarding we would need to see if this was the >> last entry for a device and then see if there are any other candidate >> mount points for that device? > > Hi Padraig, > > thanks. > Again, mount_list is a little beast - more below. > > The following patch (on top of yours) would handle both cases > without a problem. Feel free to squash it in, if you like. > > diff --git a/src/df.c b/src/df.c > index 23b5156..78768cc 100644 > --- a/src/df.c > +++ b/src/df.c > @@ -631,9 +631,20 @@ filter_mount_list (void) > else > { > /* If we've already seen this device... */ > + struct devlist *d = NULL; > for (devlist = devlist_head; devlist; devlist = devlist->next) > if (devlist->dev_num == buf.st_dev) > - break; > + { > + d = devlist; > + if (!STREQ (devlist->me->me_devname, me->me_devname)) > + { > + /* Fix the devname if the mount dir has been > + mounted over by a different devname. */ > + free (devlist->me->me_devname); > + devlist->me->me_devname = xstrdup (me->me_devname); > + } > + } > + devlist = d; > > if (devlist) > { > > But there is yet another issue with the -a mode for such > over-mounted and therefore eclipsed file systems: > > # Create 2 file system images: 1 ext4, 1 xfs. > $ dd if=/dev/zero bs=1M status=none count=128 of=img1 > $ dd if=/dev/zero bs=1M status=none count=256 of=img2 > $ mkfs -t ext4 -F img1 >/dev/null 2>&1 > $ mkfs -t xfs -f img2 >/dev/null 2>&1 > $ mkdir /mnt{1,2} > > # Mount both on /mnt1. > $ mount -o loop img1 /mnt1 > $ mount -o loop img2 /mnt1 > > # Mount the former (ext4) also on /mnt2 via its loop device. > $ mount /dev/loop0 /mnt2 > > # Result: > $ findmnt --output=TARGET,SOURCE,FSTYPE | grep loop > ├─/mnt1 /dev/loop0 ext4 > │ └─/mnt1 /dev/loop1 xfs > └─/mnt2 /dev/loop0 ext4 > > Everything is fine now with the filtered df run ... > > $ src/df --out -h | grep loop > /dev/loop1 xfs 256K 3 256K 1% 252M 13M 239M 6% - > /mnt1 > /dev/loop0 ext4 32K 11 32K 1% 120M 1.6M 110M 2% - > /mnt2 > > ...but "df -a" prints the wrong statistics for the "over-mounted" /mnt1! > > $ src/df --out -h -a | grep loop > /dev/loop0 ext4 256K 3 256K 1% 252M 13M > 239M 6% - /mnt1 > /dev/loop1 xfs 256K 3 256K 1% 252M 13M > 239M 6% - /mnt1 > /dev/loop0 ext4 32K 11 32K 1% 120M 1.6M > 110M 2% - /mnt2 > > Okay, this is nothing new. > BTW: strictly speaking, also the output of today's "df -t rootfs -a" > is wrong because the numbers are definitely not that of the early-boot > rootfs file system. > > Now, how should df handle this? > > a) > df silently filters out the mount entries of all eclipsed mount dirs, > even with -a. > --> Hmm, I think this would probably contradict to POSIX. > > b) > df prints an error diagnostic for each eclipsed mount dir, and exits > non-Zero. > --> Well, there are probably such mounts on every system, e.g. on my box: > > TARGET SOURCE FSTYPE > /proc/sys/fs/binfmt_misc systemd-1 autofs > /proc/sys/fs/binfmt_misc binfmt_misc binfmt_misc > > Therefore, a "df -a" would always fail. ;-( > At least on my system, there are > > c) > df prints a warning diagnostic for each eclipsed mount dir, and exits > Zero (unless another error occurs). > > --> Due to the same reason as in b), these warning might be messy > and users will probably be irritated. > > d) > df outputs "-" for all numbers of such eclipsed file systems, e.g. > > $ src/df --out -h -a | grep mnt1 > /dev/loop0 ext4 - - - - - - > - - - /mnt1 > /dev/loop1 xfs 256K 3 256K 1% 252M 13M > 239M 6% - /mnt1 > > > Maybe d) is the best solution, as it mirrors what df can know: > it knows source, target and the file system type, but it doesn't > have access to the block and inode numbers. > > WDYT?
Thanks for the nice analysis and tests. d) seems like the best option here, though we'd have to be careful about cases where /proc/mounts was giving a system wide view, while df wasn't privy to that due to mount namespaces or overmounts etc. I'm not thinking of a specific issue here, just the general problem. wrt c) and annoying warnings, I also notice `df -a` on a default Fedora 20 install here, giving multiple duplicate warnings like: df: ‘net:[4026532416]’: No such file or directory df: ‘net:[4026532416]’: No such file or directory That's due to: $ grep net: /proc/mounts proc net:[4026532416] proc rw,nosuid,nodev,noexec,relatime 0 0 proc net:[4026532416] proc rw,nosuid,nodev,noexec,relatime 0 0 Which is due to support for namespaces. Seems like we should not try to lookup non absolute mount points? Also a general point is that a lot of stuff has changed underneath us recently, and perhaps we should be looking at abstracting that away somewhere (like libmount that is part of util-linux). In the short term anyway we should fix up the above warts within df. thanks, Pádraig. thanks, Pádraig.