On Sun, Apr 6, 2025 at 11:53 AM G. Paul Ziemba
<pz-freebsd-sta...@ziemba.us> wrote:
>
> Summary: interaction between autounmountd and cpdup's mount-point-traversal
> detection truncates tree copies early without error.
>
> I'm running 14-stable and am seeing this both on:
>
> - 14.0-STABLE built from sources of 27 Mar 2024 and also on
> - 14.2-STABLE built from sources of 3 Apr 2025.
>
> There doesn't seem to be anything specific to 14-stable so I'll bet
> this issue also manifests on earlier versions of FreeBSD.
>
> I think I understand what's happening (details below), but I'm
> not sure about the right way to fix it.
>
> Scenario
>
> A large file tree (in my case, the FreeBSD source tree) is published
> on an NFS server.
>
> A FreeBSD NFS client automounts a volume containing this
> large file tree.
>
> cpdup attempts to copy the file tree to another location (in my
> case, that happens to be another NFS filesystem, but I don't think
> it matters).
>
> cpdup completes without error, however, the destination directory
> is incomplete, with many empty directories.
>
> Analysis
>
> cpdup examines the device ID (st_dev) returned by stat(2) as it
> traverses the source and destination trees copying directories
> and files. When it finds an st_dev value different from the initial
> value at the top of the respective tree, it concludes that it has
> crossed a mount point and prunes the copy at that point.
>
> I instrumented cpdup with some additional logging to examine its
> notion of the src and dst st_dev values and found that, in my
> test case, in the middle of its tree copy, cpdup started getting
> unexpected new values of st_dev for the src tree and skipping
> all directories after that.
>
> --- src/cpdup.c.orig 2025-04-04 15:04:44.623646000 -0700
> +++ src/cpdup.c 2025-04-05 15:10:52.779426000 -0700
> @@ -947,10 +947,15 @@
>       * When copying a directory, stop if the source crosses a mount
>       * point.
>       */
> -    if (sdevNo != (dev_t)-1 && stat1->st_dev != sdevNo)
> +    if (VerboseOpt >= 2)
> +        logstd("sdevNo: %ld, stat1->st_dev: %ld\n", sdevNo, stat1->st_dev);
> +    if (sdevNo != (dev_t)-1 && stat1->st_dev != sdevNo) {
> +        if (VerboseOpt >= 2)
> +            logstd("setting skipdir due to sdevNo != stat1->st_dev\n");
>          skipdir = 1;
> -    else
> +    } else {
>          sdevNo = stat1->st_dev;
> +    }
>
> I eventually looked at the automounter and added some logging via
> devd.conf:
>
>     notify 10 {
>         match "system" "VFS";
>         match "subsystem" "FS";
>         action "logger VFS FS msg=$*";
>     };
>
> And saw the following in /var/log/messages:
>
> Apr 6 10:39:31 f14s-240327-portbuilder me[58694]: VFS FS msg=!system=VFS
> subsystem=FS type=MOUNT mount-point="/s/public"
> mount-dev="hairball:/v2/Source/public" mount-type="nfs"
> fsid=0x94ff003a3a000000 owner=0 flags="automounted;"
> Apr 6 10:49:54 f14s-240327-portbuilder me[58761]: VFS FS msg=!system=VFS
> subsystem=FS type=UNMOUNT mount-point="/s/public"
> mount-dev="hairball:/v2/Source/public" mount-type="nfs"
> fsid=0x94ff003a3a000000 owner=0 flags="automounted;"
> Apr 6 10:49:54 f14s-240327-portbuilder me[58770]: VFS FS msg=!system=VFS
> subsystem=FS type=MOUNT mount-point="/s/public"
> mount-dev="hairball:/v2/Source/public" mount-type="nfs"
> fsid=0x95ff003a3a000000 owner=0 flags="automounted;"
>
> (By the way, st_dev reported by my new cpdup log messages was a
> rearranged version of "fsid" in the devd messages)
>
> Note that after ten minutes, the NFS filesystem is unmounted and then
> immediately remounted.
>
> The source code of /usr/sbin/autounmountd indicates that it
> attempts to unmount automounted filesystems ten minutes after
> they have been mounted (modulo some sleep-related jitter).
>
> The immediately following mount (presumably triggered by the
> next filesystem access by cpdup) results in a new value of fsid,
> thus changing what cpdup sees as st_dev, causing it to treat
> all following directory descents as mount-point crossings.
>
> Possible Mitigations
>
> 1. It might be possible to prevent unmounting by causing cpdup
>    to chdir to the top of the source directory. However, it seems
>    to perform similar st_dev checks on the destination directory
>    and therefore a similar issue would arise with the dst tree.
>
> 2. Reusing the old fsid in the new mount? I'm guessing there
>    were good reasons for assigning a new fsid, so it's probably
>    a bad idea.
>
> 3. cpdup could call stat() on the top of the tree each time
>    it made a comparison. There might still be a race and the
>    comparison might fail if the automatic unmount occurred
>    between the two stat() calls.
>
>    Although THAT could be worked around by retrying the two
>    stats + comparison once after each failure.
>
> Other ideas?

Just do the NFS mount manually and avoid automounting it.
(Or put it in /etc/fstab.)
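
For the /etc/fstab route, an entry might look roughly like the line below. The server path and mount point are taken from the devd log quoted above; the options field (rw) and the trailing dump/pass fields are placeholder assumptions, not settings reported in this thread.

```
# Device                      Mountpoint  FStype  Options  Dump  Pass
hairball:/v2/Source/public    /s/public   nfs     rw       0     0
```
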
Yes, an NFS unmount/mount can result in a different fsid.

rick

> --
> G. Paul Ziemba
> FreeBSD unix:
> 11:51AM up 18 days, 2:22, 43 users, load averages: 0.42, 0.31, 0.26
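
For reference, mitigation 3 from the quoted message (stat the top of the tree again at each comparison, and retry once if the comparison fails) could be sketched roughly as below. The helper names same_fs_once and same_fs_as_root are hypothetical and do not exist in cpdup; this is only an illustration of the idea, not a proposed patch, and it does not address whether stat(2) on the tree root itself triggers a remount under autofs.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not part of cpdup): compare the device ID of
 * `path` against a freshly fetched device ID of `root`, instead of a
 * value cached when the copy started.
 *
 * Returns 1 if the st_dev values match, 0 if they differ,
 * and -1 if either stat call fails.
 */
static int
same_fs_once(const char *root, const char *path)
{
    struct stat rs, ps;

    /* stat(2) the root so a symlinked tree top is followed. */
    if (stat(root, &rs) != 0)
        return -1;
    /* lstat(2) the entry so a symlink is not mistaken for its target. */
    if (lstat(path, &ps) != 0)
        return -1;
    return rs.st_dev == ps.st_dev;
}

/*
 * Hypothetical wrapper: if the first comparison reports a mismatch,
 * repeat both stats and the comparison once. The assumption is that a
 * mismatch caused by an unmount/remount landing between the two stat
 * calls will not recur on the immediate retry, while a genuine mount
 * point crossing will mismatch both times.
 */
static int
same_fs_as_root(const char *root, const char *path)
{
    int r = same_fs_once(root, path);

    if (r == 0)
        r = same_fs_once(root, path);
    return r;
}
```

A caller would prune the copy only when same_fs_as_root() returns 0. The extra stat of the tree root on every comparison costs an additional attribute lookup per directory, which may matter on NFS.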