On Sun, Apr 6, 2025 at 11:53 AM G. Paul Ziemba
<pz-freebsd-sta...@ziemba.us> wrote:
>
> Summary: interaction between autounmountd and cpdup's mount-point-traversal
> detection truncates tree copies early without error.
>
> I'm running 14-stable and am seeing this both on:
>
> - 14.0-STABLE built from sources of 27 Mar 2024 and also on
> - 14.2-STABLE built from sources of 3 Apr 2025.
>
> There doesn't seem to be anything specific to 14-stable so I'll bet
> this issue also manifests on earlier versions of FreeBSD.
>
> I think I understand what's happening (details below), but I'm
> not sure about the right way to fix it.
>
> Scenario
>
>     A large file tree (in my case, the FreeBSD source tree) is published
>     on an NFS server.
>
>     A FreeBSD NFS client automounts a volume containing this
>     large file tree.
>
>     cpdup attempts to copy the file tree to another location (in my
>     case, that happens to be another NFS filesystem, but I don't think
>     it matters).
>
>     cpdup completes without error; however, the destination directory
>     is incomplete, with many empty directories.
>
> Analysis
>
>     cpdup examines the device ID (st_dev) returned by stat(2) as it
>     traverses the source and destination trees copying directories
>     and files. When it finds an st_dev value different from the initial
>     value at the top of the respective tree, it concludes that it has
>     crossed a mount point and prunes the copy at that point.
>
>     I instrumented cpdup with some additional logging to examine its
>     notion of the src and dst st_dev values and found that, in my
>     test case, in the middle of its tree copy, cpdup started getting
>     unexpected new values of st_dev for the src tree and skipping
>     all directories after that.
>
> --- src/cpdup.c.orig    2025-04-04 15:04:44.623646000 -0700
> +++ src/cpdup.c 2025-04-05 15:10:52.779426000 -0700
> @@ -947,10 +947,15 @@
>          * When copying a directory, stop if the source crosses a mount
>          * point.
>          */
> -       if (sdevNo != (dev_t)-1 && stat1->st_dev != sdevNo)
> +       if (VerboseOpt >= 2)
> +           logstd("sdevNo: %ld, stat1->st_dev: %ld\n", sdevNo, stat1->st_dev);
> +       if (sdevNo != (dev_t)-1 && stat1->st_dev != sdevNo) {
> +           if (VerboseOpt >= 2)
> +               logstd("setting skipdir due to sdevNo != stat1->st_dev\n");
>             skipdir = 1;
> -       else
> +       } else {
>             sdevNo = stat1->st_dev;
> +       }
>
>     I eventually looked at the automounter and added some logging via
>     devd.conf:
>
>     notify 10 {
>             match "system"          "VFS";
>             match "subsystem"       "FS";
>             action "logger VFS FS msg=$*";
>     };
>
>     And saw the following in /var/log/messages:
>
> Apr  6 10:39:31 f14s-240327-portbuilder me[58694]: VFS FS msg=!system=VFS subsystem=FS type=MOUNT mount-point="/s/public" mount-dev="hairball:/v2/Source/public" mount-type="nfs" fsid=0x94ff003a3a000000 owner=0 flags="automounted;"
> Apr  6 10:49:54 f14s-240327-portbuilder me[58761]: VFS FS msg=!system=VFS subsystem=FS type=UNMOUNT mount-point="/s/public" mount-dev="hairball:/v2/Source/public" mount-type="nfs" fsid=0x94ff003a3a000000 owner=0 flags="automounted;"
> Apr  6 10:49:54 f14s-240327-portbuilder me[58770]: VFS FS msg=!system=VFS subsystem=FS type=MOUNT mount-point="/s/public" mount-dev="hairball:/v2/Source/public" mount-type="nfs" fsid=0x95ff003a3a000000 owner=0 flags="automounted;"
>
>     (By the way, st_dev reported by my new cpdup log messages was a
>     rearranged version of "fsid" in the devd messages)
>
>     Note that after ten minutes, the NFS filesystem is unmounted and then
>     immediately remounted.
>
>     The source code of /usr/sbin/autounmountd indicates that it
>     attempts to unmount automounted filesystems ten minutes after
>     they have been mounted (modulo some sleep-related jitter).
>
>     The immediately following mount (presumably triggered by the
>     next filesystem access by cpdup) results in a new value of fsid,
>     thus changing what cpdup sees as st_dev, causing it to treat
>     all following directory descents as mount-point crossings.
>
> Possible Mitigations
>
>     1. It might be possible to prevent unmounting by causing cpdup
>        to chdir to the top of the source directory. However, it seems
>        to perform similar st_dev checks on the destination directory
>        and therefore a similar issue would arise with the dst tree.
>
>     2. Reusing the old fsid in the new mount? I'm guessing there
>        were good reasons for assigning a new fsid, so it's probably
>        a bad idea.
>
>     3. cpdup could call stat() on the top of the tree each time
>        it made a comparison. There might still be a race: the
>        comparison could fail if the automatic unmount occurred
>        between the two stat() calls.
>
>        That could be worked around by retrying the two stat()
>        calls and the comparison once after each failure.
>
>     Other ideas?

Just do the NFS mount manually instead of automounting it
(or put it in /etc/fstab).

Yes, an NFS unmount followed by a new mount can result in a different fsid.
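
If changing cpdup is still of interest, mitigation 3 from the quoted
message (re-stat the top of the tree for every comparison, and retry once
if the values differ) might look roughly like the sketch below. The helper
name same_fs_as_root() and its arguments are hypothetical and not part of
cpdup; it uses stat(2) on both paths for simplicity, whereas a real tree
walker would probably want lstat(2) on the entry. It is untested against
autounmountd and only illustrates the idea.

```c
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not part of cpdup): returns true if `path` appears
 * to be on the same filesystem as `root`.
 *
 * Instead of comparing against an st_dev value cached at the start of the
 * copy, it stats `root` again for every comparison. If the two st_dev
 * values differ, both calls are repeated once, in case an unmount and
 * remount happened between them.
 */
static bool
same_fs_as_root(const char *root, const char *path)
{
	struct stat rs, ps;

	for (int attempt = 0; attempt < 2; attempt++) {
		/* Assumption: a stat error is treated as "do not descend". */
		if (stat(root, &rs) != 0 || stat(path, &ps) != 0)
			return false;
		if (rs.st_dev == ps.st_dev)
			return true;
	}
	/* Differed on both attempts: assume a real mount-point crossing. */
	return false;
}
```

A caller would use this in place of the cached comparison, for example
skipping a directory when same_fs_as_root(srcroot, srcpath) returns false.
The cost is one extra stat() of the tree root per comparison, and whether
those extra accesses affect when autounmountd unmounts the filesystem is
not something this sketch establishes.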

rick

> --
> G. Paul Ziemba
> FreeBSD unix:
> 11:51AM  up 18 days,  2:22, 43 users, load averages: 0.42, 0.31, 0.26
>
