bug#79446: 9.7: cp(1): Unnecessary writes / incorrect logging messages

Kye Hunter Wed, 17 Sep 2025 10:52:34 -0700

Hi both,

Thanks for the explanations, the trace is making much more sense to me now!


> Yes. Unfortunately it's more complicated than the mode, as the
> Linux xattr library doesn't give 'cp' an easy way to test whether
> the extended attributes are identical. We could add this to our
> list of things to do but any changes along these lines would be
> nontrivial.

I see that cp calls llistxattr, which appears to report only the names of the xattrs, not their values. I also see that (at least on my machine) changing the xattrs does not affect the file's modification or change times (I think I have access times disabled), so those can't be used either (rather unfortunate, though I guess it makes sense). However, if the only problem is not knowing the values of the xattrs, then shouldn't the absence of any xattrs on both files be enough to know that they are identical? The files I'm testing on don't have xattrs, and I doubt any files on my system have them, so for my case at least this could be a good solution.
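
The "no xattrs at all" shortcut can be sketched as follows. This is an illustration in Python rather than cp's C, and it is not how cp behaves today; the helper names xattr_names and xattrs_trivially_identical are made up for the example, and os.listxattr is only available on Linux. Keep in mind that ACLs and security labels are themselves stored as xattrs, so a file can list names even when nobody set a user attribute by hand.

```python
import errno
import os


def xattr_names(path):
    """Return the extended attribute names set on path (names only, no values)."""
    try:
        # follow_symlinks=False mirrors llistxattr(2), which inspects the link itself.
        return os.listxattr(path, follow_symlinks=False)
    except OSError as exc:
        # Filesystems without xattr support report ENOTSUP; treat that as "none".
        if exc.errno == errno.ENOTSUP:
            return []
        raise


def xattrs_trivially_identical(src, dst, list_names=xattr_names):
    """True only when neither file has any xattrs, so no values need comparing.

    A False result does not mean the xattrs differ, only that the names-only
    check cannot prove they are the same.
    """
    return not list_names(src) and not list_names(dst)
```

The list_names parameter exists only so the check can be exercised without real files; a caller would normally leave it at its default.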

If I add --no-preserve=mode to the command, this does prevent the extra call to chmod, but adding --no-preserve=xattr still results in a call to chmod, which seems odd based on your explanation. Reading the man page again, the apparent overlap between the attributes "mode" and "xattr" is confusing. Is the behavior of --preserve=mode combined with --no-preserve=xattr even well defined? My expectation would be that in this case the permission bits, ext attribute bits, and ACLs are copied, but the xattrs are not. Another good solution might be to remove the overlap between the two, so that it would be possible to copy the mode/ACLs without the xattrs, and it would then be unnecessary to worry about the values of the xattrs for --preserve=mode without --preserve=xattr.

In any case, in the traces I'm seeing with --preserve=mode, I don't see anywhere that cp is writing any xattrs at all, just that it makes an unnecessary chmod.

> > 2. The other odd thing in the trace is that there are some kind
> > of odd shenanigans going on with the two identical binary files,
> > and a third temporary file that cp makes and then removes.
>
> The point is that they're not *supposed* to be "identical
> binaries"; they're supposed to be multiple names for ONE file,
> because you specified --preserve=all, which includes replicating
> the original pattern of (hard) links in the target.

I had never actually had hardlinked files come up in any real situation before, so I'm not surprised I didn't catch that! I can confirm that is what's going on here. But I'm not convinced that the behavior with hardlinks is entirely logical. The first odd thing is that these files are relinked every time cp is run, even when they're already linked. But it gets stranger. Starting with an empty target directory, if I run this command:

    cp --no-preserve=links --update=older --target-directory="/home/kye/test" ./*

then the first time I run it, the files hardlinked in the source directory are not hardlinked in the target (seemingly correct), but on the second and later runs the two files do become hardlinked, apparently contradicting the --no-preserve=links option.
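
To compare the target directory after the first and second runs, the link state can be inspected directly instead of being read off the trace. A minimal sketch in Python, where hardlink_groups is a made-up helper name: it groups the regular files in one directory by (device, inode) and reports every group with more than one name.

```python
import os
from collections import defaultdict


def hardlink_groups(directory):
    """Return sorted lists of names in directory that share one inode."""
    groups = defaultdict(list)
    with os.scandir(directory) as entries:
        for entry in entries:
            # Only regular files; symlinks are not followed.
            if entry.is_file(follow_symlinks=False):
                st = entry.stat(follow_symlinks=False)
                groups[(st.st_dev, st.st_ino)].append(entry.name)
    # Keep only inodes that appear under two or more names.
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)
```

Calling it on the target directory after each cp run should return an empty list for as long as no two names there are hard links to each other.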

So, it makes sense that hardlinked files were being hardlinked in the target directory when I had --preserve=all, but maybe it doesn't make sense to relink them over and over when they are already linked. And it seems like --update=older is somehow forcing the files to be hardlinked, but only when they already exist in the target directory.
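
The "don't relink what is already linked" behavior could look roughly like this. It is a sketch in Python under my own assumptions, not cp's actual logic: link_unless_already_linked is a hypothetical helper, and the unlink-then-link replacement is deliberately simple and not atomic.

```python
import os


def link_unless_already_linked(existing, new_name):
    """Make new_name a hard link to existing; return False if it already is one."""
    try:
        if os.path.samefile(existing, new_name):
            return False  # same device and inode: nothing to write
    except FileNotFoundError:
        pass  # new_name does not exist yet, so a link is needed
    else:
        os.unlink(new_name)  # a different file occupies the name; remove it first
    os.link(existing, new_name)
    return True
```

On the second and later runs the function returns False without touching the directory, which is the case that currently produces the redundant writes.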

I think that when I was using ZFS, these hardlinked files were not present, so I never saw them in diffs back then, but I checked and if I avoid having the mode of directories reset, these redundant links are still incrementing the generation number on btrfs. This moves them into the category of problematic (for me) and it would be great if they weren't occurring.

> (You might like to submit a Linux kernel feature request to
> implement an extra option flag for 'linkat' to do the same,
> perhaps AT_REPLACE.)

I think I can see how that could be helpful in this case, but perhaps I'm not the best person for the task. If no one else wants to though, I guess I could give it a shot. Would being able to directly replace a file with a hardlink to another file make it easier to avoid unnecessarily replacing one hardlink with itself? Or would it just be for convenience in cases like this one?

> I have to admit I'm a bit puzzled by
> unlinkat(3, "./CuYCWUCU", 0)            = 0
> which would appear to be successfully removing a filename that no
> longer exists.


I agree, seems a bit fishy.
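
One possible explanation, offered as a guess: if the temporary name was a hard link to the same inode as the destination when it was renamed over it, then rename(2) is documented to do nothing and still return success, which would leave the temporary name in place for the later unlinkat. The trace excerpt above does not show whether that was the case. The Python sketch below only demonstrates the rename behavior; rename_over_same_inode and the file names are made up for the example.

```python
import os
import tempfile


def rename_over_same_inode():
    """Show what rename(2) leaves behind when both names are links to one file."""
    with tempfile.TemporaryDirectory() as d:
        dst = os.path.join(d, "dst")
        tmp = os.path.join(d, "tmp")  # stand-in for cp's temporary name
        open(dst, "w").close()
        os.link(dst, tmp)    # tmp and dst now name the same inode
        os.rename(tmp, dst)  # succeeds, but performs no action in this case
        # Report whether each name still exists after the "successful" rename.
        return os.path.exists(tmp), os.path.exists(dst)
```

On Linux this returns (True, True): both names survive, so a follow-up unlink of the temporary name succeeds.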

Thanks,

-Kye

--
Kye E. Hunter
PGP: 6859 E2DE D598 49EA 9319  10CD DEF2 BA03 A6BE 3062
--
