> On 2023-02-14, at 2:12 PM, Pádraig Brady <p...@draigbrady.com> wrote: > > On 14/02/2023 06:12, Paul Eggert wrote: >> On 2023-02-13 09:16, George Valkov wrote: >>> 1. We apply my original patch to disable sparse-copy on macOS. Otherwise >>> since fclonefileat is not used whenever we overwrite a file, corruption >>> will still occur. >> I'm not entirely sold on this patch, because I still don't understand >> what's happening. The original bug report in >> <https://github.com/openwrt/openwrt/pull/11960> basically says only "it >> doesn't work", and I'd like to know more. >> Part of the reason I'm hesitating is that there are multiple ways of >> interpreting what SEEK_HOLE and SEEK_DATA mean, and perhaps it's just >> that Apple has come up with yet another way to interpret it, which we >> need to know about.
I created a sparse copy sample that traces all calls. It relays only on the manpage for lseek. I should note that I am not familiar with the implementation used by coreutils, which mages it a good ground for independent research on the topic. The code is attached as d.c Here are my observations: 1. Given a non-sparse file A, it produces an exact copy B. gcc d.c && ./a.out src 3 dst 4 c 553 p 553 h 553 d 0 total bytes copied 553 / 553 a2d8f888d5c88f6eef83a3bf4ef2434a85a64792 A a2d8f888d5c88f6eef83a3bf4ef2434a85a64792 B 2. Given cc1, it produces a corrupted copy cc1-sparse, that matches the checksum of the copy cc-ori created by coreutils/src/cp. So unless you can find a flaw in my implementation, we can safely assume that SEEK_DATA skips valid data blocks and hence it should not be used on macOS because it is broken: ./coreutils/src/cp cc1 cc1-ori gcc d.c && ./a.out src 3 dst 4 c 14053376 p 20344832 h 20344832 d 6291456 total bytes copied 14053376 / 27551296 e8eb21ec118ff3a8fce3ad85d5f9a72d47a257c6 cc1 7b447132f56f3b549ef62a4780945e82d26f7d2c cc1-ori 7b447132f56f3b549ef62a4780945e82d26f7d2c cc1-sparse -rwxr-xr-x 1 g staff 27551296 Feb 15 09:59 cc1* -rwxr-xr-x 1 g staff 27551296 Feb 15 14:29 cc1-ori* -rwx------ 1 g staff 27551296 Feb 15 14:29 cc1-sparse* Further research: it would be interesting to find how my sparse copy sample, as well as coreutils/src/cp handle custom crafted sparse files. My first idea would be a file with the same size as cc1 (27 551 296 bytes) with a couple of data blocks in the middle. My second idea is to read the entire cc1, then call ftruncate to set the size of the copy and write only those bytes that are not 0. > I agree it would be good to know more. > However given it works on HFS but fails on APFS suggests a file system > specific issue, > rather than an API mismatch issue (over what we're already catering for on > macOS). > I suspect it's a similar issue to the one that openzfs had: > https://github.com/openzfs/zfs/issues/11900 According to Wikipedia: > Apple's HFS+ does not provide support for sparse files, but in OS X, the > virtual file system layer supports storing them in any supported file system, > including HFS+. https://en.wikipedia.org/wiki/Sparse_file Hence the reason we don’t observe any issues on HFS+ is most likely that the files we try to copy from it are not sparse in first place. > Given how closed / uncommunicative Apple are in general > and specifically for this already reported bug, > I'm inclined to disable SEEK_DATA on __APPLE__ for the upcoming release. > > Also we've mitigated the impact somewhat by enabling fclonefileat() in more > cases. I agree. We can try to send them one last message with a link to this group, and be done with it if they don’t relay. I just found a feedback ticket from 2022-09-16, where they replayed asking me to test macOS 13 Beta 8. At that time Finder was still broken, and now it isn’t. So they might have at least partially addressed the issue. https://feedbackassistant.apple.com/feedback/11522527 >> Another reason to hesitate is that GNU coreutils is not the only core >> program that uses SEEK_HOLE and SEEK_DATA. If this really is a generic >> Apple problem we need to know which Apple releases have it, so that we >> can program around it at the Gnulib level instead of mucking about with >> each individual program. We need to start somewhere. Fixing cp and install will certainly be helpful to users, until you address the issue globally. It might be possible to install VMs with various versions of macOS, though I suspect it will take time. Pádraig, did your try cp on build_dir/toolchain-arm_cortex-a9+vfpv3-d16_gcc-12.2.0_musl_eabi/gcc-12.2.0-initial/gcc/cc1 after building gcc in OpenWRT build root? make toolchain/gcc/initial/{clean,compile} -j 16 Here is a step by step article how to setup the build environment: https://httpstorm.com/share/.openwrt/help/OpenWRT%20build%20on%20macOS.htm > I've attached the 3 latest patches I'm considering in this area. > I presume you're OK with your amended fclonefileat() improvement one? Thank you! Yes, I’m all for the performance benefits of using fclonefileat. I applied your patches on top of cf80f988eeb97cc3f8c65ae58e735d36f865277b: Both rewriting an existing file and cloning to a new one produce a correct copies of a sparse source. Here are the test results: https://httpstorm.com/share/.openwrt/test/2023-02-06_coreutils-9.1/test-03-cf80f988eeb97cc3f8c65ae58e735d36f865277b-clone.txt I wrote a sparse copy sample
d.c
Description: Binary data
Cheers! Georgi Valkov httpstorm.com nano RTOS