> On 2023-02-09, at 11:35 PM, Pádraig Brady <p...@draigbrady.com> wrote:
>
> On 09/02/2023 17:23, George Valkov wrote:
>>> On 2023-02-09, at 6:32 PM, Pádraig Brady <p...@draigbrady.com> wrote:
>>>
>>> On 09/02/2023 15:57, George Valkov wrote:
>>>>> On 2023-02-09, at 1:56 PM, Pádraig Brady <p...@draigbrady.com> wrote:
>>>>>
>>>>> On 09/02/2023 09:20, George Valkov wrote:
>>>>>> Due to a bug in macOS, sparse copies are corrupted on virtual disks
>>>>>> formatted with APFS. HFS is not affected. Affected are coreutils
>>>>>> install, and gcp when compiled with SEEK_HOLE, as well as macOS Finder.
>>>>>> While reading the entire file returns valid data, scanning for
>>>>>> allocated segments may return holes where valid data is present.
>>>>>> In this case a sparse copy does not contain these segments and returns
>>>>>> zeroes instead. Once the virtual disk is dismounted and then
>>>>>> mounted again, a sparse copy produces correct results.
>>>>>> This breaks OpenWRT build on macOS. Details:
>>>>>> https://github.com/openwrt/openwrt/pull/11960
>>>>>> https://github.com/openwrt/openwrt/pull/11960#issuecomment-1423185579
>>>>>> Signed-off-by: Georgi Valkov <gval...@gmail.com>
>>>>>> ---
>>>>>> src/copy.c | 2 +-
>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>> diff --git a/src/copy.c b/src/copy.c
>>>>>> index e16fedb28..4f56138a6 100644
>>>>>> --- a/src/copy.c
>>>>>> +++ b/src/copy.c
>>>>>> @@ -1065,7 +1065,7 @@ infer_scantype (int fd, struct stat const *sb,
>>>>>> return PLAIN_SCANTYPE;
>>>>>> }
>>>>>> -#ifdef SEEK_HOLE
>>>>>> +#if defined(SEEK_HOLE) && !defined(__APPLE__)
>>>>>> scan_inference->ext_start = lseek (fd, 0, SEEK_DATA);
>>>>>> if (0 <= scan_inference->ext_start || errno == ENXIO)
>>>>>> return LSEEK_SCANTYPE;
>>>>>
>>>>>
>>>>> Thanks for the detailed report.
>>>>> The patch might very well be appropriate.
>>>> Hi! Let’s test the ideas you have first, and fall back to the patch.
>>>> In October 2021 macOS Finder was also affected, and that points directly
>>>> at Apple.
>>>> I tested again today, they have fixed Finder. After performing a copy in
>>>> Finder,
>>>> coreutils cp produces a good copy. I have to run
>>>> make toolchain/gcc/initial/{clean,compile} -j 16
>>>> before I can reproduce it again with the same file.
>>>> So while Apple didn't fix the underlying issue with APFS, they did
>>>> provide a solution for Finder. And we can make coreutils work correctly
>>>> too.
>>>
>>> That suggests that Finder may have sync'd the file.
>>> Now syncing has overheads of course, so not an option to take lightly.
>> #include <stdio.h>
>> #include <unistd.h>
>> #include <fcntl.h>
>> int main(int argc, char ** argv)
>> {
>>     sync();
>>     int fd = open("cc1", O_RDWR);
>>     //int fd = open("/Users/g/vhd/coreutils.sparseimage", O_RDWR);
>>     int a = fdatasync(fd);
>>     int b = fsync(fd);
>>     int c = close(fd);
>>     /* The return values are ints, so print them with %d. */
>>     printf("fdatasync %d fsync %d close %d\n", a, b, c);
>>     return 0;
>> }
>> fdatasync 0 fsync 0 close 0
>> I agree. It takes about one second to run this code.
>> All calls are successful. The sparse copy after that is still corrupted.
>> I also tried doing this on the sparse image while it is mounted.
>
> Thanks for confirming the sync doesn't help.
>
>>> We may be safer just doing the normal copy on __APPLE__ as per your orig
>>> patch.
>>> A dtruss of Finder might be instructive BTW.
>> Yes, sync is definitely not good for performance. What is 'dtruss'?
>
> That would show the system calls that Finder is using to perform the copy.
> Perhaps Finder is also just avoiding the SEEK_DATA on APFS?
I ran 3 traces. Let me know if they are useful. If not, I will have to disable
System Integrity Protection and collect new data tomorrow. I got errors:
dtruss: system integrity protection is on, some features will not be available
https://httpstorm.com/share/.openwrt/test/2023-02-06_coreutils-9.1/dtruss/
You did not mention which flags you want with dtruss. Which ones should I use?
> Without further info, I'll apply your original avoidance patch for __APPLE__.
It’s up to you. As long as you have ideas, I can test them.
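In case it helps with testing, here is a minimal sketch (not coreutils code)
of how the mismatch from the original report could be checked: it walks a file
with lseek SEEK_DATA/SEEK_HOLE and reads back every range reported as a hole.
On a healthy filesystem those ranges contain only zeroes, so any non-zero byte
means the hole map disagrees with the data. The helper names
nonzero_bytes() and nonzero_bytes_in_holes() are made up for illustration.

```c
#define _GNU_SOURCE   /* exposes SEEK_DATA / SEEK_HOLE on glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback in case the feature macro above took effect too late.
   3 and 4 are the Linux values; macOS defines both in <unistd.h>. */
#if defined(__linux__) && !defined(SEEK_DATA)
# define SEEK_DATA 3
# define SEEK_HOLE 4
#endif

/* Hypothetical helper: count non-zero bytes in [start, end).
   Returns -1 on a read error. */
static long long nonzero_bytes(int fd, off_t start, off_t end)
{
    unsigned char buf[65536];
    long long count = 0;

    while (start < end) {
        size_t want = sizeof buf;
        if ((off_t) want > end - start)
            want = (size_t) (end - start);
        ssize_t got = pread(fd, buf, want, start);
        if (got < 0)
            return -1;
        if (got == 0)
            break;                      /* unexpected end of file */
        for (ssize_t i = 0; i < got; i++)
            if (buf[i] != 0)
                count++;
        start += got;
    }
    return count;
}

/* Hypothetical helper: number of non-zero bytes found inside ranges that
   lseek reports as holes, or -1 on error or if SEEK_DATA is unsupported. */
long long nonzero_bytes_in_holes(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t size = lseek(fd, 0, SEEK_END);
    long long bad = (size < 0) ? -1 : 0;
    off_t pos = 0;

    while (bad >= 0 && pos < size) {
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0) {
            if (errno != ENXIO) { bad = -1; break; }
            data = size;                /* ENXIO: only a hole remains */
        }
        /* [pos, data) is reported as a hole; it should read as zeroes. */
        long long n = nonzero_bytes(fd, pos, data);
        if (n < 0) { bad = -1; break; }
        bad += n;
        if (data >= size)
            break;
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole <= pos) { bad = -1; break; }   /* error or no progress */
        pos = hole;
    }
    close(fd);
    return bad;
}
```

Calling nonzero_bytes_in_holes("cc1") right after the build and printing the
result should give 0 on a healthy volume. A positive count would match the
behaviour I described: reading returns valid data, but the scan reports a hole.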
I can also invite you on Parsec or Anydesk. Working together is more efficient.
Cheers, mate!
Georgi Valkov
httpstorm.com
nano RTOS