On Jul 3, 2020, at 04:53, Ces VLC wrote:

> Some of my disks are HFS+ and others are APFS. I've been using rsync for 
> years, in order to sync some folders across all my disks. There were no 
> problems until APFS was introduced into the game.
> 
> Now, in filenames that have UTF international characters, I often hit the 
> problem of rsync deleting a file and then rewriting it again, just because 
> the UTF normalization is not the same in both disks. Other users have been 
> reporting this (see here for example: 
> https://superuser.com/questions/1513326/rsync-from-mac-os-to-synology-with-btrfs-having-issues-with-file-and-directories
>  ).
> 
> For over a year I've been tolerating this because I considered it 
> non-critical, but I feel I should fix it. However, I didn't find any posted 
> solution that could address this in a convenient and proper way.
> 
> People suggest to use the --iconv flag, but... does this mean that you need 
> to use different iconv settings depending on whether your transfer is 
> APFS->HFS+ or HFS+->APFS? If affirmative, it would be a bit clumsy, IMHO 
> (first detect the disk FS, then choose proper flags). 
> 
> Isn't there some way for dealing with this more conveniently, in a way that 
> you don't need to check the disk FS before invoking rsync?

The issue I'm familiar with is that there can be several valid ways to 
represent certain strings of Unicode characters in UTF-8. (Characters composed 
of several code points, such as a letter plus an accent, can be stored in 
composed or decomposed form.) The designers of HFS+ picked one of those 
representations (a variant of the decomposed form, NFD) as the "correct" one 
and normalize such strings to that form when writing filenames to disk. HFS+ 
was unusual in that regard. Most Linux filesystems do not normalize and 
instead accept whatever bytes the program gives them. This could result in the 
problem that a file created on Linux and moved to an HFS+ Mac might then have 
a different sequence of bytes for its filename, though they are the same 
characters. (Linux also has the problem that two or more files can be created 
in the same directory whose names are different byte sequences representing 
the same characters.) The problem should not happen when moving a file from an 
HFS+ Mac to a Linux machine, since the Linux filesystem will accept the 
sequence of bytes that HFS+ used.
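
To make the composed/decomposed difference concrete, here is a small Python 
sketch. It only illustrates the Unicode side of things using the standard 
unicodedata module; the example name "café" and the helper same_name() are my 
own inventions for illustration, and nothing here touches rsync or a real 
filesystem.

```python
import unicodedata

# The same visible name, "café", written two ways.
composed = "caf\u00e9"      # 'é' as one code point (composed, NFC)
decomposed = "cafe\u0301"   # 'e' plus a combining acute accent (decomposed, NFD)

# As raw strings (and as raw bytes) the two names are different.
print(composed == decomposed)        # False
print(composed.encode("utf-8"))      # b'caf\xc3\xa9'
print(decomposed.encode("utf-8"))    # b'cafe\xcc\x81'


def same_name(a, b):
    """Hypothetical helper: compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# After normalization they compare equal.
print(same_name(composed, decomposed))  # True
```

A tool that compares filenames byte for byte sees two different files here, 
which fits the delete-and-rewrite behavior described in the question.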

APFS changes things again, so maybe you will now see some similar types of 
problems when using HFS+ and APFS together, but I couldn't tell you under what 
conditions or in what way it would manifest or what to do about it. APFS 
certainly seems more complicated, since the behavior can vary based on which OS 
version you used to create the APFS volume and whether the volume is 
case-sensitive or case-insensitive:

https://mjtsai.com/blog/2017/06/27/apfs-native-normalization/
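
If you want to check whether normalization is really what differs between two 
of your disks, something like the following Python sketch might help. It is 
only a diagnostic, not a fix: it lists names that differ as raw strings but 
match once both are normalized to NFC. The function names and the idea of 
comparing os.listdir() output are my own assumptions, and I have not verified 
what each filesystem and OS version hands back from os.listdir().

```python
import os
import unicodedata


def normalization_only_mismatches(names_a, names_b):
    """Return (name_a, name_b) pairs that differ as raw strings
    but are equal after both are normalized to NFC."""
    # Index the second listing by its NFC-normalized names.
    by_nfc = {}
    for name in names_b:
        by_nfc.setdefault(unicodedata.normalize("NFC", name), []).append(name)

    pairs = []
    for name in names_a:
        for other in by_nfc.get(unicodedata.normalize("NFC", name), []):
            if other != name:
                pairs.append((name, other))
    return pairs


def compare_dirs(dir_a, dir_b):
    """Hypothetical convenience wrapper: compare the top level of two directories."""
    return normalization_only_mismatches(os.listdir(dir_a), os.listdir(dir_b))
```

Calling compare_dirs() on the same folder on the HFS+ disk and on the APFS 
disk would show which names, if any, differ only in normalization.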

Here's some info directly from Apple, though it is a "retired" document, so a 
newer version may be available:

https://developer.apple.com/library/archive/documentation/FileManagement/Conceptual/APFS_Guide/FAQ/FAQ.html

