On 23 Mar 2017, at 17:57, Ed Wynne <ar...@phasic.com> wrote:
> 
>> Shouldn’t the VFS layer actually be doing this? It is part of its whole 
>> raison d’être, no? Just have -[NSURL fileSystemRepresentation] normalize 
>> things according to the correct Unicode rules, and let the VFS layer 
>> translate that to HFS+’s normalization style when dealing with HFS+.
> 
> Yes, this.
> 
> Having the conversion only available up in the Cocoa layer is an incredibly 
> poor choice. It effectively means nothing at the BSD layer will be able to 
> properly normalize file names. Having it at the VFS layer is the most sane 
> option, even with the problems that causes.

It can’t really take place at the VFS layer, because the appropriate 
normalisation is filesystem-specific: some filesystems don’t normalise, others 
do, and the exact rules differ between those that do.
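
To make the difference concrete, here is a minimal sketch (Python, standard 
library only) showing that one visible name has two different byte sequences 
depending on the normalisation form. The file name is just an example; note 
also that HFS+ uses its own variant of decomposition rather than strict NFD, 
so this only approximates what it stores.

```python
import unicodedata

# "café.txt" written with a precomposed é (U+00E9); an illustrative name only.
name = "caf\u00e9.txt"

nfc = unicodedata.normalize("NFC", name)  # composed form
nfd = unicodedata.normalize("NFD", name)  # decomposed form: e + U+0301

print(nfc.encode("utf-8"))  # b'caf\xc3\xa9.txt'
print(nfd.encode("utf-8"))  # b'cafe\xcc\x81.txt'
print(nfc == nfd)           # False: a byte-exact filesystem sees two names
```

A filesystem that doesn’t normalise treats these as two distinct names, while 
one that does will map both to whichever form it stores.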

It *could* take place in the filesystem driver, as happens currently for HFS+.  
The problem with that is that while your software will work fine on HFS+, it 
might break if given a different filesystem to run on, which is kind of what 
this thread is all about, no?  (And we already had similar problems with 
case-sensitive HFS+ too, which usually breaks certain big brand-name 
applications.)  I have to say I’m generally in favour of APFS 
normalising Unicode names, but I can understand that there are reasons the APFS 
team might have decided not to (it’s really up to them to elucidate what those 
reasons were).

This is a rather horrible area of filesystem work, made worse by the fact that 
many historic filesystems don’t even bother storing what character encoding was 
used.  Indeed, on such systems it’s even possible that users will have used 
different encodings in different directories :-(

Clearly, encoding detailed knowledge of appropriate normalisation on a 
per-filesystem basis in end-user applications is not a sensible approach here.  
Apple suggesting that we normalise filenames before passing them to the BSD 
layer wouldn’t be the end of the world, but it might result in some 
applications not being able to cope with some otherwise valid filenames because 
the name on disk differs from the chosen normalisation.
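
As a rough illustration of that failure mode, the sketch below simulates a 
filesystem that stores names exactly as given. The `on_disk` set and the 
`exact_lookup` function are stand-ins I’ve made up for the example, not real 
APIs.

```python
import unicodedata

# Names as stored by a hypothetical filesystem that preserves names exactly.
# This one happens to have been created in decomposed (NFD) form.
on_disk = {unicodedata.normalize("NFD", "r\u00e9sum\u00e9.txt")}


def exact_lookup(name):
    # Stand-in for a byte-exact directory lookup; not a real API.
    return name in on_disk


# The name exactly as a directory listing would return it.
listed = next(iter(on_disk))

# The same name after an application "helpfully" normalises it to NFC.
pre_normalised = unicodedata.normalize("NFC", listed)

print(exact_lookup(listed))          # True
print(exact_lookup(pre_normalised))  # False: a valid file is now unreachable
```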

Another option might be to add some flags to the BSD open() API (for instance, 
O_UNICODE and O_CASEFOLD) that cause it to use a Unicode-aware comparison 
routine inside the filesystem implementation, the idea being that it will open 
a file with the exact name passed if it exists, or, if that file doesn’t exist, 
it will enumerate the containing directory looking for one that matches.  
Sadly, this enumeration would need to be recursive, i.e. repeated for every 
component of the path (since any directory name along the way might have the 
same problem).  The Foundation framework could then use the new flags to 
obtain reasonable behaviour.
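
For what it’s worth, here is a user-space sketch of the lookup I have in mind, 
in Python for brevity. O_UNICODE and O_CASEFOLD don’t exist; `resolve_path` 
and its `casefold` parameter are hypothetical names standing in for them, and 
a real implementation would live inside the filesystem rather than listing 
directories from outside. The comparison key (NFD, optionally followed by case 
folding) is my assumption about what "Unicode-aware comparison" would mean.

```python
import os
import unicodedata


def _key(name, casefold=False):
    # Comparison key: canonical decomposition, optionally with case folding
    # (re-normalised afterwards, since folding can produce composed forms).
    key = unicodedata.normalize("NFD", name)
    if casefold:
        key = unicodedata.normalize("NFD", key.casefold())
    return key


def resolve_path(path, casefold=False):
    """Return an on-disk path matching `path` component by component,
    or None if some component has no match."""
    drive, rest = os.path.splitdrive(os.path.abspath(path))
    resolved = drive + os.sep
    for component in [c for c in rest.split(os.sep) if c]:
        candidate = os.path.join(resolved, component)
        if os.path.lexists(candidate):
            # The exact name exists: prefer it, as the proposal requires.
            resolved = candidate
            continue
        # Otherwise enumerate the containing directory for an equivalent name.
        wanted = _key(component, casefold)
        try:
            entries = os.listdir(resolved)
        except OSError:
            return None
        matches = sorted(e for e in entries if _key(e, casefold) == wanted)
        if not matches:
            return None
        # Several names may be equivalent; picking the first is arbitrary.
        resolved = os.path.join(resolved, matches[0])
    return resolved
```

Usage would be something like `open(resolve_path(p) or p)`. Note the cost: one 
directory enumeration per mismatched component, plus the ambiguity when two 
entries compare equal, which is part of why this belongs in the filesystem 
implementation rather than in every application.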

Kind regards,

Alastair.

--
http://alastairs-place.net

