At 10:04 AM -0400 5/6/08, Jeffrey Altman wrote:
Whatever we do we are going to have an interop problem on MacOS X but since upgrading MacOS X clients is so much easier to do than other platforms I will suggest that we bite the bullet there.

Proposal:

1. MacOS X and Linux clients begin to apply NFC to all UTF-8 strings obtained from the operating system whether for directory lookup, object creation, or symlink target creation.

2. Implement NFC conversion in the Salvager. This will apply to all names in directories and will require that directory hash tables be fixed when a name is changed to NFC. It will also have to apply to symlink targets.

3. In the File Server, apply NFC conversion to the names provided in CreateFile, Link and Symlink RPCs as well as the targets in the Link and Symlink RPCs.

The real problem with this problem is that once the new file server is deployed and the salvager is run against the volumes the existing MacOS X clients will fail to be able to read any files in AFS. If anyone has an idea of how to address the Unicode normalization problem going forward that doesn't result in an interop failure for existing clients, please say something.
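To make the failure mode in the quoted proposal concrete, here is a minimal sketch in Python using the standard `unicodedata` module. It is not OpenAFS code: the dictionary stands in for a directory hash table, the name `cafe.txt` with an accent and the value `"vnode 42"` are made up for illustration, and it assumes an existing MacOS X client sends decomposed (NFD-style) names.

```python
import unicodedata

# The same visible name in two forms: precomposed "é" (NFC) and
# "e" followed by a combining acute accent (NFD).
name_nfc = "caf\u00e9.txt"
name_nfd = "cafe\u0301.txt"

# They look identical on screen but are different byte strings.
bytes_nfc = name_nfc.encode("utf-8")
bytes_nfd = name_nfd.encode("utf-8")


def to_nfc(raw: bytes) -> bytes:
    """Return the NFC form of a UTF-8 encoded name."""
    return unicodedata.normalize("NFC", raw.decode("utf-8")).encode("utf-8")


# Toy directory keyed by raw bytes (hypothetical stand-in for a directory
# hash table). The entry was created by a client that sent a decomposed name.
directory = {bytes_nfd: "vnode 42"}

# Before any conversion, a bytewise lookup with the NFC form misses.
lookup_before = directory.get(bytes_nfc)

# "Salvager" step from the proposal: rewrite every name to NFC, which
# also means rebuilding the table because the keys change.
salvaged = {to_nfc(name): vnode for name, vnode in directory.items()}

# A new client that normalizes to NFC now finds the file ...
lookup_new_client = salvaged.get(to_nfc(bytes_nfd))

# ... but an old client still sending the decomposed bytes does not.
lookup_old_client = salvaged.get(bytes_nfd)
```

Under those assumptions, `lookup_new_client` finds the entry while `lookup_old_client` comes back empty, which is the interop failure described above.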
I was part of a big transition wrt character sets many years ago (on an operating system far far away), and I appreciate this transition is going to be a headache. But it's also obvious that a global resource (such as AFS) has to have some globally-consistent definition for what a filename is.

I have a vague idea that this *might* be handled by having the AFS server store an additional byte of info about names. I'm not sure if that byte should be per-filename, or per-directory. (I think there are problems with the idea either way). That extra byte would indicate if the stored filenames were NFC-normalized, or "just left alone", or perhaps some other format. The byte would either apply to a single filename, or be added to the directory-info to describe the encoding for all filenames within that directory.

*New* AFS clients could explicitly say "I want to work with NFC-normalized names", or for that matter, that they want the server to continue to leave the filenames alone and not normalize them. Thus, old clients would never send the request for NFC-normalized names, and the server could make decisions based on that.

I realize this idea might cause more problems than it solves, but I thought I'd toss it into the ring, and see what more-experienced minds might think about it.

-- 
Garance Alistair Drosehn            =  [EMAIL PROTECTED]
Senior Systems Programmer           or [EMAIL PROTECTED]
Rensselaer Polytechnic Institute    or [EMAIL PROTECTED]
_______________________________________________
OpenAFS-devel mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-devel
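A rough sketch of the per-filename variant of the extra-byte idea, in Python for brevity. Everything here is hypothetical: `NameForm`, `DirEntry`, `store_name`, `matches` and the `client_wants_nfc` flag are names invented for illustration and do not correspond to any existing AFS structure or RPC. It also assumes names are valid UTF-8; anything else would raise on decode.

```python
import unicodedata
from dataclasses import dataclass
from enum import IntEnum


class NameForm(IntEnum):
    """Hypothetical values for the extra byte stored with a name."""
    RAW = 0  # "just left alone"
    NFC = 1  # stored NFC-normalized


@dataclass
class DirEntry:
    name: bytes     # UTF-8 bytes as stored by the server
    form: NameForm  # the extra byte of info about this name


def nfc(raw: bytes) -> bytes:
    """Return the NFC form of a UTF-8 encoded name."""
    return unicodedata.normalize("NFC", raw.decode("utf-8")).encode("utf-8")


def store_name(raw: bytes, client_wants_nfc: bool) -> DirEntry:
    """Create an entry; only clients that ask for NFC get normalization."""
    if client_wants_nfc:
        return DirEntry(nfc(raw), NameForm.NFC)
    return DirEntry(raw, NameForm.RAW)


def matches(entry: DirEntry, requested: bytes, client_wants_nfc: bool) -> bool:
    """Decide whether a lookup for `requested` hits `entry`."""
    if not client_wants_nfc:
        # Old client: the server leaves names alone and compares bytes.
        return entry.name == requested
    # New client: compare in NFC, normalizing a RAW entry on the fly.
    stored = entry.name if entry.form is NameForm.NFC else nfc(entry.name)
    return stored == nfc(requested)


decomposed = "cafe\u0301".encode("utf-8")   # "e" + combining accent
precomposed = "caf\u00e9".encode("utf-8")   # single code point

old_entry = store_name(decomposed, client_wants_nfc=False)
new_entry = store_name(decomposed, client_wants_nfc=True)

new_client_finds_old_entry = matches(old_entry, precomposed, True)
old_client_finds_old_entry = matches(old_entry, decomposed, False)
old_client_finds_new_entry = matches(new_entry, decomposed, False)
```

In this sketch old clients keep working against names they created, and new clients can find those names too, but an old client still cannot find a name that a new client stored in NFC. That may be one of the problems the idea has either way.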
