On Thu, 2007-11-15 at 09:54 -0500, C. Scott Ananian wrote:
> Joyride-277 doesn't validate, because it contains a file from the
> library with a filename in non-normalized unicode. The file is named
> 'Annobo?n_Bioko-thumb.jpg', where the ? should be a separated accent
> on the o, but it is actually stored on the filename with a combined
> 'o+accent' glyph.
>
> Now, at first blush this is a bug in the (fast) contents verifier,
> which I will fix: all strings should be unicode-normalized before they
> are compared. But it seems like this raises issues with (for example)
> URLs to library content. Should we enforce the constraint that all
> filenames are unicode-normalized on disk, so that we can guarantee
> that a (unicode-normalized) URL will always resolve correctly?

Everything on disk should be UTF-8. Anything that's not UTF-8 will not
be guaranteed to work. Filenames need to be converted to UTF-8 before
the file is opened/created/renamed/etc.

Dan

> Otherwise we run the risk of someone editing a file and resaving it
> with a name which *appears* identical, but is actually encoded
> differently on disk, and having URLs to the file mysteriously break.
>
> For the technically-minded, we're talking about using the UTF-8
> encoding of Unicode Normalization Form D, as discussed (briefly) at
> http://wiki.laptop.org/go/Canonical_JSON. The problem has arisen
> because the old libraries used normalized filenames, but we've
> switched to installing the libraries from RPMs, and apparently
> non-normalized filenames have snuck in. If I were to hazard a guess,
> I'd say that the tar command normalizes filenames as they are
> archived, while RPM does not.
>
> My proposal is to ensure that all filenames in the base system (at
> least) are in normalization form D. I will write a checker in the
> build process to ensure this, and we should probably eventually write
> checkers for the activity/library bundle tools that will do the same.
> --scott
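
For illustration only, here is a rough Python sketch of the two pieces discussed in this thread: comparing names after normalizing both sides to NFD, and walking a tree to report names that are not already in NFD. The function names (to_nfd, same_name, find_non_nfd) and the sample name are made up for this example; this is not the actual contents verifier or build checker, just one way such a check could look.

```python
import os
import unicodedata


def to_nfd(name):
    """Return name in Unicode Normalization Form D (decomposed)."""
    return unicodedata.normalize("NFD", name)


def same_name(a, b):
    """Compare two names after normalizing both sides to NFD."""
    return to_nfd(a) == to_nfd(b)


def find_non_nfd(root):
    """Hypothetical checker: list paths under root whose final
    component is not already in NFD."""
    offenders = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if name != to_nfd(name):
                offenders.append(os.path.join(dirpath, name))
    return sorted(offenders)


if __name__ == "__main__":
    composed = "caf\u00e9.jpg"      # single precomposed e-acute code point
    decomposed = "cafe\u0301.jpg"   # 'e' followed by a combining acute accent
    print(composed == decomposed)          # raw comparison of the two strings
    print(same_name(composed, decomposed)) # comparison after NFD normalization
    for path in find_non_nfd("."):
        print("not NFD:", path)
```

The sketch works on decoded strings, so it assumes the names on disk are valid UTF-8 to begin with; writing the UTF-8 encoding of the NFD string back to disk would be a separate rename step.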
