Joel Rees said:
> Now, what would you do with this?
> 
> ジョエル
> 
> Why not decompose it to the following?
> 
> ジョエル

Because that is not what Unicode normalization does.

> I know what the Unicode rules say, but my boss says, if I'm going to
> play with file names, he wants it done his way.

So now you suggest that enforcing a local filename policy is a bad idea
because the local filename policy might not be sane.  OK.

First, let's decouple the NFD suggestion from local policy.  Again,
there are no problems with NFD here.  I don't really see any sense in a
local policy that demands this conversion, but if your boss needs it, it
is none of my business.  I don't get why you mention it, though: it is a
completely unrelated problem.

> You have to keep rules about making file names for internal use
> separate from rules about storing filenames received, or the internal
> system loses its meaning.

Are you speaking of normalization or of local policy now?  At any rate,
any incoming file has a name, which is encoded somehow.  It may be
encoded in UTF-16LE, for example.  Now, either you store a filename that
you can't read without iconv or a similar tool, or you convert the name
to your locale's encoding.  Even if your locale happens to use UTF-8,
you still have to convert one byte sequence to another.  The conversion
I proposed is destructive, but it preserves Unicode canonical
equivalence, so apart from a subtle technical detail (the choice of
canonical form) the set of glyphs that makes up the filename remains
exactly the same.  This is not even a policy, just a consistent
representation.
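
To make the conversion concrete, here is a minimal sketch in Python.  It
assumes the incoming name arrives as UTF-16LE bytes and that the local
encoding is UTF-8; the function name to_local_name and its
source_encoding parameter are hypothetical, made up for illustration.

```python
import unicodedata


def to_local_name(raw: bytes, source_encoding: str = "utf-16-le") -> bytes:
    """Decode an incoming filename, normalize it to NFD, re-encode as UTF-8.

    Hypothetical helper: the name and the default encoding are assumptions
    for this sketch, not part of any existing tool.
    """
    text = raw.decode(source_encoding)
    return unicodedata.normalize("NFD", text).encode("utf-8")


# The precomposed katakana name, written with escapes so the code points
# are unambiguous: U+30B8 U+30E7 U+30A8 U+30EB.
original = "\u30b8\u30e7\u30a8\u30eb"
incoming = original.encode("utf-16-le")

stored = to_local_name(incoming)
stored_text = stored.decode("utf-8")

# NFD splits U+30B8 into U+30B7 plus the combining mark U+3099, so the
# stored code points differ from the original ones...
differs = stored_text != original

# ...but the two strings are canonically equivalent: recomposing with NFC
# gives back the original sequence.
equivalent = unicodedata.normalize("NFC", stored_text) == original
```

Both differs and equivalent come out True here: the stored bytes are not
the bytes that were received, yet the name is the same under canonical
equivalence.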

-- 
Dmitrij D. Czarkoff
