On Sun, Nov 30, 2014 at 5:48 AM, Dmitrij D. Czarkoff <czark...@gmail.com> wrote:
> Ingo Schwarze said:
>> While the article is old, the essence of what Schneier said here
>> still stands, and it is not likely to fall in the future:
>>
>> https://www.schneier.com/crypto-gram-0007.html#9
>
> Sorry, but this article is mostly based on lack of understanding of
> Unicode.
Sometimes I have found myself wondering whether Bruce Schneier's lack of
erudition is studied. At any rate, I've found that, when he says "I see
smoke," there is often fire somewhere in the vicinity.

>> that would directly run contrary to some of OpenBSD's most important
>> project goals: Correctness, simplicity, security.
>
> Yes, Unicode is very complex. Just complex enough that there is (to my
> knowledge) no single application that does it right in every aspect.

Considering that making a universal character encoding scheme is, in and
of itself, a self-contradictory project, they've done moderately well, I
think.

> That said, the standard provides just enough facilities to make
> filesystem-related aspects of Unicode work nicely, particularly in the
> case of UTF-8. E.g. the ability to enforce NFD for all operations on
> file names could actually make several things more secure by
> preventing homograph attacks.

I think this assertion is a bit optimistic, and not just given your
following caveat.

> Unfortunately, there is no realistic hope that NFD will be enforced by
> every OS and filesystem out there any time soon, so at this stage file
> names with bytes outside the printable ASCII range will cause problems
> at some point. On my systems I limit file names to the
> [0-9A-Za-z~._/-] range.

Warning! Rambling ahead:

And now I find myself bemused again by my own regular tendency to be
confused by the conflation of the file name database with more
general-purpose database indexes.

Fifteen years ago, I said to someone that the useful life of the
current encoding scheme in Unicode was about twenty-five years, and
that they/we should be looking for good ways to restructure it. I had
trouble then figuring out a way to disentangle the various
requirements, and I still don't see a clear way to do it.
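As an aside, the normalization and whitelist points above can be
sketched in a few lines of Python. This is only an illustration: the
file names are invented, and the regex is just one way to express the
[0-9A-Za-z~._/-] policy mentioned in the quoted mail.

```python
import re
import unicodedata

# The same visible name, "café.txt", as two distinct byte strings:
# precomposed U+00E9 (NFC) vs. "e" plus combining acute U+0301 (NFD).
# This is the ambiguity behind the homograph concern.
nfc_name = "caf\u00e9.txt"
nfd_name = "cafe\u0301.txt"

print(nfc_name == nfd_name)   # False: they look alike but differ byte-wise
print(unicodedata.normalize("NFD", nfc_name) == nfd_name)  # True once normalized

# A conservative whitelist like the one described above sidesteps the
# problem entirely by rejecting anything outside printable ASCII.
SAFE = re.compile(r'^[0-9A-Za-z~._/-]+$')
print(bool(SAFE.match(nfc_name)))          # False: é is outside the range
print(bool(SAFE.match("notes-2014.txt")))  # True
```

Enforcing NFD at the filesystem boundary would collapse the first two
names into one; the whitelist approach simply refuses to let the
ambiguity in.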
But I'm inclined to think the original idea of a 16-bit encoding, while
not correctly seeing the reality of the actual characters in use, was
almost seeing the requirements of the system correctly. I think we need
an "international" encoding that uses a restricted subset of the actual
characters in use, and a structure that allows for simpler parsing of
the international-encoding part.

(And from here my thoughts get even less coherent. Sorry for the
interruption.)

--
Joel Rees

Be careful when you look at conspiracy.
Look first in your own heart,
and ask yourself if you are not your own worst enemy.
Arm yourself with knowledge of yourself, as well.