> The fact that these POSIX/ASCII only version properties/constructs
> have been there for years ("compatibility") and it appears that "most"
> developers are happy (habit, performance...) with them, I don't think
> we can and want to switch to the Unicode version, simply for
> conformance.

I agree with you 100.000%. That's why I said that the compatibility
issue merits a separate letter. I have a whole bunch of different ideas
about how to make everybody happy here; perhaps one or more of them
will actually do that. I would never want to change things out from
under people. It would take positive action on their part to get
something to change.

> Name space conflict is really not a big issue (for me anyway) a
> possible solution is to have a prefix "Is" for all Unicode binary
> properties, for example "IsAlpha", "IsLowerCase", the problem we
> have here is to provide the TR#18 compatible version for those
> listed properties, if we want to continue claim tr#18 level 1.

That's why Perl finally bit the bullet and allowed those properties to
take on the tr18 meaning--because it said we had to. This has *not*
made the POSIX-minded people happy. That's why we finally had to (VERY
recently, in the devel version) add a /a or (?a) flag. It's also why we
have "POSIX_" to get back the POSIX senses.

This may matter more in Perl than I think it does in Java, because I
don't know that Java ever does anything POSIXy outside of the C locale;
is this correct? In Perl or C, I'd call setlocale().

> The dis-connection between \b and \w is a headache,

Yes, it really is, isn't it?

Let me bounce some stuff around in my head for a while longer on the
compat issue, because there's a chance a solution to the \b and \w
dichotomy may fall out from that.

Thank you for your time.

--tom
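
To make the "positive action" idea concrete, here is a minimal sketch in Java. It assumes Java 7 or later, where java.util.regex.Pattern has a UNICODE_CHARACTER_CLASS flag and accepts "Is"-prefixed binary properties such as \p{IsAlphabetic}. The class and method names (OptInDemo, asciiWord, and so on) are made up for illustration, and this is only one possible shape for an opt-in, not necessarily what this thread settles on.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only Pattern and its flag come from the JDK.
class OptInDemo {
    // Default compile: \w keeps its long-standing ASCII sense, [a-zA-Z_0-9].
    static final Pattern ASCII_WORD = Pattern.compile("\\w");

    // Same pattern text, but the caller opts in to Unicode semantics.
    static final Pattern UNICODE_WORD =
            Pattern.compile("\\w", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean asciiWord(String s) {
        return ASCII_WORD.matcher(s).matches();
    }

    static boolean unicodeWord(String s) {
        return UNICODE_WORD.matcher(s).matches();
    }

    // POSIX-style name: ASCII letters only unless the flag is set.
    static boolean posixAlpha(String s) {
        return Pattern.matches("\\p{Alpha}", s);
    }

    // "Is"-prefixed binary property: always the Unicode Alphabetic property.
    static boolean unicodeAlpha(String s) {
        return Pattern.matches("\\p{IsAlphabetic}", s);
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println("default \\w:        " + asciiWord(eAcute));
        System.out.println("opt-in \\w:         " + unicodeWord(eAcute));
        System.out.println("\\p{Alpha}:         " + posixAlpha(eAcute));
        System.out.println("\\p{IsAlphabetic}:  " + unicodeAlpha(eAcute));
    }
}
```

Nothing changes for existing patterns unless the flag is passed, and the "Is" prefix keeps the Unicode property out of the POSIX name space. The sketch deliberately leaves \b alone, since how it lines up with \w is the open question above.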