Sherman wrote:

> The CR# so far I have are
>
> 7014645: Support Perl style Unicode hex notation \x{...}
> 7014633: Support loose matching for both abbreviated and longer names of
>          Unicode property
> 7014640: Add meta character for line ending '\R'
>
> It might take a couple days(?) for these CR# to show up on the website.

So it appears; they aren't there yet. However, I see now that some of the bugs I submitted last December *did* make it into the database. The first one shows that they've accepted that the \b vs \w thing is a bug. I can't see how to fix that without bringing one into alignment with the other, but maybe there's a way I'm not thinking of.

    7006289: java.util.regex yields nonsense by breaking the connection between \b and \w
        Category     java:classes_util
        State        1-Dispatched, bug
        Priority:    4-Low
        Submit Date  12-DEC-2010

    7006291: Java claims to support Unicode properties, but does not
        Category     java:classes_util
        State        1-Dispatched, bug
        Priority:    4-Low
        Submit Date  12-DEC-2010

> Still need some time to scope/categorize those Unicode properties
> support issues, will post/send you the CR# when I have them and we can
> then discuss what we can do to address those issues going forward.

Perhaps there could be two RFEs, one for implementing the list of properties required for RL1.2, and the other for implementing the remaining properties defined in the various UCD *.txt files that you don't currently consider.

However, I do not know that a partial solution will work well for these. For one thing, some of the properties you need for the first rely on other underlying properties. For another, to implement \X in the required(ish) sense of an Extended Grapheme Cluster instead of as a Legacy Grapheme Cluster, you need access to the properties that come out of the HangulSyllableType.txt UCD file.
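To give a rough idea of what the Hangul data amounts to, here is a minimal Java sketch. It is not how Perl or the JDK does it. The class and method names are hypothetical, and the code point ranges are hard-coded from my reading of recent versions of HangulSyllableType.txt, so check them against the UCD before relying on them. It derives the Hangul syllable type of a code point and applies the Hangul no-break rules (GB6 through GB8 of UAX #29) that an Extended Grapheme Cluster match has to honor.

```java
// Sketch only: a hypothetical helper, not part of java.util.regex or Perl.
final class HangulSyllableTypeSketch {

    enum Type { L, V, T, LV, LVT, NA }

    // Ranges hard-coded from HangulSyllableType.txt (an assumption; verify
    // against the UCD version you actually target).
    static Type typeOf(int cp) {
        if ((cp >= 0x1100 && cp <= 0x115F) || (cp >= 0xA960 && cp <= 0xA97C)) {
            return Type.L;   // leading consonant jamo
        }
        if ((cp >= 0x1160 && cp <= 0x11A7) || (cp >= 0xD7B0 && cp <= 0xD7C6)) {
            return Type.V;   // vowel jamo
        }
        if ((cp >= 0x11A8 && cp <= 0x11FF) || (cp >= 0xD7CB && cp <= 0xD7FB)) {
            return Type.T;   // trailing consonant jamo
        }
        if (cp >= 0xAC00 && cp <= 0xD7A3) {
            // Precomposed syllables: every 28th one has no trailing consonant.
            return ((cp - 0xAC00) % 28 == 0) ? Type.LV : Type.LVT;
        }
        return Type.NA;      // not a Hangul jamo or syllable
    }

    // True when UAX #29 rules GB6-GB8 forbid a cluster break between a and b.
    // Other rules (combining marks, CR LF, and so on) are not handled here.
    static boolean noBreakBetween(int a, int b) {
        Type x = typeOf(a);
        Type y = typeOf(b);
        switch (x) {
            case L:
                return y == Type.L || y == Type.V || y == Type.LV || y == Type.LVT;
            case LV:
            case V:
                return y == Type.V || y == Type.T;
            case LVT:
            case T:
                return y == Type.T;
            default:
                return false;
        }
    }
}
```

For instance, noBreakBetween(0x1100, 0x1161) is true, since a leading consonant followed by a vowel jamo stays in one cluster, while noBreakBetween(0xAC01, 0x1161) is false, since an LVT syllable followed by a vowel starts a new one.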
If you have access to a "recent" source build of Perl (5.12 or better), you can see how the logic for \X is carried out during regex execution by looking around line 3873 and after of regexec.c, which reads

    case CLUMP: /* Match \X: logical Unicode character.  This is defined as

Hope this helps.

--tom