Nikolay Kuznetsov (JIRA) wrote:
[ http://issues.apache.org/jira/browse/HARMONY-688?page=comments#action_12418290 ]
Nikolay Kuznetsov commented on HARMONY-688:
-------------------------------------------
Yes, we do not support supplementary characters. The main reason was that
such support breaks the quantifier optimizations over character classes of
fixed length (we support length 1 :-)). Now I think I can support two different
types of character classes: one of fixed width (1 or 2), and a second of
unknown width (1 or 2; \\p{javaLowerCase}, for instance).
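To illustrate the width point above, here is a minimal sketch using only standard java.lang.Character and String methods. The class name SupplementaryWidth and the helper width() are made up for this illustration and are not part of Harmony or of the regex implementation. It shows that a supplementary character such as U+10428 occupies two chars, and that a class like javaLowerCase can only match it when tested per code point.

```java
public class SupplementaryWidth {
    // Hypothetical helper: number of UTF-16 code units a code point occupies
    // (1 for BMP characters, 2 for supplementary characters).
    static int width(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        // U+10428 (a lower-case Deseret letter) written as a surrogate pair.
        String s = "\uD801\uDC28";
        int cp = s.codePointAt(0);

        System.out.println(Integer.toHexString(cp));            // 10428
        System.out.println(width('a'));                         // 1
        System.out.println(width(cp));                          // 2
        // Tested as a whole code point, the character is lower case...
        System.out.println(Character.isLowerCase(cp));          // true
        // ...but the high surrogate alone is not.
        System.out.println(Character.isLowerCase(s.charAt(0))); // false
    }
}
```

A class whose members all have the same width keeps the fixed-length quantifier optimization; a class mixing widths 1 and 2 would need the "unknown width" path.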
Great! I'm looking forward to this feature. Thanks a lot. ;-)
BTW, am I right that, if we do not take Unicode normalization support into account, this problem affects only the behaviour of character classes and ranges?
Yes, I think so.
In all other cases it is impossible to construct a pattern that works
incorrectly. If that is not so, could you please give me an example?
I'm not sure. At least, I cannot give an example. ;-)
Thanks.
Nik.
java.util.regex.Matcher does not support Unicode supplementary characters
-------------------------------------------------------------------------
Key: HARMONY-688
URL: http://issues.apache.org/jira/browse/HARMONY-688
Project: Harmony
Type: Bug
Components: Classlib
Reporter: Richard Liang
Hello Nikolay,
The following test case passes on the RI but fails on Harmony. Would you please
have a look at this issue? Thanks a lot.
    public void test_matcher() {
        Pattern p = Pattern.compile("\\p{javaLowerCase}");
        Matcher matcher = p.matcher("\uD801\uDC28");
        assertTrue(matcher.find());
    }
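
For trying the same check outside a JUnit harness, a self-contained sketch might look like the following. The class name SupplementaryFind and the method findsLowerCase() are hypothetical names chosen for this example; only java.util.regex.Pattern and Matcher come from the report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SupplementaryFind {
    // Hypothetical helper: reports whether the input contains a character
    // that the javaLowerCase class matches.
    static boolean findsLowerCase(String input) {
        Pattern p = Pattern.compile("\\p{javaLowerCase}");
        Matcher matcher = p.matcher(input);
        return matcher.find();
    }

    public static void main(String[] args) {
        // U+10428 encoded as a surrogate pair, as in the reported test case.
        String supplementary = "\uD801\uDC28";
        // Expected to be true on an implementation that matches by code point.
        System.out.println(findsLowerCase(supplementary));
    }
}
```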
Best regards,
Richard
--
Richard Liang
China Software Development Lab, IBM