Sorry for not responding sooner. The \p stuff in Perl is for matching
UTF-8 encoded characters. In Java, your characters are always going to
be in Unicode (e.g., if you read them from a file, the appropriate stream
class converts them). Therefore, the \p feature has no meaning for
regular expressions in Java and we don't need to implement it.
Could you remove the \p stuff and __parseUtf8Perl() stuff from your
patch and resubmit it? That is, unless anyone can make a convincing
case for why \p is necessary in Java. I may be missing something.
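
To illustrate what I mean by the stream class doing the conversion, here
is a minimal sketch. It assumes the input is UTF-8 and uses a byte array
in place of a file; the DecodeDemo class and decode() method are made-up
names for this example, not anything in the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Hypothetical helper: decodes UTF-8 bytes into Java chars via a Reader.
    static String decode(byte[] utf8) throws IOException {
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            StringBuilder sb = new StringBuilder();
            int c;
            // read() returns one decoded char at a time, or -1 at end of input
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Two bytes: the UTF-8 encoding of U+00E9
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA9};
        String s = decode(utf8);
        System.out.println(s.length());        // prints 1
        System.out.println((int) s.charAt(0)); // prints 233
    }
}
```

By the time a string reaches the matcher, the two encoded bytes are
already a single char, so the matcher never sees the UTF-8 encoding.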
thanks,
daniel