Regular Expressions starting with "." in non-XML Schema mode use wrong first
character optimization
---------------------------------------------------------------------------------------------------
Key: XERCESJ-1126
URL: http://issues.apache.org/jira/browse/XERCESJ-1126
Project: Xerces2-J
Type: Bug
Versions: 2.7.1
Reporter: Martin Probst
The following Java snippet prints "not matched", but should print "matched":

    import org.apache.xerces.impl.xpath.regex.RegularExpression;

    RegularExpression regex = new RegularExpression(".oo", "");
    if (regex.matches("foo")) System.out.println("matched");
    else System.out.println("not matched");

It uses the class org.apache.xerces.impl.xpath.regex.RegularExpression. I
believe this happens because the first character optimization kicks in, checks
for a first character of 'o', fails to match the 'f', and consequently returns
false. This may be caused by this code snippet from Token.java:493:

    case DOT: // ****
        if (isSet(options, RegularExpression.SINGLE_LINE)) {
            return FC_CONTINUE; // **** We can not optimize.
        } else {
            return FC_CONTINUE;
            /*
             * result.addRange(0, RegularExpression.LINE_FEED-1);
             * result.addRange(RegularExpression.LINE_FEED+1,
             *                 RegularExpression.CARRIAGE_RETURN-1);
             * result.addRange(RegularExpression.CARRIAGE_RETURN+1,
             *                 RegularExpression.LINE_SEPARATOR-1);
             * result.addRange(RegularExpression.PARAGRAPH_SEPARATOR+1, UTF16_MAX);
             * return 1;
             */
        }

I think it should unconditionally return FC_ANY for DOT, at least when the
pattern starts with '.'.
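
To illustrate the suspected mechanism, here is a minimal toy model of a first-character prefilter. It is not Xerces code: the class FirstCharSketch, the enum Fc, and both methods are hypothetical names made up for this sketch, and it only handles literal characters and '.'. It assumes that a "continue" result means "skip this token and take the first character from the next one", which is what the behaviour described above suggests.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a first-character prefilter; hypothetical, not Xerces code. */
public class FirstCharSketch {
    // Hypothetical stand-ins for the FC_* result codes named in the report.
    enum Fc { TERMINAL, CONTINUE, ANY }

    /**
     * Collects into firstChars the characters a match may start with.
     * ANY means no useful set exists, so the prefilter must be skipped.
     */
    public static Fc analyze(String pattern, boolean dotIsAny, Set<Character> firstChars) {
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '.') {
                if (dotIsAny) {
                    return Fc.ANY; // proposed behaviour: a leading dot disables the prefilter
                }
                continue;          // reported behaviour: fall through to the next token
            }
            firstChars.add(c);     // a literal fixes the first character
            return Fc.TERMINAL;
        }
        return Fc.CONTINUE;        // nothing in the pattern constrained the first character
    }

    /** Judging by the first character only, could the input possibly match? */
    public static boolean passesPrefilter(String pattern, String input, boolean dotIsAny) {
        Set<Character> firstChars = new HashSet<>();
        Fc result = analyze(pattern, dotIsAny, firstChars);
        if (result != Fc.TERMINAL) {
            return true;           // no usable set: do not filter anything out
        }
        return !input.isEmpty() && firstChars.contains(input.charAt(0));
    }

    public static void main(String[] args) {
        // "continue" on '.' picks up 'o' as the required first character and rejects "foo".
        System.out.println(passesPrefilter(".oo", "foo", false)); // false
        // "any" on '.' skips the prefilter, so "foo" reaches the real matcher.
        System.out.println(passesPrefilter(".oo", "foo", true));  // true
    }
}
```

In this model the two results differ only for patterns that begin with '.', which is consistent with the failure being limited to such patterns.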