[ http://issues.apache.org/jira/browse/XERCESJ-1061?page=comments#action_12438019 ]

Michael Glavassevich commented on XERCESJ-1061:
-----------------------------------------------

> Ok. This statement has more lines in it than the fix... Is there a
> way I can avoid this in future?

Yes. Fax in a CLA to Apache: http://www.apache.org/licenses/icla.txt. This would cover contributions to Xerces and any other Apache project.

> a) Name: Chris Carman
> Employer: not applicable, done on my own time

I still need an answer here. If you'd rather not share where you work in public you can fax a CLA to Apache instead.

> Regex "$" and "^" characters treated as special chars in conflict with XML Schema spec
> --------------------------------------------------------------------------------------
>
> Key: XERCESJ-1061
> URL: http://issues.apache.org/jira/browse/XERCESJ-1061
> Project: Xerces2-J
> Issue Type: Bug
> Components: XML Schema datatypes
> Affects Versions: 2.6.2
> Environment: Test Environment: Win XP SP1, JDK v1.5.0_02, Xerces v2.6.2 (manually used; overrides any other, if packaged with the JDK)
> Reporter: Darien Kindlund
> Priority: Minor
> Attachments: RegexParser.diff, regexparser.java
>
>
> Xerces rejects the following schema:
> <xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
>   <xs:element name="test">
>     <xs:simpleType>
>       <xs:restriction base="xs:string">
>         <xs:pattern value="$?[0-9]+\.[0-9]{2}" />
>       </xs:restriction>
>     </xs:simpleType>
>   </xs:element>
> </xs:schema>
> The code within org.apache.xerces.impl.xpath.regex.RegexParser throws a
> parser exception over the use of the "$?" characters, unless the "$"
> character is escaped.
> For example, this works:
> <xs:pattern value="\$?[0-9]+\.[0-9]{2}" />
> The fundamental problem is that the Xerces RegexParser code does NOT follow
> the XML Schema specification, as defined by this URL:
> http://www.w3.org/TR/2000/WD-xmlschema-2-20000407/#dt-metac
> Specifically, the XML Schema specification does NOT give special meaning to
> the "$" and "^" characters, whereas the RegexParser code seems to indicate
> that these characters have the normal, standard UNIX definitions of
> "end-of-line" and "start-of-line" anchors respectively.
> Regards,
> --
> Darien Kindlund
> The MITRE Corporation
> InfoSec Engr / Scientist, Sr.
> [EMAIL PROTECTED]

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
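
To illustrate the behaviour the report asks for, here is a minimal sketch in Java. It is not the attached RegexParser.diff and does not touch Xerces at all: it uses java.util.regex as a stand-in engine in which "$" and "^" are anchors, and a hypothetical helper (escapeAnchors, a name made up for this example) that escapes those two characters outside character classes so they match literally, which is how the report reads the XML Schema spec. It is not a full XML Schema pattern translator; constructs such as character class subtraction are out of scope.

```java
import java.util.regex.Pattern;

class XsdPatternSketch {

    // Hypothetical helper: backslash-escape '$' and '^' wherever they appear
    // outside a character class, so an anchor-aware engine treats them as
    // literals. Existing escapes and bracketed classes are copied unchanged.
    static String escapeAnchors(String xsdPattern) {
        StringBuilder out = new StringBuilder();
        int classDepth = 0; // > 0 while inside [...]
        for (int i = 0; i < xsdPattern.length(); i++) {
            char c = xsdPattern.charAt(i);
            if (c == '\\' && i + 1 < xsdPattern.length()) {
                // Copy an escape sequence such as \. or \$ as-is.
                out.append(c).append(xsdPattern.charAt(++i));
            } else if (c == '[') {
                classDepth++;
                out.append(c);
            } else if (c == ']' && classDepth > 0) {
                classDepth--;
                out.append(c);
            } else if (classDepth == 0 && (c == '$' || c == '^')) {
                out.append('\\').append(c);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // XML Schema patterns must match the whole value, so matches() is used
    // rather than find().
    static boolean matchesXsdPattern(String xsdPattern, String value) {
        return Pattern.compile(escapeAnchors(xsdPattern)).matcher(value).matches();
    }

    public static void main(String[] args) {
        String pattern = "$?[0-9]+\\.[0-9]{2}"; // the pattern from the report
        System.out.println(escapeAnchors(pattern));
        System.out.println(matchesXsdPattern(pattern, "$12.50"));
        System.out.println(matchesXsdPattern(pattern, "12.50"));
        System.out.println(matchesXsdPattern(pattern, "12.5"));
    }
}
```

With this sketch, the unescaped pattern from the report and the escaped workaround (\$?[0-9]+\.[0-9]{2}) end up as the same expression, which is the equivalence the reporter expects from the parser itself.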
