[jira] Commented: (XERCESC-1816) Multi-character escape classes don't work correctly in regular expressions

David Bertoni (JIRA) Mon, 14 Jul 2008 21:26:36 -0700

    [ https://issues.apache.org/jira/browse/XERCESC-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613521#action_12613521 ]


David Bertoni commented on XERCESC-1816:
----------------------------------------

http://perldoc.perl.org/perlre.html#Regular-Expressions

According to the Perl documentation linked above, \c matches a single 
control character, which explains the existing code.
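
As a rough illustration of that control-code behaviour, here is a minimal sketch. It assumes the conventional Perl-style mapping, where the character following \c is turned into a control code by clearing bit 0x40, and it uses the U+0040-U+005F range named in the error message quoted in the issue. The function name controlEscapeValue is hypothetical and is not taken from the Xerces-C++ sources.

```cpp
// Hypothetical helper, not part of Xerces-C++: maps the character that
// follows "\c" to the control code it denotes (Perl-style convention).
// Returns false if the character is outside U+0040-U+005F, the range
// named in the error message quoted in the issue.
bool controlEscapeValue(char32_t ch, char32_t& result)
{
    if (ch < 0x40 || ch > 0x5F)
        return false;

    // Clearing bit 0x40 maps '@'..'_' onto U+0000..U+001F,
    // so 'A' (U+0041) becomes U+0001 and '[' (U+005B) becomes U+001B.
    result = ch ^ 0x40;
    return true;
}
```

Under these assumptions, a lowercase letter such as 'a' (U+0061) is rejected, which is consistent with the error reported for a bare "\c" in the issue.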

\C supports matching a single byte, even in Unicode mode:

" \C         Match a single C char (octet) even under Unicode.
             NOTE: breaks up characters into their UTF-8 bytes,
             so you may end up with malformed pieces of UTF-8.
             Unsupported in lookbehind."


Why don't we just report an error if the expression contains \i or \I in 
non-schema mode? For example:

"The escape sequence '{0}' is supported only in XML Schema mode."

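A minimal sketch of what such a check could look like, written as a standalone function rather than against the actual Xerces-C++ parser. The function name checkSchemaOnlyEscapes and the boolean schemaMode flag are assumptions made for illustration; only the message text comes from the suggestion above.

```cpp
#include <string>

// Hypothetical sketch, not the Xerces-C++ implementation.
// Returns an empty string if the pattern is acceptable; otherwise returns
// the proposed error text with the offending escape substituted for {0}.
std::string checkSchemaOnlyEscapes(const std::string& pattern, bool schemaMode)
{
    if (schemaMode)
        return std::string();   // \i and \I are allowed in XML Schema mode

    for (std::string::size_type i = 0; i + 1 < pattern.size(); ++i) {
        if (pattern[i] != '\\')
            continue;

        const char next = pattern[i + 1];
        if (next == 'i' || next == 'I') {
            return "The escape sequence '\\" + std::string(1, next) +
                   "' is supported only in XML Schema mode.";
        }

        // Skip the escaped character so that an escaped backslash
        // followed by a literal 'i' is not misread as \i.
        ++i;
    }
    return std::string();
}
```

The same scan could be extended to \c and \C if they turn out to need schema-only treatment as well; that is left out here because the comment only proposes it for \i and \I.
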
> Multi-character escape classes don't work correctly in regular expressions
> --------------------------------------------------------------------------
>
>                 Key: XERCESC-1816
>                 URL: https://issues.apache.org/jira/browse/XERCESC-1816
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: Validating Parser (XML Schema)
>    Affects Versions: 2.8.0, 3.0.0
>            Reporter: John Snelson
>
> The regular expressions "\i", "\I", "\c" and "\C" do not work as specified in 
> the XML Schema specification:
> http://www.w3.org/TR/xmlschema-2/#nt-MultiCharEsc
> In fact, "\I" and "\C" cause an infinite loop during the parsing of the 
> regular expression, "\i" seems to only match the letter "i", and "\c" gives 
> the error:
> A character in U+0040-U+005f must follow '\c'.
> I'd be happy to attempt to fix this bug, but I need some guidance as to what 
> the code for "\c" is actually meant to be doing.

