http://bugs.exim.org/show_bug.cgi?id=1336

--- Comment #9 from Philip Hazel <[email protected]> 2013-02-26 16:00:34 ---

I have now done some research on Universal Character Names. It seems that what you are asking for is a way of matching "any character that may be encoded using a Universal Character Name" rather than a Universal Character Name itself, which is of the form \uxxxx (for characters whose Unicode code point is no greater than U+FFFF) or \Uxxxxxxxx for others.

We do already have some "private" Unicode property names in PCRE, for example, Xan for any Unicode alphanumeric character. They all begin with the letter X. I propose to add Xuc ("universally-named character", keeping it down to 3 letters), which will match $ @ ` and all characters from \x{a0} upwards except for the excluded range \x{d800} to \x{dfff}. These are the only characters that are permitted to be specified using Universal Character Names. Most "base characters" such as ASCII letters are not permitted.

The PCRE syntax will therefore be \p{Xuc}. To match the same set, but without $ @ and `, you should be able to use the double negative trick: [^\P{Xuc}$@`] (compare [^\W_], which matches letters and digits but not underscore).
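For illustration only, the proposed set can be sketched in Python. This is not PCRE code: Python's re module has no \p{Xuc}, so the sketch spells the set out as explicit ranges taken from the description above. The names is_xuc, XUC_CLASS, XUC_NO_PUNCT and WORD_NO_UNDERSCORE are hypothetical and exist only for this example.

```python
import re


def is_xuc(ch: str) -> bool:
    """Hypothetical helper: True if ch is in the set described for Xuc."""
    if ch in "$@`":
        return True
    cp = ord(ch)
    # Everything from U+00A0 upwards, except the surrogate range D800..DFFF.
    return cp >= 0xA0 and not (0xD800 <= cp <= 0xDFFF)


# Explicit character class equivalent to the described set (assumption:
# written out by hand because Python's re does not know \p{Xuc}).
XUC_CLASS = re.compile(r"[$@`\u00a0-\ud7ff\ue000-\U0010ffff]")

# The same set without $ @ and `, which is what the double negative
# [^\P{Xuc}$@`] is intended to express in PCRE.
XUC_NO_PUNCT = re.compile(r"[\u00a0-\ud7ff\ue000-\U0010ffff]")

# The comparison from the text: letters and digits, but not underscore.
WORD_NO_UNDERSCORE = re.compile(r"[^\W_]")

if __name__ == "__main__":
    for sample in ["$", "A", "\u00e9", "_"]:
        print(repr(sample), is_xuc(sample), bool(XUC_CLASS.fullmatch(sample)))
```

The double negative works because a negated class containing \P{Xuc} excludes everything outside the set, and listing $ @ ` alongside it excludes those three as well. The [^\W_] class behaves the same way in Python, which makes it a convenient way to check the idea.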
