PLEASE DO NOT REPLY TO THIS MESSAGE. TO FURTHER COMMENT
ON THE STATUS OF THIS BUG PLEASE FOLLOW THE LINK BELOW
AND USE THE ON-LINE APPLICATION. REPLYING TO THIS MESSAGE
DOES NOT UPDATE THE DATABASE, AND SO YOUR COMMENT WILL
BE LOST SOMEWHERE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=3303

*** shadow/3303 Tue Aug 28 06:17:49 2001
--- shadow/3303.tmp.7127        Tue Aug 28 06:17:49 2001
***************
*** 0 ****
--- 1,44 ----
+ +============================================================================+
+ | Unicode 3.0 character \\uFFFD                                              |
+ +----------------------------------------------------------------------------+
+ |        Bug #: 3303                        Product: Regexp                  |
+ |       Status: NEW                         Version: unspecified             |
+ |   Resolution:                            Platform: PC                      |
+ |     Severity: Minor                    OS/Version: Windows NT/2K           |
+ |     Priority: Other                     Component: Other                   |
+ +----------------------------------------------------------------------------+
+ |  Assigned To: [EMAIL PROTECTED]                                |
+ |  Reported By: [EMAIL PROTECTED]                                     |
+ |      CC list: Cc:                                                          |
+ +----------------------------------------------------------------------------+
+ |          URL:                                                              |
+ +============================================================================+
+ |                              DESCRIPTION                                   |
+ http://www.unicode.org/Public/3.0-Update/UnicodeData-3.0.0.txt:
+ >FFFD;REPLACEMENT CHARACTER;So;0;ON;;;;;N;;;;;
+ 
+ When the above character appears in any regex character class, constructing 
+ the RE throws a RESyntaxException with the description 'Bad Character Class'. 
+ I attempted to use it in the following context:
+ 
+   private static String XMLescape(String s)
+       throws RESyntaxException
+   {
+       if (s==null) return s;
+       if (s.length() == 0) return s;
+ 
+       // XML 1.0 standard actually says:
+       // Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
+       //          [#x10000-#x10FFFF]
+       // For some reason this library doesn't like the Unicode character
+       // \\uFFFD, so the upper bound below is \\uFFFC instead.
+       RE r = new RE("[^\\u0009\\u000A\\u000D\\u0020-\\uD7FF\\uE000-\\uFFFC]");
+ 
+       return r.subst(s, "");
+   }
+ 
+ I'm using the JRE Standard Edition 3.0.
+ 
+ Regards,
+ 
+ Tasuki.
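
For comparison, here is a minimal sketch of the filter the method above is aiming for, written against java.util.regex instead of the Regexp library from this report. That substitution is an assumption made only so the example is self-contained; it does not reproduce or explain the RESyntaxException. The class and method names (XmlCharFilter, stripInvalidXmlChars) are hypothetical. The code points are taken from the XML 1.0 Char production quoted in the report, where #x9, #xA and #xD are hexadecimal values.

```java
import java.util.regex.Pattern;

public class XmlCharFilter {
    // Negated class built from the XML 1.0 Char production:
    //   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    // java.util.regex matches by code point, so the supplementary range
    // can be written directly with \x{...} escapes.
    private static final Pattern INVALID_XML_CHARS = Pattern.compile(
        "[^\\x{9}\\x{A}\\x{D}\\x{20}-\\x{D7FF}\\x{E000}-\\x{FFFD}\\x{10000}-\\x{10FFFF}]");

    // Returns s with every character outside the Char production removed.
    // null and empty input are returned unchanged, as in the reported method.
    static String stripInvalidXmlChars(String s) {
        if (s == null || s.isEmpty()) {
            return s;
        }
        return INVALID_XML_CHARS.matcher(s).replaceAll("");
    }
}
```

With this sketch, U+FFFD itself is kept, while control characters such as U+0000 or U+0010 and the noncharacter U+FFFE are dropped.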
