PLEASE DO NOT REPLY TO THIS MESSAGE. TO FURTHER COMMENT
ON THE STATUS OF THIS BUG PLEASE FOLLOW THE LINK BELOW
AND USE THE ON-LINE APPLICATION. REPLYING TO THIS MESSAGE
DOES NOT UPDATE THE DATABASE, AND SO YOUR COMMENT WILL
BE LOST SOMEWHERE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=3273

*** shadow/3273 Sat Aug 25 23:38:56 2001
--- shadow/3273.tmp.12103       Sat Aug 25 23:38:56 2001
***************
*** 0 ****
--- 1,57 ----
+ +============================================================================+
+ | CharacterArrayCharacterIterator substring function returns incorrect resul |
+ +----------------------------------------------------------------------------+
+ |        Bug #: 3273                        Product: Regexp                  |
+ |       Status: NEW                         Version: unspecified             |
+ |   Resolution:                            Platform: All                     |
+ |     Severity: Normal                   OS/Version: All                     |
+ |     Priority: Other                     Component: Other                   |
+ +----------------------------------------------------------------------------+
+ |  Assigned To: [EMAIL PROTECTED]                                |
+ |  Reported By: [EMAIL PROTECTED]                                     |
+ |      CC list: Cc:                                                          |
+ +----------------------------------------------------------------------------+
+ |          URL: .../api/org/apache/regexp/CharacterArrayCharacterIterator.ht |
+ +============================================================================+
+ |                              DESCRIPTION                                   |
+ Calling RE.match(CharacterIterator,int) with a
+ CharacterArrayCharacterIterator and then calling getParen(int) often
+ returns a string of the wrong length or throws an exception.
+ 
+ This is due to the implementation of "substring(int,int)" in the
+ CharacterArrayCharacterIterator class and/or the mis-documentation of
+ the CharacterIterator.substring interface.
+ 
+ The confusion is whether the second argument to substring represents
+ the endIndex or the length. The API docs say it is the length, but both
+ the RE implementation and the StringCharacterIterator implementation
+ treat it as the endIndex.
+ [Note: the standard Java String class has
+ java.lang.String.substring(int beginIndex, int endIndex),
+ but its constructor is java.lang.String(char[] src, int off, int len).]
+ 
+ Secondly, there is no check that the requested substring stays within the
+ bounds of the sequence length specified at construction time.
+ An IndexOutOfBoundsException should be thrown in that case.
+ 
+ I think the best solution is to first update the API docs to specify
+ that the arguments are in fact (beginIndex, endIndex), and then to update
+ the CharacterArrayCharacterIterator.substring methods to something like this:
+ 
+  public String substring(int beginIndex, int endIndex)
+  {
+    if (endIndex > len)
+      throw new IndexOutOfBoundsException("endIndex=" + endIndex +
+       "; sequence size=" + len);
+    if (beginIndex < 0)
+      throw new IndexOutOfBoundsException("beginIndex=" + beginIndex);
+    return new String(src, off + beginIndex, endIndex - beginIndex);
+  }
+ 
+  public String substring(int beginIndex)
+  {
+    if (beginIndex < 0 || beginIndex > len)
+      throw new IndexOutOfBoundsException("index=" + beginIndex +
+       "; sequence size=" + len);
+    return new String(src, off + beginIndex, len - beginIndex);
+  }
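
To make the endIndex-versus-length mismatch described above concrete, here
is a minimal, self-contained sketch. The class SubstringConventions and its
two methods are hypothetical stand-ins written for illustration; they are
not the Regexp source. The field names src, off and len simply follow the
proposed code in this report.

```java
public class SubstringConventions {
    // Hypothetical stand-in for a char[]-backed iterator (not the Regexp class).
    private final char[] src; // backing array
    private final int off;    // start of the visible window inside src
    private final int len;    // length of the visible window

    public SubstringConventions(char[] src, int off, int len) {
        this.src = src;
        this.off = off;
        this.len = len;
    }

    // Reads the second argument as a LENGTH, as the API docs describe it.
    public String substringByLength(int beginIndex, int length) {
        return new String(src, off + beginIndex, length);
    }

    // Reads the second argument as an exclusive END INDEX, with the bounds
    // checks proposed in the report.
    public String substringByEndIndex(int beginIndex, int endIndex) {
        if (endIndex > len)
            throw new IndexOutOfBoundsException("endIndex=" + endIndex
                + "; sequence size=" + len);
        if (beginIndex < 0)
            throw new IndexOutOfBoundsException("beginIndex=" + beginIndex);
        return new String(src, off + beginIndex, endIndex - beginIndex);
    }

    public static void main(String[] args) {
        // The visible window is "hello world" (offset 2, length 11).
        char[] buffer = "xxhello worldxx".toCharArray();
        SubstringConventions it = new SubstringConventions(buffer, 2, 11);

        // A caller that means "indices 2 up to 5" passes (2, 5).
        System.out.println(it.substringByEndIndex(2, 5)); // llo
        System.out.println(it.substringByLength(2, 5));   // llo w (too long)

        // A caller that means "indices 6 up to 11" passes (6, 11).
        System.out.println(it.substringByEndIndex(6, 11)); // world
        try {
            it.substringByLength(6, 11); // runs past the end of the array
        } catch (IndexOutOfBoundsException e) {
            System.out.println("exception: " + e.getMessage());
        }
    }
}
```

When the caller passes an end index but the callee reads it as a length,
the result is either a string that is too long or an exception, which
matches the two symptoms reported for getParen(int).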
