> This fixes a bug in the way CIBackRef traverses Unicode characters, which 
> can be variable-length. It originally followed the approach that BackRef 
> takes, but failed to account for Unicode characters that are two chars 
> long. The upper bound (groupSize) of the traversal loop is the difference 
> between the group's start and end indexes, which is a count of chars. That 
> works when every character is a single char, and it also works for 
> case-sensitive comparisons, where a char-by-char comparison is acceptable, 
> but it does not work for a comparison that requires some kind of 
> normalization (e.g. case folding) and so has to walk the group one code 
> point at a time. This fix adjusts the loop's upper bound whenever a two-char 
> character is encountered.
> 
> An alternative was to determine the group's length in code points by 
> scanning the group in advance, but this could result in multiple scans and 
> code point conversions of the same matcher group, which could be long. 
> Adjusting the loop bound on the fly avoids that cost.
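
The loop-bound adjustment described above can be sketched roughly as follows. This is a standalone illustration, not the code from the pull request: the class BackRefSketch, the method matchIgnoreCase, and its parameters are hypothetical names, and the case folding shown (upper-casing then lower-casing each code point) is an assumption made for the example.

```java
final class BackRefSketch {

    /**
     * Hypothetical helper: compares the group seq[groupStart, groupEnd)
     * against the text starting at index i, ignoring case.
     * Returns the index just past the matched text, or -1 on mismatch.
     */
    static int matchIgnoreCase(CharSequence seq, int groupStart, int groupEnd, int i) {
        // Length in chars: only an upper bound on the number of code points.
        int groupSize = groupEnd - groupStart;
        int x = i;          // position in the text being matched
        int j = groupStart; // position inside the captured group

        for (int index = 0; index < groupSize; index++) {
            if (x >= seq.length()) {
                return -1; // text ran out before the group did
            }
            int c1 = Character.codePointAt(seq, x);
            int c2 = Character.codePointAt(seq, j);
            if (c1 != c2 && fold(c1) != fold(c2)) {
                return -1;
            }
            int groupChars = Character.charCount(c2);
            x += Character.charCount(c1);
            j += groupChars;
            // A two-char code point used up two chars of the group in a
            // single iteration, so shrink the bound by the extra char.
            groupSize -= groupChars - 1;
        }
        return x;
    }

    // Assumed folding for this sketch: upper-case, then lower-case.
    private static int fold(int codePoint) {
        return Character.toLowerCase(Character.toUpperCase(codePoint));
    }

    public static void main(String[] args) {
        // U+10400 (Deseret capital) followed by U+10428 (its lower-case form).
        String s = "\uD801\uDC00\uD801\uDC28";
        System.out.println(matchIgnoreCase(s, 0, 2, 2));
    }
}
```

Without the adjustment, the loop in this sketch would run a second iteration for the single two-char character and read past the end of both the group and the text. The pre-scan alternative would instead count code points up front, at the cost of an extra pass over the group for every match attempt.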

Ian Graves has updated the pull request incrementally with one additional 
commit since the last revision:

  Removing increment variable and some other tweaks

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7501/files
  - new: https://git.openjdk.java.net/jdk/pull/7501/files/28d80c80..c4e5343e

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7501&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7501&range=00-01

  Stats: 9 lines in 1 file changed: 0 ins; 1 del; 8 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7501.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7501/head:pull/7501

PR: https://git.openjdk.java.net/jdk/pull/7501
