Did anyone grep the source to see if any other classes are using ByteArrayBackedCharSequence? Should that class be removed or fixed?
On Fri, Dec 9, 2011 at 12:25 PM, Billie Rinaldi (Updated) (JIRA) <[email protected]> wrote:

>
> [https://issues.apache.org/jira/browse/ACCUMULO-209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
>
> Billie Rinaldi updated ACCUMULO-209:
> ------------------------------------
>
>     Resolution: Fixed
>     Status: Resolved (was: Patch Available)
>
> > RegExFilter does not properly regex when using multi-byte characters
> > --------------------------------------------------------------------
> >
> >     Key: ACCUMULO-209
> >     URL: https://issues.apache.org/jira/browse/ACCUMULO-209
> >     Project: Accumulo
> >     Issue Type: Bug
> >     Components: client
> >     Affects Versions: 1.3.5
> >     Reporter: Jim Klucar
> >     Assignee: Billie Rinaldi
> >     Fix For: 1.4.0, 1.5.0
> >
> >     Attachments: accumulo-209-RegExFilter.patch,
> >         accumulo-209-RegExFilterTest.patch, accumulo-209.patch
> >
> >     Original Estimate: 1h
> >     Remaining Estimate: 1h
> >
> > The current RegExFilter class uses a ByteArrayBackedCharSequence to set
> > the data to match against. The ByteArrayBackedCharSequence contains a line
> > of code that prevents the matcher from properly matching multi-byte
> > characters.
> > Line 49 of ByteArrayBackedCharSequence.java is:
> >     return (char) (0xff & data[offset + index]);
> > This incorrectly casts a single byte from the byte array to a char,
> > which is 2 bytes in Java. This prevents the RegExFilter from properly
> > performing Regular Expressions on multi-byte character encoded values.
> > A patch for the RegExFilter.java file has been created and will be
> > submitted.
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA
> administrators:
> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> For more information on JIRA, see: http://www.atlassian.com/software/jira
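
For anyone following along, here is a minimal sketch of why the per-byte cast quoted in the issue breaks matching. It is not the Accumulo code and not necessarily what the attached patches do: the class name MultiByteRegexDemo and the helpers byteCast and decoded are hypothetical, and it assumes the stored value is UTF-8 encoded.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

public class MultiByteRegexDemo {

    // Hypothetical helper: widens every byte to its own char, the same
    // per-byte cast as the line quoted in the issue.
    static CharSequence byteCast(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length);
        for (byte b : data) {
            sb.append((char) (0xff & b));
        }
        return sb;
    }

    // Hypothetical helper: decodes the whole array first (assumes UTF-8),
    // so a multi-byte sequence becomes a single char.
    static CharSequence decoded(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "café": the final character takes two bytes in UTF-8.
        byte[] value = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        Pattern pattern = Pattern.compile("caf\u00e9");

        // Per-byte cast yields five chars, so the four-char pattern fails.
        System.out.println(pattern.matcher(byteCast(value)).matches()); // false
        // Decoding first yields four chars, so the pattern matches.
        System.out.println(pattern.matcher(decoded(value)).matches());  // true
    }
}
```

Decoding up front copies the value into a String, which a byte-backed CharSequence presumably exists to avoid, so that cost is worth weighing when deciding whether to fix the class or remove it.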
