[
https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aditya Kishore updated HBASE-6991:
----------------------------------
Fix Version/s: 0.96.0
Hadoop Flags: Incompatible change
Status: Patch Available (was: Open)
The patch includes the following changes:
1. Gets rid of the unnecessary byte[] to String conversion. The "ISO-8859-1"
charset maps every byte to the character with the same value, so the conversion
changes nothing. This also removes the need for the try-catch block.
{code}
- String first = new String(b, off, len, "ISO-8859-1");
- for (int i = 0; i < first.length() ; ++i ) {
- int ch = first.charAt(i) & 0xFF;
+ for (int i = off; i < off + len ; ++i ) {
+ int ch = b[i] & 0xFF;
{code}
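As a quick sanity check of the claim above, here is a minimal standalone sketch (not part of the patch; the class and method names are made up for illustration) that compares the old ISO-8859-1 decoding path with reading the bytes directly:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not HBase code: checks that decoding through
// ISO-8859-1 and masking each byte with 0xFF give the same values.
final class Latin1Equivalence {

    /** Returns true if both ways of reading b[off..off+len) agree on every position. */
    static boolean matchesDirectRead(byte[] b, int off, int len) {
        String decoded = new String(b, off, len, StandardCharsets.ISO_8859_1);
        if (decoded.length() != len) {
            return false;
        }
        for (int i = 0; i < len; i++) {
            int viaString = decoded.charAt(i) & 0xFF; // old code path
            int viaByte = b[off + i] & 0xFF;          // new code path
            if (viaString != viaByte) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Cover every possible byte value once.
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        System.out.println(matchesDirectRead(all, 0, all.length)); // prints "true"
    }
}
```

Using the StandardCharsets constant instead of the charset name is what avoids the checked exception in this sketch; the patch itself avoids it by dropping the String conversion entirely.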
2. Removes "\" from the set of printable non-alphanumeric characters so that it
is escaped using the "\xXX" format (as "\x5C").
{code}
- || " `~!@#$%^&*()-_=+[]{}\\|;:'\",.<>/?".indexOf(ch) >= 0 ) {
+ || " `~!@#$%^&*()-_=+[]{}|;:'\",.<>/?".indexOf(ch) >= 0 ) {
{code}
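To show why this one-character change makes the conversion reversible, here is a simplified, self-contained sketch of the two directions. It is not the actual Bytes implementation: the class name, the encode/decode methods, and the escapeBackslash switch are assumptions made for illustration, and the decoder is a reduced version of the real one.

```java
import java.util.Arrays;

// Simplified model of the escape scheme, not the HBase source.
final class BinaryEscapeSketch {
    private static final String PUNCT_OLD = " `~!@#$%^&*()-_=+[]{}\\|;:'\",.<>/?";
    private static final String PUNCT_NEW = " `~!@#$%^&*()-_=+[]{}|;:'\",.<>/?";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /** Writes printable bytes literally and everything else as \xXX. */
    static String encode(byte[] b, boolean escapeBackslash) {
        String printable = escapeBackslash ? PUNCT_NEW : PUNCT_OLD;
        StringBuilder sb = new StringBuilder();
        for (byte value : b) {
            int ch = value & 0xFF;
            boolean alnum = (ch >= '0' && ch <= '9')
                || (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
            if (alnum || printable.indexOf(ch) >= 0) {
                sb.append((char) ch);
            } else {
                sb.append("\\x").append(HEX[ch >> 4]).append(HEX[ch & 0x0F]);
            }
        }
        return sb.toString();
    }

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f');
    }

    /** Turns each \xXX sequence back into one byte; other characters map to one byte each. */
    static byte[] decode(String in) {
        byte[] out = new byte[in.length()];
        int size = 0;
        for (int i = 0; i < in.length(); i++) {
            char ch = in.charAt(i);
            if (ch == '\\' && i + 3 < in.length() && in.charAt(i + 1) == 'x'
                    && isHex(in.charAt(i + 2)) && isHex(in.charAt(i + 3))) {
                out[size++] = (byte) Integer.parseInt(in.substring(i + 2, i + 4), 16);
                i += 3; // skip the 'x' and the two hex digits
            } else {
                out[size++] = (byte) ch;
            }
        }
        return Arrays.copyOf(out, size);
    }

    public static void main(String[] args) {
        byte[] original = {'\\', 'x', 'A', 'D'};
        // Old behaviour: the literal backslash is mistaken for an escape prefix.
        System.out.println(Arrays.toString(decode(encode(original, false)))); // [-83]
        // New behaviour: the backslash is written as \x5C and survives the round trip.
        System.out.println(Arrays.toString(decode(encode(original, true))));  // [92, 120, 65, 68]
    }
}
```

With the backslash escaped, the decoder only ever sees "\" as the start of an escape sequence it produced itself, so no literal input can be misread.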
3. Adds a new test case to verify that the conversion is reversible for random
byte arrays. Without the fix, the test always fails. The test adds 1 extra
second to the test run.
{code:title=hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestBytes.java}
+ public void testToStringBytesBinaryReversible() {
+ // let's run test with 1000 randomly generated byte arrays
+ Random rand = new Random(System.currentTimeMillis());
+ byte[] randomBytes = new byte[1000];
+ for (int i = 0; i < 1000; i++) {
+ rand.nextBytes(randomBytes);
+ verifyReversibleForBytes(randomBytes);
+ }
+
+ // some specific cases
+ verifyReversibleForBytes(new byte[] {});
+ verifyReversibleForBytes(new byte[] {'\\', 'x', 'A', 'D'});
+ verifyReversibleForBytes(new byte[] {'\\', 'x', 'A', 'D', '\\'});
+ }
+
+ private void verifyReversibleForBytes(byte[] originalBytes) {
+ String convertedString = Bytes.toStringBinary(originalBytes);
+ byte[] convertedBytes = Bytes.toBytesBinary(convertedString);
+ if (Bytes.compareTo(originalBytes, convertedBytes) != 0) {
+ fail("Not reversible for\nbyte[]: " + Arrays.toString(originalBytes) +
+ ",\nStringBinary: " + convertedString);
+ }
+ }
{code}
4. Finally, fixes the two test cases that were breaking because they
assumed that "\" is encoded as "\".
{code}
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
- + "\\xD46\\xEA5\\xEA3\\xEA7\\xE7\\x00LI\\s\\xA0\\x0F\\x00\\x00"
+ + "\\xD46\\xEA5\\xEA3\\xEA7\\xE7\\x00LI\\x5Cs\\xA0\\x0F\\x00\\x00"
{code}
Setting the "Incompatible change" flag since any other code that makes the
same assumption as the two test cases will need to be fixed.
> Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()
> ------------------------------------------------------------------------------
>
> Key: HBASE-6991
> URL: https://issues.apache.org/jira/browse/HBASE-6991
> Project: HBase
> Issue Type: Bug
> Components: util
> Affects Versions: 0.96.0
> Reporter: Aditya Kishore
> Assignee: Aditya Kishore
> Fix For: 0.96.0
>
> Attachments: HBASE-6991_trunk.patch
>
>
> Since "\" is used to escape non-printable characters but is not treated as a
> special character during conversion, it can lead to unexpected conversions.
> For example, please consider the following code snippet.
> {code}
> public void testConversion() {
> byte[] original = {
> '\\', 'x', 'A', 'D'
> };
> String stringFromBytes = Bytes.toStringBinary(original);
> byte[] converted = Bytes.toBytesBinary(stringFromBytes);
> System.out.println("Original: " + Arrays.toString(original));
> System.out.println("Converted: " + Arrays.toString(converted));
> System.out.println("Reversible?: " + (Bytes.compareTo(original, converted)
> == 0));
> }
> Output:
> -------
> Original: [92, 120, 65, 68]
> Converted: [-83]
> Reversible?: false
> {code}
> The "\" character needs to be treated as special and must be encoded as a
> non-printable character ("\x5C") to avoid any ambiguity during
> conversion.