[ https://issues.apache.org/jira/browse/HBASE-17461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821818#comment-15821818 ]
Wellington Chevreuil edited comment on HBASE-17461 at 1/13/17 2:06 PM:
-----------------------------------------------------------------------

It doesn't; it still shows the same error message because of the same behaviour. Since it finds no region with the changed name, it then tries to run major_compact on a table. The error printed in the shell comes from the table name validation checks, but that isn't the real issue here.

After adding a logging message to a test version of HBaseAdmin, I could see that the string received on the Java side differs from the one typed in the shell:

{noformat}
major_compact "test,\xF8\xB9B2!$\x9C\x0A\xFEG\xC0\xE3\x8B\x1B\xFF\x15,1481745228583.b4bc69356d89018bfad3ee106b717285."
...
INFO client.HBaseAdmin: Invalid region: test,\xEF\xBF\xBDB2!$\xEF\xBF\xBD\x0A\xEF\xBF\xBDG\xEF\xBF\xBD\xEF\xBF\xBD\x1B\xEF\xBF\xBD\x15,1481745228583.b4bc69356d89018bfad3ee106b717285.
ERROR: Illegal character code:44, <,> at 4. User-space table qualifiers can only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: test,�B2!$� �G���1481745228583.b4bc69356d89018bfad3ee106b717285.
...
{noformat}

The real region name:

test,\xF8\xB9B2!$\x9C\x0A\xFEG\xC0\xE3\x8B\x1B\xFF\x15,1481745228583.b4bc69356d89018bfad3ee106b717285.

was being logged on the Java side as:

test,\xEF\xBF\xBDB2!$\xEF\xBF\xBD\x0A\xEF\xBF\xBDG\xEF\xBF\xBD\xEF\xBF\xBD\x1B\xEF\xBF\xBD\x15,1481745228583.b4bc69356d89018bfad3ee106b717285.

I then realized that the only characters being changed are the ones whose byte value is higher than 127, which are negative numbers in Java. I tried adding the mentioned change to the *major_compact.rb* file, and it fixed the problem.

> HBase shell "major_compact" command should properly convert
> "table_or_region_name" parameter to java byte array properly before simply
> calling "HBaseAdmin.majorCompact" method
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-17461
>                 URL: https://issues.apache.org/jira/browse/HBASE-17461
>             Project: HBase
>          Issue Type: Bug
>          Components: shell
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>
> On HBase shell, the *major_compact* command simply passes the received
> *table_or_region_name* parameter straight to the java *HBaseAdmin.majorCompact*
> method.
> In some corner cases, HBase table row keys may contain special characters.
> Then, if a region is split in such a way that row keys with special
> characters become part of the region name, calling *major_compact* on this
> region will fail if the special character's byte value is higher than 127.
> This happens because the Java byte type is signed, while the ruby byte type
> isn't, causing the region name to be converted to a wrong string on the Java
> side.
> For example, consider a region named as below:
> {noformat}
> test,\xF8\xB9B2!$\x9C\x0A\xFEG\xC0\xE3\x8B\x1B\xFF\x15,1481745228583.b4bc69356d89018bfad3ee106b717285.
> {noformat}
> Calling major_compact on it fails as follows:
> {noformat}
> hbase(main):008:0* major_compact
> "test,\xF8\xB9B2!$\x9C\x0A\xFEG\xC0\xE3\x8B\x1B\xFF\x15,1484177359169.8128fa75ae0cd4eba38da2667ac8ec98."
> ERROR: Illegal character code:44, <,> at 4. User-space table qualifiers can
> only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: test,�B2!$�
> �G���1484177359169.8128fa75ae0cd4eba38da2667ac8ec98.
> {noformat}
> An easy solution is to convert the *table_or_region_name* parameter properly
> prior to calling *HBaseAdmin.majorCompact*, in the same way as is already
> done in some other shell commands, such as *get*:
> {noformat}
> admin.major_compact(table_or_region_name.to_s.to_java_bytes, family)
> {noformat}
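
The \xEF\xBF\xBD sequences in the logged region name are the UTF-8 encoding of U+FFFD, the Unicode replacement character, which a decoder substitutes for bytes that are not valid UTF-8. The following is a minimal Java sketch of that effect, not HBase code. It assumes the region name is decoded as UTF-8 text somewhere between the shell and HBaseAdmin (the report does not say which call does this), and the class and method names are hypothetical. It uses only a short fragment of the split key, $\x9C\x0A\xFEG, and shows that the raw byte array keeps the original values while a round trip through a String does not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: class and method names are hypothetical, not part of HBase.
final class RegionNameBytes {

    // Fragment of the split key from the region name in this report: $ \x9C \x0A \xFE G
    static byte[] sampleKeyFragment() {
        return new byte[] {'$', (byte) 0x9C, 0x0A, (byte) 0xFE, 'G'};
    }

    // Decode the bytes as UTF-8 text and encode them back. This mimics a binary
    // key being handled as a java.lang.String (an assumption about the failing path).
    static byte[] viaUtf8String(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Render bytes in \xNN form, keeping printable ASCII characters as-is.
    static String escape(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int unsigned = b & 0xFF; // Java bytes are signed; mask to get 0..255
            if (unsigned >= 0x20 && unsigned < 0x7F && unsigned != '\\') {
                sb.append((char) unsigned);
            } else {
                sb.append(String.format("\\x%02X", unsigned));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = sampleKeyFragment();
        System.out.println(raw[1]);                                 // -100: 0x9C as a signed byte
        System.out.println(escape(raw));                            // $\x9C\x0A\xFEG
        System.out.println(escape(viaUtf8String(raw)));             // $\xEF\xBF\xBD\x0A\xEF\xBF\xBDG
        System.out.println(Arrays.equals(raw, viaUtf8String(raw))); // false
    }
}
```

The third line of output matches the corresponding fragment of the "Invalid region" log message, which is consistent with the proposed fix: handing the name over as a byte array avoids the lossy text conversion.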