Alexandre Normand created HBASE-10966:
-----------------------------------------
Summary: RowCounter misinterprets column names that have colons in
their qualifier
Key: HBASE-10966
URL: https://issues.apache.org/jira/browse/HBASE-10966
Project: HBase
Issue Type: Bug
Affects Versions: 0.99.0
Reporter: Alexandre Normand
Assignee: Alexandre Normand
Priority: Trivial
Attachments: HBASE-10966-1.patch
RowCounter allows column names to be specified on the command line:
{code}
Usage: RowCounter [options] <tablename> [--range=[startKey],[endKey]] [<column1> <column2>...]
For performance consider the following options:
-Dhbase.client.scanner.caching=100
-Dmapred.map.tasks.speculative.execution=false
{code}
However, the column names are parsed assuming that if there is a colon, there
are only two parts to the string. In other words, it assumes
{{family:qualifier}} where {{qualifier}} wouldn't contain a colon.
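This came up while I was running a row count on a {{kiji}} table, where
qualifiers typically have multiple colon-delimited components (e.g. {{B:C}}
could be a qualifier in the {{B}} family, giving the column name {{B:B:C}}).
With the current parsing, that name is split into three fields and only
{{B}} is used as the qualifier.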
The flaw is in this code:
{code}
String[] fields = columnName.split(":");
if (fields.length == 1) {
  scan.addFamily(Bytes.toBytes(fields[0]));
} else {
  // Only fields[1] is used; anything after a second colon is dropped.
  byte[] qualifier = Bytes.toBytes(fields[1]);
  qualifiers.add(qualifier);
  scan.addColumn(Bytes.toBytes(fields[0]), qualifier);
}
{code}
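One possible way to handle this, sketched here without the HBase dependencies so it can be run standalone, is to split on the first colon only. The {{ColumnSpec}} class and its {{parse}} method are hypothetical names used for illustration; they are not part of RowCounter and this is not necessarily what the attached patch does.

```java
// Hypothetical helper, not part of HBase: splits "family:qualifier" on the
// first colon only, so the qualifier may itself contain colons.
final class ColumnSpec {
    final String family;
    final String qualifier; // null when only a family was given

    private ColumnSpec(String family, String qualifier) {
        this.family = family;
        this.qualifier = qualifier;
    }

    static ColumnSpec parse(String columnName) {
        // A limit of 2 makes split() stop after the first delimiter,
        // leaving the rest of the string (colons included) in fields[1].
        String[] fields = columnName.split(":", 2);
        if (fields.length == 1) {
            return new ColumnSpec(fields[0], null);
        }
        return new ColumnSpec(fields[0], fields[1]);
    }

    public static void main(String[] args) {
        ColumnSpec spec = ColumnSpec.parse("B:B:C");
        System.out.println(spec.family + " / " + spec.qualifier); // prints "B / B:C"
    }
}
```

In RowCounter itself, the family and qualifier would then go through {{Bytes.toBytes}} and {{scan.addColumn}} as before. One behavioral difference to keep in mind: with a limit of 2, a name with a trailing colon such as {{B:}} yields an empty qualifier, whereas {{split(":")}} discards the trailing empty string and treats it as a family-only name.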
--
This message was sent by Atlassian JIRA
(v6.2#6252)