Alexandre Normand created HBASE-10966:
-----------------------------------------

             Summary: RowCounter misinterprets column names that have colons in their qualifier
                 Key: HBASE-10966
                 URL: https://issues.apache.org/jira/browse/HBASE-10966
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.99.0
            Reporter: Alexandre Normand
            Assignee: Alexandre Normand
            Priority: Trivial
         Attachments: HBASE-10966-1.patch

RowCounter allows column names to be specified on the command line:

{code}
Usage: RowCounter [options] <tablename> [--range=[startKey],[endKey]] [<column1> <column2>...]
For performance consider the following options:
-Dhbase.client.scanner.caching=100
-Dmapred.map.tasks.speculative.execution=false
{code}

However, column names are parsed on the assumption that a name containing a 
colon has exactly two parts. In other words, the parser expects 
{{family:qualifier}}, where {{qualifier}} itself contains no colon.

This came up while I was trying to run a row count on a {{kiji}} table, where 
qualifiers typically have multiple colon-delimited components (e.g. {{B:C}} 
could be a qualifier in the {{B}} family, giving the column name {{B:B:C}}).

The flaw is in this code:
{code}
        // Splits on every colon, so for "B:B:C" fields[1] is "B", not "B:C".
        String[] fields = columnName.split(":");
        if (fields.length == 1) {
          scan.addFamily(Bytes.toBytes(fields[0]));
        } else {
          byte[] qualifier = Bytes.toBytes(fields[1]);
          qualifiers.add(qualifier);
          scan.addColumn(Bytes.toBytes(fields[0]), qualifier);
        }
{code}
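
One way to avoid the problem is to split at the first colon only, so that everything after it is treated as the qualifier. The following is a minimal, standalone sketch of that idea and not the contents of the attached patch; the class {{ColumnNameSketch}} and its {{parse}} method are hypothetical names used only for illustration, and the sketch works on plain strings rather than on {{Scan}} or {{Bytes}}.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of HBase.
final class ColumnNameSketch {

    // Splits at the first colon only: returns {family} or {family, qualifier}.
    static String[] parse(String columnName) {
        // A limit of 2 stops splitting after the first ":", so the
        // qualifier keeps any colons of its own.
        return columnName.split(":", 2);
    }

    public static void main(String[] args) {
        String name = "B:B:C"; // family "B", qualifier "B:C"
        // Unlimited split, as in the snippet above: fields[1] is just "B".
        System.out.println(Arrays.toString(name.split(":")));
        // Limited split keeps the qualifier whole.
        System.out.println(Arrays.toString(parse(name)));
    }
}
```

Running {{main}} prints {{[B, B, C]}} for the unlimited split and {{[B, B:C]}} for the limited one. Note one behavioural difference: with a limit of 2, a name with a trailing colon such as {{B:}} yields an empty qualifier instead of being treated as a bare family, so a real fix would need to decide which of the two is intended.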



--
This message was sent by Atlassian JIRA
(v6.2#6252)
