[ 
https://issues.apache.org/jira/browse/HBASE-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cuijianwei updated HBASE-10992:
-------------------------------

    Description: 
We find that getRowNum(...) of AggregateImplementation may return an incorrect
count when users add qualifiers to the scan parameter. The likely cause is the
following code in AggregateImplementation#getRowNum:
{code}
 byte[] colFamily = scan.getFamilies()[0];
 byte[] qualifier = scan.getFamilyMap().get(colFamily).pollFirst();
 ...
{code}
In the code above, pollFirst() removes the first user-set qualifier from the
scan. The subsequent scan may then return all rows containing data that belongs
to the family, which can lead to miscounting. For example, if we put data for
family='A' and qualifier='a', then count rows with family='A' and
qualifier='b' set in the scan, we get an unexpected count.
In my opinion, if users are allowed to add qualifiers to the scan, it is better
to count only rows that contain the qualifiers the user set for the family;
otherwise, it might be better to prevent users from adding qualifiers to the
scan passed to AggregationClient#rowCount.
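
The side effect described above can be reproduced without HBase. The following
is a minimal sketch, not the actual HBase code: it assumes the scan's family map
behaves like a plain map from family to a NavigableSet of qualifiers, and the
class and method names (QualifierPollDemo, familyMapWith, pollQualifier,
peekQualifier) are hypothetical, introduced only for illustration.

```java
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

class QualifierPollDemo {

    // Hypothetical stand-in for a scan's family map: family -> requested qualifiers.
    static Map<String, NavigableSet<String>> familyMapWith(String family, String... qualifiers) {
        NavigableSet<String> set = new TreeSet<>();
        for (String q : qualifiers) {
            set.add(q);
        }
        Map<String, NavigableSet<String>> map = new TreeMap<>();
        map.put(family, set);
        return map;
    }

    // Mirrors the reported pattern: pollFirst() returns the first qualifier
    // and also removes it from the underlying set.
    static String pollQualifier(Map<String, NavigableSet<String>> familyMap, String family) {
        return familyMap.get(family).pollFirst();
    }

    // Non-destructive read (an assumed alternative, not the committed fix):
    // first() leaves the set untouched, so guard against null and empty sets.
    static String peekQualifier(Map<String, NavigableSet<String>> familyMap, String family) {
        NavigableSet<String> qualifiers = familyMap.get(family);
        return (qualifiers == null || qualifiers.isEmpty()) ? null : qualifiers.first();
    }

    public static void main(String[] args) {
        Map<String, NavigableSet<String>> scanLike = familyMapWith("A", "b");
        // Reading with first() keeps qualifier 'b' in the map.
        System.out.println(peekQualifier(scanLike, "A") + " " + scanLike.get("A"));
        // Reading with pollFirst() leaves family 'A' with no qualifiers at all.
        System.out.println(pollQualifier(scanLike, "A") + " " + scanLike.get("A"));
    }
}
```

In this sketch, the qualifier set for family 'A' is empty after pollQualifier,
which corresponds to the state in which the subsequent scan is no longer
restricted to qualifier 'b'.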

  was:
We find getRowNum(...) of AggregateImplementation may return error count when 
users set qualifiers to the parameter scan. The reason might be the following 
code in AggregateImplementation#getRowNum:
{code}
 byte[] colFamily = scan.getFamilies()[0];
 byte[] qualifier = scan.getFamilyMap().get(colFamily).pollFirst();
 ...
{/code}
In above code, the method will remove the first user-set qualifier from scan. 
Then, the following scan will return all rows containing data belong to the 
family, which might lead the miscounting. For example, if we put data for 
family='A' and qualifier='a' and count the row by setting family='A' and 
qualifier='b' in scan, we will get unexpected count.
In my opinion, it is better to only count rows containing the qualifiers of 
family set by user if we allow users to add qualifiers to scan; otherwise,  it 
might be better to prevent users from adding qualifiers to scan of 
AggregationClient#rowCount. 


> getRowNum of coprocessor may return error count when users set qualifiers
> -------------------------------------------------------------------------
>
>                 Key: HBASE-10992
>                 URL: https://issues.apache.org/jira/browse/HBASE-10992
>             Project: HBase
>          Issue Type: Bug
>          Components: Coprocessors
>    Affects Versions: 0.94.18
>            Reporter: cuijianwei
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v6.2#6252)
