[ 
https://issues.apache.org/jira/browse/HBASE-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13793144#comment-13793144
 ] 

Aditya Kishore commented on HBASE-9747:
---------------------------------------

Actually, I am surprised by the third scan result:

{quote}
hbase(main):002:0> scan 't1', {FILTER => "SingleColumnValueFilter('f1', 'q1', =, 'binary:113')"}

ROW COLUMN+CELL
c1 column=f1:q1, timestamp=1381469178679, value=113
1 row(s) in 0.0140 seconds
{quote}

This should have returned two rows:
{noformat}
a1 column=f1:q2, timestamp=1381468905492, value=111
c1 column=f1:q1, timestamp=1381468905549, value=113
{noformat}

By default, {{SingleColumnValueFilter}} does not filter out rows in which the 
specified column does not exist ('a1' in your case), so such rows are still 
returned by the scan. For the same scan I get this result:

{noformat}
hbase(main):010:0> scan 't1', {FILTER => "SingleColumnValueFilter('f1', 'q1', =, 'binary:113')"}
ROW    COLUMN+CELL
 a1    column=f1:q2, timestamp=1381528316466, value=111
 c1    column=f1:q1, timestamp=1381528324693, value=113
2 row(s) in 0.0210 seconds
{noformat}

If you want to drop the rows that do not include the column, you need to 
call {{SingleColumnValueFilter.setFilterIfMissing(true)}}. From the shell you 
can do this by passing two extra arguments to the filter:
{noformat}
scan 't1', {FILTER => "SingleColumnValueFilter('f1', 'q1', =, 'binary:113', true, false)"}
{noformat}
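
To make the behaviour above concrete, here is a minimal sketch in plain Java. It is a toy model, not HBase code: the class {{FilterIfMissingSketch}} and its methods are hypothetical names made up for illustration, and the table is just an in-memory map holding the three rows from the repro. It only models the rule described above (a row that lacks the column passes unless filterIfMissing is set) combined with a prefix match under OR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the semantics discussed in this comment. NOT the HBase implementation.
class FilterIfMissingSketch {

    // The three rows from the repro: row key -> (column -> value).
    static Map<String, Map<String, String>> table() {
        Map<String, Map<String, String>> t = new LinkedHashMap<>();
        t.put("a1", Collections.singletonMap("f1:q2", "111"));
        t.put("b1", Collections.singletonMap("f1:q1", "112"));
        t.put("c1", Collections.singletonMap("f1:q1", "113"));
        return t;
    }

    // Models an equality check on a single column.
    static boolean singleColumnValue(Map<String, String> row, String column,
                                     String expected, boolean filterIfMissing) {
        String actual = row.get(column);
        if (actual == null) {
            // Column absent: the row passes unless filterIfMissing is set.
            return !filterIfMissing;
        }
        return actual.equals(expected);
    }

    // Models "PrefixFilter(prefix) OR SingleColumnValueFilter(...)".
    // A null prefix means the prefix filter is not part of the scan.
    static List<String> scan(String prefix, String column, String expected,
                             boolean filterIfMissing) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : table().entrySet()) {
            boolean prefixMatch = prefix != null && e.getKey().startsWith(prefix);
            if (prefixMatch
                    || singleColumnValue(e.getValue(), column, expected, filterIfMissing)) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(scan(null, "f1:q1", "113", false)); // [a1, c1]
        System.out.println(scan(null, "f1:q1", "113", true));  // [c1]
        System.out.println(scan("b", "f1:q1", "113", false));  // [a1, b1, c1]
        System.out.println(scan("b", "f1:q1", "113", true));   // [b1, c1]
    }
}
```

In this model the third call reproduces the three rows reported in the issue: 'a1' is let through by the column filter because it has no f1:q1 cell, not by the prefix filter. With filterIfMissing set, the model yields the two rows the reporter expected.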


> PrefixFilter with OR condition gives wrong results
> --------------------------------------------------
>
>                 Key: HBASE-9747
>                 URL: https://issues.apache.org/jira/browse/HBASE-9747
>             Project: HBase
>          Issue Type: Bug
>          Components: Filters
>    Affects Versions: 0.94.9
>            Reporter: Deepa Remesh
>
> PrefixFilter when used with a SingleColumnValueFilter with an OR condition 
> gives wrong results. In below example, each filter when evaluated separately 
> gives 1 row each. The OR condition with the two filters gives 3 rows instead 
> of 2. Repro below:
> create 't1', 'f1'
> put 't1','a1','f1:q2','111'
> put 't1','b1','f1:q1','112'
> put 't1','c1','f1:q1','113'
> hbase(main):020:0> scan 't1', {FILTER => "PrefixFilter ('b') OR 
> SingleColumnValueFilter('f1', 'q1', =, 'binary:113')"}
> ROW                                                COLUMN+CELL
>  a1                                                column=f1:q2, 
> timestamp=1381468905492, value=111
>  b1                                                column=f1:q1, 
> timestamp=1381468905518, value=112
>  c1                                                column=f1:q1, 
> timestamp=1381468905549, value=113
> 3 row(s) in 0.1020 seconds
> hbase(main):021:0> scan 't1', {FILTER => "PrefixFilter ('b')"}
> ROW                                                COLUMN+CELL
>  b1                                                column=f1:q1, 
> timestamp=1381468905518, value=112
> 1 row(s) in 0.0150 seconds
> hbase(main):002:0> scan 't1', {FILTER => "SingleColumnValueFilter('f1', 'q1', 
> =, 'binary:113')"}
> ROW                                                COLUMN+CELL
>  c1                                                column=f1:q1, 
> timestamp=1381469178679, value=113
> 1 row(s) in 0.0140 seconds



--
This message was sent by Atlassian JIRA
(v6.1#6144)
