Hello folks,
I have seen a lot of questions about SingleColumnValueFilter, but I have
one more. It is about the setFilterIfMissing value and its behaviour with
other column families.
I have the following in my db:
hbase(main):004:0> scan 'c59d09d425244b9bb216a229c2441819_resource'
ROW             COLUMN+CELL
 resource-id    column=f:project_id, timestamp=1397138905401, value="project-id"
 resource-id    column=f:s_test-1, timestamp=1397138905401, value="1"
 resource-id    column=m:9222030811254775807+test-1+instance!cumulative!, timestamp=1397138905401, value={"$date": 1341225600000}
 resource-id    column=m:9222030811314775807+test-1+instance!cumulative!, timestamp=1397138905377, value={"$date": 1341225540000}
 resource-id-2  column=f:project_id, timestamp=1397138905422, value="project-id-2"
 resource-id-2  column=f:s_test, timestamp=1397138905422, value="1"
 resource-id-2  column=m:9222030811134775807+test+instance!cumulative!, timestamp=1397138905422, value={"$date": 1341225720000}
After applying the filter I see the following:
hbase(main):005:0> scan 'c59d09d425244b9bb216a229c2441819_resource', {FILTER => "(SingleColumnValueFilter ('f', 's_test-1', =, 'binary:\"1\"', true, false))"}
ROW             COLUMN+CELL
 resource-id    column=f:project_id, timestamp=1397138905401, value="project-id"
 resource-id    column=f:s_test-1, timestamp=1397138905401, value="1"
 resource-id    column=m:9222030811254775807+test-1+instance!cumulative!, timestamp=1397138905401, value={"$date": 1341225600000}
 resource-id    column=m:9222030811314775807+test-1+instance!cumulative!, timestamp=1397138905377, value={"$date": 1341225540000}
 resource-id-2  column=m:9222030811134775807+test+instance!cumulative!, timestamp=1397138905422, value={"$date": 1341225720000}
I wonder why I see resource-id-2 in the output even with
setFilterIfMissing == true. The row with id 'resource-id-2' doesn't contain
"f:s_test-1"; it contains only "f:s_test". From the docs about
setFilterIfMissing: "If true, the entire row will be skipped if the column
is not found."
So column 's_test-1' is not found in resource-id-2, but I still see this
row in the output (though only its cells from the 'm' CF).
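To make the expectation concrete, here is a toy model in plain Java of how
the quoted documentation reads. It does not use the HBase client API at all;
the names ExpectedFilterModel, expectedRows and sampleTable are made up for
illustration, and the data is a trimmed copy of the scan above. It only
models what the docs say should happen, not what HBase actually does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, NOT HBase code: it mimics the documented
// row-level semantics of SingleColumnValueFilter with filterIfMissing.
class ExpectedFilterModel {

    // rows: row key -> ("family:qualifier" -> value).
    // Returns the keys of the rows that should survive the filter.
    public static List<String> expectedRows(Map<String, Map<String, String>> rows,
                                            String column, String expected,
                                            boolean filterIfMissing) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : rows.entrySet()) {
            String actual = row.getValue().get(column);
            // Column missing: the whole row is skipped when filterIfMissing is true.
            // Column present: the whole row is kept only if the value matches.
            boolean keep = (actual == null) ? !filterIfMissing : actual.equals(expected);
            if (keep) {
                kept.add(row.getKey());
            }
        }
        return kept;
    }

    // Trimmed copy of the table from the scan output (one 'm' cell per row).
    public static Map<String, Map<String, String>> sampleTable() {
        Map<String, String> r1 = new LinkedHashMap<>();
        r1.put("f:project_id", "\"project-id\"");
        r1.put("f:s_test-1", "\"1\"");
        r1.put("m:9222030811254775807+test-1+instance!cumulative!",
               "{\"$date\": 1341225600000}");

        Map<String, String> r2 = new LinkedHashMap<>();
        r2.put("f:project_id", "\"project-id-2\"");
        r2.put("f:s_test", "\"1\"");
        r2.put("m:9222030811134775807+test+instance!cumulative!",
               "{\"$date\": 1341225720000}");

        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        table.put("resource-id", r1);
        table.put("resource-id-2", r2);
        return table;
    }

    public static void main(String[] args) {
        // Same parameters as the shell filter: column f:s_test-1, value "1",
        // filterIfMissing = true.
        System.out.println(expectedRows(sampleTable(), "f:s_test-1", "\"1\"", true));
        // prints [resource-id]
    }
}
```

Under this reading, no cell of resource-id-2 should come back, whichever
column family it belongs to.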
Of course I can specify COLUMNS => ['f'], and then resource-id-2 will not
be shown. But I can't do that because I need the values from 'm'.
Could you please comment on this behaviour?
Thanks,
Nadya
(From OpenStack Ceilometer team)