> It has to have a row on it, right?

Only the column name is the key in the bloom. For explicit columnar scan only.
StoreFileScanner can be skipped after this bloom check.
Only high-level thinking here. No? It doesn't work this way? I must be missing
something then.

> And how do we get space savings?

The number of columns would be much less than the ROW+COL

> There is a bloom at the start of every row already, to speed deletes. IIRC,
> we always read this first before we do anything. Perhaps we could beef it
> up with more than just delete?
>

Have seen something like that in the code. Still trying to better understand it.

>
> St.Ack
>
> >
> > Jerry
> >
> > On Thu, Dec 3, 2015 at 9:01 AM, Stack <[email protected]> wrote:
> >
> > > On Wed, Dec 2, 2015 at 10:01 PM, Jerry He <[email protected]> wrote:
> > >
> > > > Thanks for the response. You got my question correctly.
> > > > If we are scanning the rows one by one and we have the requested column in
> > > > the column tracker, we have the row+column to look up in the bloom filter,
> > > > don't we? We may not be able to filter out the file scanners upfront. But
> > > > may at the later time and lower level to skip something?
> > > >
> > >
> > > <I've not looked at the code>You are right. If more than one explicit
> > > column specified, we could do a bloom check for the second and so on since
> > > we'd have the current row to hand. It could make for a nice speedup for
> > > scans of many explicit columns traversing a dataset that is sparsely
> > > populated.</I've not looked at the code>.
> > >
> > > St.Ack
> > >
> > > >
> > > > Jerry
> > > >
> > > > On Mon, Nov 30, 2015 at 10:55 PM, Stack <[email protected]> wrote:
> > > >
> > > > > On Mon, Nov 30, 2015 at 9:56 AM, Jerry He <[email protected]> wrote:
> > > > >
> > > > > > Hi, experts
> > > > > >
> > > > > > HBASE supports ROWCOL bloom filter. ROW+COL would be the bloom key.
> > > > > > In most of the documentations, it says only GET would benefit. For
> > > > > > multi-column as well.
> > > > > >
> > > > > > If I do scan with StartRow and EndRow, and also specify columns.
> > > > > > Would ROWCOL bloom filter provide any benefit in anyway?
> > > > > >
> > > > >
> > > > > If I understand your question properly, the answer is no. While we might
> > > > > have a set of columns to check in the bloom, we'd not know the set of rows
> > > > > between start and end row and so would not be able to formulate a query
> > > > > against the ROW+COL bloom filter.
> > > > >
> > > > > St.Ack
> > > > >
> > > > > >
> > > > > > Thank you.
> > > > > >
> > > > > > Jerry
> > > > > >
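
For readers following the thread, the two cases discussed above can be illustrated with a toy sketch. This is not HBase code: `ToyBloom`, `rowcol_key`, `get_can_skip_file` and `columns_worth_seeking` are hypothetical names made up for this example, and HBase's real bloom implementation and key encoding differ. It only shows why a ROW+COL bloom needs a concrete row to form a query: a Get has one upfront, while a range scan only has one once it is positioned on a row.

```python
import hashlib


class ToyBloom:
    """Tiny Bloom filter for illustration only (not HBase's implementation)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def rowcol_key(row, col):
    # Assumed encoding: length-prefix the row so (b"a", b"bc") != (b"ab", b"c").
    return len(row).to_bytes(4, "big") + row + col


# One hypothetical "store file" holding a sparse set of cells.
cells = [(b"row1", b"cf:a"), (b"row2", b"cf:b"), (b"row5", b"cf:a")]
bloom = ToyBloom()
for row, col in cells:
    bloom.add(rowcol_key(row, col))


def get_can_skip_file(row, col):
    # Get: row and column are both known upfront, so the file can be ruled out.
    return not bloom.might_contain(rowcol_key(row, col))


def columns_worth_seeking(current_row, requested_cols):
    # Scan: rows between start and end are unknown upfront, but once the
    # scanner sits on current_row, each explicit column can be checked.
    return [c for c in requested_cols
            if bloom.might_contain(rowcol_key(current_row, c))]
```

With a bloom keyed on ROW+COL there is nothing to look up before the scan reaches a row, which is the "no" in the first answer; the per-row check in `columns_worth_seeking` corresponds to the later suggestion of skipping work for the second and subsequent explicit columns on a sparsely populated dataset.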
