Then you need to use Ted's approach, because the 2 filters you listed cannot both pass under MUST_PASS_ALL, as you said in your first message.
You might be able to merge your other filters into the RegEx?

JM


2014-06-06 17:17 GMT-04:00 Vrushali C:

> Thanks for the discussion! This helps me understand these filters better.
>
> FWIW, I need to have a MUST_PASS_ALL since I have some other filters as
> well in this scan.
>
>
> On Friday, June 6, 2014 9:18 AM, Ted Yu wrote:
>
>
> bq. to do some test to compare the 2 solutions against the dataset.
>
> We're on the same page, JMS.
>
>
> On Fri, Jun 6, 2014 at 5:00 AM, Jean-Marc Spaggiari wrote:
>
> > Yep, it's exactly my point. In one case we call 2 binary comparators (very
> > fast), in the other case we call one regex comparator (slower). Now,
> > depending on the size of the strings, the column names, etc., one solution
> > might be faster than the other. But I cannot tell which one. And I was
> > just suggesting to do some tests to compare the 2 solutions against the
> > dataset.
> >
> >
> > 2014-06-05 22:08 GMT-04:00 Ted Yu:
> >
> > > For the FilterList approach, in a row where no qualifier starts with 'c!',
> > > each qualifier would go through both sub-filters.
> > >
> > > For RegexStringComparator, each qualifier in such a row would be evaluated
> > > once - since the prefix doesn't match, the result is drawn quickly.
> > >
> > > Cheers
> > >
> > >
> > > On Thu, Jun 5, 2014 at 5:33 PM, Jean-Marc Spaggiari wrote:
> > >
> > > > I just re-used what Vrushali sent. I wrote that in the email so it might
> > > > not compile. But it will give the idea.
> > > >
> > > > FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ONE);
> > > > // Passes any qualifier that does NOT start with "c!".
> > > > QualifierFilter filter1 = new QualifierFilter(
> > > >     CompareFilter.CompareOp.NOT_EQUAL,
> > > >     new BinaryPrefixComparator(
> > > >         Bytes.add(Bytes.toBytes("c!"), Constants.SEP_BYTES)));
> > > > list.addFilter(filter1);
> > > >
> > > > // Passes any qualifier that DOES start with "c!someName".
> > > > QualifierFilter filter2 = new QualifierFilter(
> > > >     CompareFilter.CompareOp.EQUAL,
> > > >     new BinaryPrefixComparator(
> > > >         Bytes.add(Bytes.toBytes("c!someName"), Constants.SEP_BYTES)));
> > > > list.addFilter(filter2);
> > > > scan.setFilter(list);
> > > >
> > > >
> > > > To pass the first, the value should NOT start with c!.
> > > > To pass the 2nd, the value SHOULD start with c!someName.
> > > >
> > > > So c!notThis will fail the first since it starts with c!, and it will
> > > > fail the second since it does not start with c!someName.
> > > >
> > > > Make sense?
> > > >
> > > >
> > > > 2014-06-05 20:27 GMT-04:00 Ted Yu:
> > > >
> > > > > If we test c!notThis first will give false, second too. We reject.
> > > > > If we test d!this first will give true, second false. We take it.
> > > > >
> > > > > Assuming the first filter compares against c!someName (negated), why
> > > > > would 'c!notThis' give false ?
> > > > >
> > > > > Mind showing the definition of the FilterList ?
> > > > >
> > > > > Cheers
> > > > >
> > > > >
> > > > > On Thu, Jun 5, 2014 at 4:52 PM, Jean-Marc Spaggiari wrote:
> > > > >
> > > > > > He wants to exclude everything starting with "c!" and keep c!someName.
> > > > > >
> > > > > > So. First filter is a NOT, second is an include.
> > > > > >
> > > > > > If we test c!notThis first will give false, second too. We reject.
> > > > > > If we test d!this first will give true, second false. We take it.
> > > > > > If we test c!someName first will give false, second will give true.
> > > > > > We take it.
> > > > > >
> > > > > > Am I missing something? It's possible because it's confusing ;) But I
> > > > > > think it might work.
> > > > > >
> > > > > > JM
> > > > > >
> > > > > >
> > > > > > 2014-06-05 19:47 GMT-04:00 Ted Yu:
> > > > > >
> > > > > > > MUST_PASS_ONE represents boolean OR operator.
> > > > > > >
> > > > > > > According to Vrushali's description, "c!someName" should be excluded.
> > > > > > >
> > > > > > > Would MUST_PASS_ONE achieve what Vrushali wanted ?
> > > > > > >
> > > > > > > Cheers
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Jun 5, 2014 at 4:33 PM, Jean-Marc Spaggiari wrote:
> > > > > > >
> > > > > > > > I will still give a try to the 2 filters option.
> > > > > > > >
> > > > > > > > RegEx are nice and powerful but very expensive. It's non trivial. While
> > > > > > > > the prefix comparator is pretty simple and fast. So I'm not sure which
> > > > > > > > of the 2 options will be faster.
> > > > > > > >
> > > > > > > > My opinion: Code wise, RegEx will be simpler, 2 filters will be faster.
> > > > > > > >
> > > > > > > >
> > > > > > > > 2014-06-05 18:55 GMT-04:00 Ted Yu:
> > > > > > > >
> > > > > > > > > You're welcome.
> > > > > > > > >
> > > > > > > > > Filters / comparators shipped with HBase are pretty powerful.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Jun 5, 2014 at 3:04 PM, Vrushali C wrote:
> > > > > > > > >
> > > > > > > > > > Thanks Ted! Using that regex comparator helped me resolve this.
> > > > > > > > > > Appreciate it very much!
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thursday, June 5, 2014 2:23 PM, Ted Yu wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Or, you can use RegexStringComparator.
> > > > > > > > > >
> > > > > > > > > > Here is a regex string, in Java, that matches columns with prefix c!
> > > > > > > > > > except column called c!someName :
> > > > > > > > > >
> > > > > > > > > > "^c\\!((?!someName).)*$"
> > > > > > > > > >
> > > > > > > > > > Cheers
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Jun 5, 2014 at 1:26 PM, Ted Yu wrote:
> > > > > > > > > >
> > > > > > > > > > > One option is to write your own Comparator (similar to
> > > > > > > > > > > BinaryPrefixComparator in essence) that treats the known column name
> > > > > > > > > > > specially.
> > > > > > > > > > >
> > > > > > > > > > > Cheers
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Jun 5, 2014 at 12:52 PM, Vrushali C wrote:
> > > > > > > > > > >
> > > > > > > > > > >> Hi
> > > > > > > > > > >> Is there a way to do this kind of filtering: In my scan, I want to
> > > > > > > > > > >> retrieve all columns except for columns starting with a certain prefix.
> > > > > > > > > > >> But within that set of columns being ignored, I have one known column
> > > > > > > > > > >> name that I want to retrieve, ignoring the rest. The reason is that
> > > > > > > > > > >> columns with this prefix have a lot of data and I am not interested in
> > > > > > > > > > >> any of them EXCEPT one.
> > > > > > > > > > >>
> > > > > > > > > > >> So for ignoring the columns with a certain prefix in the scan, I am
> > > > > > > > > > >> doing something like
> > > > > > > > > > >> filters.addFilter(
> > > > > > > > > > >>     new QualifierFilter(CompareFilter.CompareOp.NOT_EQUAL,
> > > > > > > > > > >>         new BinaryPrefixComparator(
> > > > > > > > > > >>             Bytes.add(Bytes.toBytes("c!"), Constants.SEP_BYTES))));
> > > > > > > > > > >>
> > > > > > > > > > >> Which works. But what I also want to add is something like this
> > > > > > > > > > >>
> > > > > > > > > > >> filters.addFilter(
> > > > > > > > > > >>     new QualifierFilter(CompareFilter.CompareOp.EQUAL,
> > > > > > > > > > >>         new BinaryPrefixComparator(
> > > > > > > > > > >>             Bytes.add(Bytes.toBytes("c!someName"), Constants.SEP_BYTES))));
> > > > > > > > > > >>
> > > > > > > > > > >> I realize both filters are contradictory to each other, so how do I
> > > > > > > > > > >> achieve this?
> > > > > > > > > > >>
> > > > > > > > > > >> thanks
> > > > > > > > > > >> Vrushali
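
For anyone reading this thread later, the selection logic of the two approaches can be compared without a cluster. The following is only a rough sketch in plain Java: it does not use HBase at all, the class and method names are made up for illustration, and it leaves out the Constants.SEP_BYTES suffix from the original code. It models the two-filter MUST_PASS_ONE list as an OR of two String.startsWith checks, and the regex approach as "keep the qualifier when Ted's pattern does not match", which is assumed to be how a NOT_EQUAL comparison against that pattern would behave.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the two approaches discussed in the thread.
// No HBase classes are used; qualifiers are modelled as plain strings.
class QualifierSelectionSketch {
    static final String PREFIX = "c!";        // prefix to exclude
    static final String KEPT = "c!someName";  // the one column to keep

    // Regex quoted from the thread: matches "c!..." unless "someName"
    // appears anywhere after the prefix.
    static final Pattern EXCLUDED = Pattern.compile("^c\\!((?!someName).)*$");

    // Two prefix checks combined with OR, mirroring MUST_PASS_ONE.
    static boolean keepByPrefixes(String qualifier) {
        boolean notPrefixed = !qualifier.startsWith(PREFIX); // NOT_EQUAL on "c!"
        boolean isKept = qualifier.startsWith(KEPT);         // EQUAL on "c!someName"
        return notPrefixed || isKept;
    }

    // Keep the qualifier when the exclusion regex does not match it.
    static boolean keepByRegex(String qualifier) {
        return !EXCLUDED.matcher(qualifier).find();
    }

    public static void main(String[] args) {
        String[] samples = {"c!notThis", "d!this", "c!someName", "c!xsomeName"};
        for (String q : samples) {
            System.out.println(q + " prefixes=" + keepByPrefixes(q)
                    + " regex=" + keepByRegex(q));
        }
    }
}
```

On the three qualifiers used in the thread both checks agree: c!notThis is dropped, d!this and c!someName are kept. They can differ on other names, though. The regex rejects "someName" anywhere after the prefix, so a qualifier such as c!xsomeName is kept by the regex check but dropped by the prefix check. This says nothing about which approach is faster; that still needs a test against the real dataset, as suggested above.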
