You could also do this easily with MapReduce, using Pig's HBaseStorage to load both tables and then either an inner join (if you want the matches) or an outer join followed by a filter on null (if you want the misses).
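
To make the join semantics concrete, here is a rough sketch in plain Python. It is not Pig and it does not talk to HBase. The two dicts stand in for the row ids of the two tables, and the names (table_a, table_b, inner_join_keys, outer_join_null_keys) are made up for illustration. It only shows which row ids each kind of join would leave you with, using the {1, 2, 3} and {2, 3, 4} example from the original question.

```python
# Hypothetical in-memory stand-ins for the two HBase tables:
# each dict maps a row id to some arbitrary row payload.
table_a = {1: "a1", 2: "a2", 3: "a3"}
table_b = {2: "b2", 3: "b3", 4: "b4"}


def inner_join_keys(left, right):
    """Row ids present in both tables (the 'matches')."""
    return sorted(key for key in left if key in right)


def outer_join_null_keys(left, right):
    """Row ids in `left` with no partner in `right` (the 'misses').

    Mimics a left outer join followed by a filter that keeps only
    the rows whose right-hand side came back as null.
    """
    joined = [(key, key if key in right else None) for key in left]
    return sorted(left_key for left_key, right_key in joined if right_key is None)


matches = inner_join_keys(table_a, table_b)
misses = outer_join_null_keys(table_a, table_b)
print("matches:", matches)
print("misses:", misses)
```

In a real job the join itself would be done by Pig over the data loaded through HBaseStorage. The sketch is only meant to show why the inner join answers the "intersection" question and the outer join plus null filter answers the "present in one and not in the other" question.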

On Fri, Mar 11, 2011 at 4:25 PM, Usman Waheed <usm...@opera.com> wrote:
> I suggest it to be ROWCOL because you have many columns to match against in
> your second table (column qualifiers).
>
> -Usman
>
>> Should the Bloom filter be ROW or ROWCOL?
>>
>> Vishal
>>
>> On Fri, Mar 11, 2011 at 11:44 AM, Lars George <lars.geo...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> If you expect a lot of misses with that approach then enable bloom filters
>>> on the second table for fast lookups of misses.
>>>
>>> Lars
>>>
>>> On Mar 11, 2011, at 9:44, Amandeep Khurana <ama...@gmail.com> wrote:
>>>
>>> > You can scan through one table and see if the other one has those rowids
>>> > or not.
>>> >
>>> > On Thu, Mar 10, 2011 at 8:08 PM, Vishal Kapoor
>>> > <vishal.kapoor...@gmail.com> wrote:
>>> >
>>> >> Friends,
>>> >> how do I best achieve intersection of sets of row ids
>>> >> suppose I have two tables with similar row ids
>>> >> how can I get the row ids present in one and not in the other?
>>> >> does things get better if I have row ids as values in some qualifier/
>>> >> qualifier itself?
>>> >> I hope the question is not too confusing...
>>> >>
>>> >> intersection of {1, 2, 3} and {2, 3, 4} is {2, 3}.
>>> >> while {1,2,3} are row ids from a table, {2,3,4} may come from other table
>>> >> as qualifiers in some row.
>>> >>
>>> >> thanks,
>>> >> Vishal
>>> >>
>>> >
>
> --
> Using Opera's revolutionary email client: http://www.opera.com/mail/