> - I'd like to do the top N stuff on the server side to reduce traffic,
> will this be possible?
Endpoint?

2012/6/5 Em <[email protected]>

> Hello list,
>
> let's say I have to fetch a lot of rows for a page-request (say
> 1,000-2,000).
> The row-keys are a composition of a fixed id of an object and a
> sequential ever-increasing id. Salting those keys for balancing may be
> taken into consideration.
>
> I want to do a Join like this one expressed in SQL:
>
> SELECT t1.columns FROM t1
> JOIN t2 ON (t1.id = t2.id)
> WHERE t2.id = fixedID-prefix
>
> I know that HBase does not support that out of the box.
> My approach is to have all the fixed-ids as columns of a row in t1.
> Selecting a row, I fetch those columns that are of interest for me,
> where each column contains a fixedID for t2.
> Now I do a scan on t2 for each fixedID which should return me exactly
> one value per fixedID (it's kind of a reverse-timestamp-approach like in
> the HBase-book).
> Furthermore I am really only interested in the key itself. I don't care
> about the columns (t2 is more like an index).
> Having fetched a row per fixedID, I sort based on the sequential part of
> their key and get the top N.
> For those top N I'll fetch data from t1.
>
> The use case is to fetch the top N most recent entities of t1 that are
> associated with a specific entity in t1 by using t2 as an index.
> T2 has one extra benefit over t1: You can do range-scans, if necessary.
>
> Questions:
> - since this is triggered by a page-request: Will this return with low
> latency?
> - is there a possibility to do those Scans in a batch? Maybe I can
> combine them into one big scanner, using a custom filter for what I want?
> - do you have thoughts on improving this type of request?
> - I'd like to do the top N stuff on the server side to reduce traffic,
> will this be possible?
> - I am not sure whether a Scan is really what I want. Maybe a Multiget
> will fit my needs better combined with a RowFilter?
>
> I really work hard on finding the best approach of mapping this
> m:n-relation to a HBase schema - so any help is appreciated.
>
> Please note: I haven't written any line of HBase code so far. Currently
> I am studying books, blog-posts, slides and the mailing lists for
> learning more about HBase.
>
> Thanks!
>
> Kind regards,
> Em
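
For illustration only, here is a rough sketch of the selection step described in the quoted message: one newest key per fixedID, sorted by the sequential part, keep the top N. It is plain Java with no HBase calls, so it says nothing about how a coprocessor endpoint or a scan would actually be wired up. The class name TopNSketch, the IndexKey type, the "-" separator and the zero-padded "Long.MAX_VALUE - sequence" encoding are all assumptions made for the example, not something taken from the thread; the encoding also assumes non-negative sequence ids.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class TopNSketch {

    /** Hypothetical in-memory form of a t2 row key: fixed id plus sequential id. */
    static final class IndexKey {
        final String fixedId;
        final long sequence;

        IndexKey(String fixedId, long sequence) {
            this.fixedId = fixedId;
            this.sequence = sequence;
        }

        /**
         * Assumed encoding: the sequence is stored reversed and zero-padded to
         * 19 digits, so that newer keys sort first in lexicographic order.
         */
        String toRowKey() {
            return fixedId + "-" + String.format("%019d", Long.MAX_VALUE - sequence);
        }

        /** Inverse of toRowKey(); splits on the last '-' in the key. */
        static IndexKey fromRowKey(String rowKey) {
            int sep = rowKey.lastIndexOf('-');
            long reversed = Long.parseLong(rowKey.substring(sep + 1));
            return new IndexKey(rowKey.substring(0, sep), Long.MAX_VALUE - reversed);
        }
    }

    /** Returns up to n keys with the highest sequence, newest first. */
    static List<IndexKey> topN(Iterable<IndexKey> newestPerFixedId, int n) {
        List<IndexKey> result = new ArrayList<>();
        if (n <= 0) {
            return result;
        }
        Comparator<IndexKey> bySequence =
                Comparator.comparingLong((IndexKey k) -> k.sequence);
        // Min-heap bounded to n entries: the oldest candidate sits on top
        // and is evicted as soon as the heap grows past n.
        PriorityQueue<IndexKey> heap = new PriorityQueue<>(bySequence);
        for (IndexKey key : newestPerFixedId) {
            heap.offer(key);
            if (heap.size() > n) {
                heap.poll();
            }
        }
        result.addAll(heap);
        result.sort(bySequence.reversed());
        return result;
    }
}
```

Feeding topN the one key found per fixedID and, say, n = 10 yields the ten newest keys, which are then the only rows that need to be fetched from t1. Wherever this loop runs, the bounded heap keeps memory at N entries no matter how many fixedIDs are checked.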
