Server-side Row-level Inverted Index Join via Coprocessors
----------------------------------------------------------
Key: HBASE-3342
URL: https://issues.apache.org/jira/browse/HBASE-3342
Project: HBase
Issue Type: New Feature
Reporter: Jonathan Gray
A common schema in HBase is to create an inverted index per row (a la inbox
search), where a row is a user/entity, each column is a word, and each version
is an instance of that word in a document (values can be empty or can carry
additional scoring information such as position or count).
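As a rough illustration of that layout, here is an in-memory model of a single
row in plain Java (no HBase classes). The class and method names are
hypothetical, and the sketch assumes that a version's timestamp identifies the
document, so the same document carries the same timestamp in every word column
it appears in.

```java
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for one index row (one user/entity).
public class InvertedIndexRow {
    // word (column qualifier) -> versions, newest first.
    // Each version maps a timestamp (assumed to identify the document)
    // to an optional payload such as position or count; it may be empty.
    private final NavigableMap<String, NavigableMap<Long, byte[]>> columns =
            new TreeMap<String, NavigableMap<Long, byte[]>>();

    // Record that the document with this timestamp contains the word.
    public void addPosting(String word, long docTimestamp, byte[] value) {
        columns.computeIfAbsent(word,
                w -> new TreeMap<Long, byte[]>(Comparator.reverseOrder()))
               .put(docTimestamp, value);
    }

    // Exact word match: all versions of one column, newest first.
    public NavigableMap<Long, byte[]> versions(String word) {
        return columns.getOrDefault(word,
                new TreeMap<Long, byte[]>(Comparator.reverseOrder()));
    }

    // Prefix match: every column whose qualifier starts with the prefix.
    public NavigableMap<String, NavigableMap<Long, byte[]>> prefix(String prefix) {
        return columns.subMap(prefix, true, prefix + Character.MAX_VALUE, false);
    }
}
```

Because columns are sorted by qualifier, a prefix match is a contiguous range
of columns, and because versions are sorted newest first, "most recent" is a
matter of reading from the head of each column.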
When querying indexes like this, we may want to do something like: give me the
N most recent documents that contain the word "foo" (exact word matching) and
contain a word that starts with "bar" (prefix matching).
Currently this join has to be done on the client side, so we may have to read
far more than N documents for each word to be able to get N documents that
match both words. This gets worse as the number of words increases.
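To make the over-read concrete, here is a small simulation of the client-side
approach in plain Java. It is only a sketch under stated assumptions: posting
lists are newest-first lists of timestamps, a timestamp identifies a document,
fetch() stands in for a read limited to a maximum number of versions, and the
doubling retry policy is one possible strategy, not something the issue
prescribes. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ClientSideJoin {
    // Cells shipped to the client during the last topN call (illustrative).
    public static int cellsRead = 0;

    // Stand-in for reading the newest maxVersions versions of one column.
    static List<Long> fetch(List<Long> postings, int maxVersions) {
        List<Long> page = postings.subList(0, Math.min(maxVersions, postings.size()));
        cellsRead += page.size();
        return page;
    }

    // N most recent timestamps present in both newest-first lists.
    public static List<Long> topN(List<Long> a, List<Long> b, int n) {
        cellsRead = 0;
        int k = n;
        while (true) {
            List<Long> pa = fetch(a, k);
            List<Long> pb = fetch(b, k);
            Set<Long> inB = new HashSet<>(pb);
            List<Long> matches = new ArrayList<>();
            for (Long ts : pa) {
                if (inB.contains(ts)) {
                    matches.add(ts);
                }
            }
            // Matches found within the fetched pages are always newer than
            // any match that was missed, so n matches are enough to stop.
            boolean exhausted = pa.size() == a.size() && pb.size() == b.size();
            if (matches.size() >= n || exhausted) {
                return new ArrayList<>(matches.subList(0, Math.min(n, matches.size())));
            }
            k *= 2; // not enough matches yet: re-read with a larger window
        }
    }
}
```

With postings 8, 7, 6, 5, 4, 3, 2, 1 for one word and 8, 2, 1 for the other,
asking for N = 2 returns [8, 2] but takes three rounds and transfers 22 cells
to the client in this simulation.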
We could implement this join on the server-side in a coprocessor.
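One way the join logic itself might look is sketched below in plain Java. It
deliberately leaves out the coprocessor wiring and uses no HBase classes; the
class name, method names, and input shapes are hypothetical. It assumes each
column's versions are available as a newest-first list of distinct timestamps
and that a timestamp identifies a document. The exact word is intersected with
the union of all columns matching the prefix, and the merge stops as soon as N
matches are found.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class ServerSideJoin {
    // exact: newest-first timestamps for the exact word (e.g. "foo").
    // prefixColumns: one newest-first list per column matching the prefix
    // (e.g. "bar", "barn"). Returns the n most recent common timestamps.
    public static List<Long> topN(List<Long> exact,
                                  List<List<Long>> prefixColumns, int n) {
        // Heap of {column index, position} cursors, newest timestamp on top,
        // so the prefix columns are consumed as one merged stream.
        PriorityQueue<int[]> heap = new PriorityQueue<>((x, y) -> Long.compare(
                prefixColumns.get(y[0]).get(y[1]),
                prefixColumns.get(x[0]).get(x[1])));
        for (int i = 0; i < prefixColumns.size(); i++) {
            if (!prefixColumns.get(i).isEmpty()) {
                heap.add(new int[] {i, 0});
            }
        }
        List<Long> out = new ArrayList<>();
        int e = 0;
        while (out.size() < n && e < exact.size() && !heap.isEmpty()) {
            int[] cur = heap.peek();
            long p = prefixColumns.get(cur[0]).get(cur[1]);
            long x = exact.get(e);
            if (p > x) {
                advance(heap, prefixColumns); // prefix side is newer: skip it
            } else if (p < x) {
                e++;                          // exact side is newer: skip it
            } else {
                out.add(x);                   // document matches both terms
                e++;                          // duplicates on the prefix side
            }                                 // are skipped by the p > x branch
        }
        return out;
    }

    // Move the newest cursor one version further into its column.
    private static void advance(PriorityQueue<int[]> heap, List<List<Long>> cols) {
        int[] cur = heap.poll();
        if (cur[1] + 1 < cols.get(cur[0]).size()) {
            heap.add(new int[] {cur[0], cur[1] + 1});
        }
    }
}
```

Run next to the data, a merge like this would only need to return the N
matching documents to the client, instead of shipping whole pages of versions
for every word.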