Hi,

I am developing a MapReduce task that operates on a very big HBase table. Each run, however, covers only a relatively small subset of this table. I have read that HBase stores rows in alphabetical order of their keys. The rows of interest for a particular MapReduce task always form contiguous areas within the table stored in that order. A binary search over the entire table would give the desired performance, so I suppose there must be some mechanism to achieve this.
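To make the idea concrete, here is a minimal sketch of the access pattern I have in mind, assuming the rows of interest share a common key prefix. It uses a plain Java TreeMap as a stand-in for the sorted table and is not HBase API; the class and method names (PrefixRangeSketch, stopKeyForPrefix, rowsWithPrefix) and the sample keys are hypothetical, chosen only for illustration.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class PrefixRangeSketch {

    // Smallest key that sorts after every key starting with `prefix`,
    // or null if no such bound exists (empty prefix, or only \uffff chars).
    public static String stopKeyForPrefix(String prefix) {
        int end = prefix.length();
        // Trailing \uffff characters cannot be incremented, so drop them.
        while (end > 0 && prefix.charAt(end - 1) == Character.MAX_VALUE) {
            end--;
        }
        if (end == 0) {
            return null;
        }
        char bumped = (char) (prefix.charAt(end - 1) + 1);
        return prefix.substring(0, end - 1) + bumped;
    }

    // View of the rows whose key starts with `prefix`. Both bounds are
    // located by a tree search, so rows outside the range are never visited.
    public static NavigableMap<String, String> rowsWithPrefix(
            NavigableMap<String, String> table, String prefix) {
        String stop = stopKeyForPrefix(prefix);
        return stop == null
                ? table.tailMap(prefix, true)
                : table.subMap(prefix, true, stop, false);
    }

    public static void main(String[] args) {
        // Stand-in for the sorted table (hypothetical keys and values).
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("user1|a", "v1");
        table.put("user2|x", "v2");
        table.put("user2|y", "v3");
        table.put("user3|a", "v4");

        // prints [user2|x, user2|y]
        System.out.println(rowsWithPrefix(table, "user2|").keySet());
    }
}
```

In other words, the prefix is turned into a half-open key range [prefix, stop), and only that range is read. What I am looking for is the equivalent of this for the input of a MapReduce job over an HBase table.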
I am wondering how I can provide the input for such a task. I have found the RowFilters mechanism. What performance can I expect from a solution that uses it? I want only the rows whose keys start with a given prefix, and I want to avoid scanning the entire table, which filters seem to do. Do you have any suggestions? I tried to find some information in the archives of this mailing list, since this seems to be quite an obvious problem, but I have not found anything.

Thank you,
Piotr
