It's not an actual hash or B-tree index. Secondary indexes in HBase
are implemented by creating an additional HBase table.
Say I have a table "users" (row key is userid) with family "data" and
column "email", and I want to index the value in that column.
I can create a table "users_email" where the row key is the email
address (the value from the column in the "users" table), with a single
column that contains the userid.
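
To make the layout concrete, here is a rough sketch of keeping the two tables in step on a write. It is plain Java, not the HBase client API: two sorted maps stand in for "users" and "users_email", the class and method names are made up for illustration, and it assumes each email belongs to at most one user.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for the two tables; NOT the HBase client API.
// Class and method names are hypothetical.
class UsersWithEmailIndex {
    // "users": row key = userid, value = the data:email column.
    final Map<String, String> users = new TreeMap<>();
    // "users_email": row key = email, value = userid.
    final Map<String, String> usersEmail = new TreeMap<>();

    // Write the user row and keep the index table consistent with it.
    // Assumes an email maps to a single userid.
    void putUser(String userId, String email) {
        String oldEmail = users.put(userId, email);
        if (oldEmail != null && !oldEmail.equals(email)) {
            // Drop the stale index row, but only if it still points at this user.
            usersEmail.remove(oldEmail, userId);
        }
        usersEmail.put(email, userId);
    }

    public static void main(String[] args) {
        UsersWithEmailIndex t = new UsersWithEmailIndex();
        t.putUser("u1", "alice@example.com");
        t.putUser("u1", "alice@example.org"); // email changes, index row follows
        System.out.println(t.usersEmail);     // prints {alice@example.org=u1}
    }
}
```

The point is only that every write to the indexed column turns into a write to the index table as well.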
Doing an "index lookup" would mean doing a get on "users_email" and then
using that userid to do a lookup on the "users" table.
IndexedTable does this transparently, but it still requires two
queries. So it's slower than a single get, but certainly faster than
a full table scan.
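
Here is a small sketch of the difference between the two-step index lookup and a scan. Again this is plain Java with maps standing in for the tables, not the IndexedTable or HBase client API; the class, the method names and the "data:name" column are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Maps stand in for the HBase tables; all names here are hypothetical.
class IndexLookupSketch {
    // Index lookup: one get on "users_email", then one get on "users".
    static Map<String, String> getByEmail(Map<String, String> usersEmail,
                                          Map<String, Map<String, String>> users,
                                          String email) {
        String userId = usersEmail.get(email);            // query 1: index table
        return userId == null ? null : users.get(userId); // query 2: base table
    }

    // No index: walk every row of "users" and compare the email column.
    static Map<String, String> scanByEmail(Map<String, Map<String, String>> users,
                                           String email) {
        for (Map<String, String> row : users.values()) {
            if (email.equals(row.get("data:email"))) {
                return row;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> users = new TreeMap<>();
        Map<String, String> usersEmail = new TreeMap<>();

        Map<String, String> row = new TreeMap<>();
        row.put("data:email", "alice@example.com");
        row.put("data:name", "Alice");
        users.put("u1", row);
        usersEmail.put("alice@example.com", "u1");

        // Both paths find the same row; only the amount of work differs.
        System.out.println(getByEmail(usersEmail, users, "alice@example.com"));
        System.out.println(scanByEmail(users, "alice@example.com"));
    }
}
```

The indexed path costs two keyed reads no matter how big "users" is, while the scan has to touch rows until it finds a match.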
If you need hash-level performance on the index lookup, there are lots
of solutions outside of HBase that would work: an in-memory Java HashMap,
Tokyo Cabinet on-disk hash tables, BerkeleyDB, etc. If you need full-text
indexing, you can use Lucene or the like.
Make sense?
JG
bharath vissapragada wrote:
But I have read somewhere that secondary indexes are somewhat slow compared
to normal HBase tables. Does that affect the performance?
Also, do you know the type of index created on the column (I mean hash,
B-tree, etc.)?
On Mon, Aug 17, 2009 at 8:30 PM, Kirill Shabunov <[email protected]> wrote:
Hi!
As far as I understand you are talking about the secondary indexes. Yes,
they can be used to quickly get the rowkey by a value in the indexed column.
--Kirill
bharath vissapragada wrote:
Hi all ,
I have gone through the IndexedTableAdmin classes in the HBase 0.19.3 API.
I have seen some methods used to create an indexed table (on some column).
I have some doubts regarding the same...
1) Are these somewhat similar to hash indexes (in an RDBMS), where I can
easily look up a column value and find its corresponding rowkey(s)?
2) Will I see any performance gain when I use IndexedTable to search for a
particular column value, instead of scanning an entire normal HTable?
Kindly clarify my doubts
Thanks in advance