hi all,
to better understand how hbase works i started reading this document
http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture
and created some diagrams
here they are (png, and svg for editing):
1) hbase hierarchy of objects:
http://www.starline.com.pl/hbase/habase_hierarchy.png
http://www.starline.com.pl/hbase/habase_hierarchy.svg
2) hbase architecture (relations between objects)
http://www.starline.com.pl/hbase/habase_architecture.png
http://www.starline.com.pl/hbase/habase_architecture.svg
3) visual representation of the flush cache operation
http://www.starline.com.pl/hbase/hbase_flush_cache.png
http://www.starline.com.pl/hbase/hbase_flush_cache.svg
since the documentation says that its information may be out of date
please feel free to comment on these diagrams, update them, put them on
your sites etc
i have a question too
let's say we have a cluster of 3 machines:
- 1 master + region server, and
- 2 region servers
on each machine i have a web server that uses an hbase client to put
information into hbase and get it back out
it is not clear to me where these clients should connect
should all clients connect directly and only to the master, which will
tell them which region server holds the information they are looking for?
or can they connect to the region servers, and if the information they
are looking for is not on that region server, will it contact the master
and fetch the information for the client?
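to make the two options concrete, here is a toy python sketch of what i mean. it is only a model of the question: the class names (Master, RegionServer), the helper functions and the row keys are all made up by me, and none of this is the real hbase api or a claim about how hbase actually routes requests

```python
# toy model of the two options above; Master, RegionServer and the
# row keys are made up for illustration, this is NOT the hbase api

class RegionServer:
    def __init__(self, name, rows):
        self.name = name
        self.rows = rows      # row key -> value held by this server
        self.master = None    # set when the server registers with a master

    def get(self, key):
        # plain local read
        return self.rows[key]

    def get_or_forward(self, key):
        # option 2: serve locally, otherwise ask the master who has the
        # key and fetch it from that server on behalf of the client
        if key in self.rows:
            return self.rows[key]
        return self.master.locate(key).get(key)


class Master:
    def __init__(self):
        self.servers = []

    def register(self, server):
        self.servers.append(server)
        server.master = self

    def locate(self, key):
        # tells the caller which region server holds the key
        for server in self.servers:
            if key in server.rows:
                return server
        raise KeyError(key)


def client_get_via_master(master, key):
    # option 1: ask the master for the location, then read from that server
    return master.locate(key).get(key)


def client_get_via_any_server(server, key):
    # option 2: talk to whichever region server happens to be local
    return server.get_or_forward(key)


# the 3 machine cluster from the question, one region server per machine
master = Master()
rs1 = RegionServer("rs1", {"a": 1})
rs2 = RegionServer("rs2", {"m": 2})
rs3 = RegionServer("rs3", {"x": 3})
for rs in (rs1, rs2, rs3):
    master.register(rs)

print(client_get_via_master(master, "m"))      # option 1
print(client_get_via_any_server(rs1, "x"))     # option 2, key lives on rs3
```

in option 1 the client has to know the master's address, in option 2 it only has to know its local region server. which of the two (if either) matches hbase is exactly what i am asking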
krzysiek