Hi,

I had to spend a lot of time looking into the source code of HTable and 
HConnectionManager. 
IMHO, the documentation on the HBase website seems misleading. The HBase online 
book (http://hbase.apache.org/book.html#architecture.client) says:
==============================================================
For example, this is preferred:

HBaseConfiguration conf = HBaseConfiguration.create();
HTable table1 = new HTable(conf, "myTable");
HTable table2 = new HTable(conf, "myTable");

as opposed to this:

HBaseConfiguration conf1 = HBaseConfiguration.create();
HTable table1 = new HTable(conf1, "myTable");
HBaseConfiguration conf2 = HBaseConfiguration.create();
HTable table2 = new HTable(conf2, "myTable");
===============================================================
After checking the source code, it seems that only in 0.20 must HTable 
instances use the same Configuration instance in order to share an HConnection. 
0.20 uses the Configuration instance as the key of a hashmap that stores the 
HConnections. I checked the 0.90.0 code, and it already uses HConnectionKey as 
the key of the HashMap that stores the shared HConnections. 

So as far as I understand, the document is NOT true for HBase 0.90 and later. 
Both examples can share an HConnection instance. If I am wrong, please correct 
me.  
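
To make the difference concrete, here is a minimal toy model of the two caching 
schemes as I described them above. It is not HBase code: ToyConf, ToyConnection 
and ToyKey are hypothetical stand-ins for Configuration, HConnection and 
HConnectionKey, and "zk-host" is a made-up value. It only shows why a map keyed 
by the configuration instance cannot share a connection between two separately 
created but identical configurations, while a map keyed by a value derived from 
the configuration contents can.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these types are HBase classes.
class ConnectionCacheSketch {

    // Stand-in for a Configuration. It does not override equals()/hashCode(),
    // so as a map key it is compared by object identity.
    static final class ToyConf {
        final Map<String, String> props = new HashMap<>();

        ToyConf set(String key, String value) {
            props.put(key, value);
            return this;
        }
    }

    // Stand-in for a connection object.
    static final class ToyConnection { }

    // Hypothetical value-based key built from the configuration's contents.
    static final class ToyKey {
        private final Map<String, String> props;

        ToyKey(ToyConf conf) {
            this.props = new HashMap<>(conf.props); // snapshot of the settings
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof ToyKey && props.equals(((ToyKey) other).props);
        }

        @Override
        public int hashCode() {
            return props.hashCode();
        }
    }

    // Cache keyed by the configuration instance (the 0.20 scheme as I read it).
    static final Map<ToyConf, ToyConnection> cacheByInstance = new HashMap<>();

    // Cache keyed by a value-based key (the 0.90+ scheme as I read it).
    static final Map<ToyKey, ToyConnection> cacheByKey = new HashMap<>();

    static ToyConnection getByInstance(ToyConf conf) {
        return cacheByInstance.computeIfAbsent(conf, c -> new ToyConnection());
    }

    static ToyConnection getByKey(ToyConf conf) {
        return cacheByKey.computeIfAbsent(new ToyKey(conf), k -> new ToyConnection());
    }

    public static void main(String[] args) {
        // Two distinct objects with identical settings.
        ToyConf conf1 = new ToyConf().set("hbase.zookeeper.quorum", "zk-host");
        ToyConf conf2 = new ToyConf().set("hbase.zookeeper.quorum", "zk-host");

        System.out.println("instance key shares: "
                + (getByInstance(conf1) == getByInstance(conf2)));
        System.out.println("value key shares: "
                + (getByKey(conf1) == getByKey(conf2)));
    }
}
```

In this model the first lookup pair returns two different connections and the 
second pair returns the same one, which matches my reading that both documented 
examples share a connection from 0.90 on.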

As for my previous question: if two HTables already share the HConnection, why 
do I need to create an HConnection first with 
HConnectionManager.createConnection()?
From reading the source code, it seems that HTable.close() also closes the 
HConnection, so once one table is closed, the following HTables have to 
reconnect and nothing is shared. But if the HTable is obtained from 
HConnection.getTable(), a special HTable constructor is used to make sure that 
HTable.close() does NOT close the connection. So the HConnection can be shared.
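
The close behaviour described above can be sketched with another toy model. 
Again, this is not HBase code: ToyConnection and ToyTable are hypothetical 
stand-ins, and the closesConnection flag is only my way of modelling the 
"special constructor"; the real client's bookkeeping is more involved. The 
point is just who is responsible for closing the connection.

```java
// Toy model only: none of these types are HBase classes.
class CloseOwnershipSketch {

    // Stand-in for a connection with an explicit life cycle.
    static final class ToyConnection {
        private boolean closed;

        // Tables handed out by the connection never close it.
        ToyTable getTable(String name) {
            return new ToyTable(this, name, false);
        }

        void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    // Stand-in for a table handle.
    static final class ToyTable {
        private final ToyConnection connection;
        private final String name;
        private final boolean closesConnection; // hypothetical ownership flag

        ToyTable(ToyConnection connection, String name, boolean closesConnection) {
            this.connection = connection;
            this.name = name;
            this.closesConnection = closesConnection;
        }

        String name() {
            return name;
        }

        // Closing the table closes the connection only if the table owns it.
        void close() {
            if (closesConnection) {
                connection.close();
            }
        }
    }

    public static void main(String[] args) {
        // Case 1: the table owns its connection, so closing the table closes it.
        ToyConnection implicit = new ToyConnection();
        ToyTable owned = new ToyTable(implicit, "hbase_table1", true);
        owned.close();
        System.out.println("owned table closed connection: " + implicit.isClosed());

        // Case 2: tables come from a shared connection and leave it open.
        ToyConnection shared = new ToyConnection();
        ToyTable first = shared.getTable("hbase_table1");
        ToyTable second = shared.getTable("hbase_table1");
        first.close();
        System.out.println("shared connection closed early: " + shared.isClosed());
        second.close();
        shared.close(); // the caller decides when the shared connection ends
    }
}
```

In the second case the caller, not the table, controls the connection's life 
cycle, which is the role I understand HConnectionManager.createConnection() to 
play.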

I will use the recommended method. As discussed in another thread here, to 
share an HConnection one still has to ensure that the shared connection is not 
closed. So HConnectionManager is a good abstraction for controlling the life 
cycle of a connection. I seem to understand now :-) 

Thanks,
Ming


-----Original Message-----
From: Liu, Ming (HPIT-GADSC) 
Sent: Saturday, February 14, 2015 10:45 PM
To: [email protected]
Subject: HTable or HConnectionManager, how a client connect to HBase?

Hi,

I am using HBase 0.98.6.

I learned from this mailing list before that the recommended way to 'connect' 
to HBase from a client is to use HConnectionManager like this:

    HConnection con = HConnectionManager.createConnection(configuration);
    HTableInterface table = con.getTable("hbase_table1");

instead of

    HTableInterface table = new HTable(configuration, "hbase_table1");

I don't quite understand the reason. I thought that each time I create an 
HTable instance, it needs to create a new HConnection, and that is expensive. 
With the first method, multiple HTable instances can share the same 
HConnection. That seems quite reasonable to me.
However, I read in some articles on the internet that even if I use 
'new HTable(conf, tbl)', as long as the 'conf' object is the same one, all the 
HTable instances will still share the same HConnection. I recently read yet 
another article which said that with 'new HTable(conf, tbl)' one doesn't need 
to use exactly the same 'conf' object (the same one in memory): if two 
different 'conf' objects are equal, meaning all their attributes are the same 
(for example, both created from the same hbase-site.xml and never changed), 
then the HTable objects can still share the same HConnection. I also tried to 
read the HTable source code. It is very hard, but it seems to me the last 
statement is correct: 'HTable will share HConnection, if configuration is all 
the same'.

Sorry for being so verbose. My question:
If two 'configuration' objects are the same, can two HTable objects 
instantiated with them respectively, directly via 'new HTable()', still share 
the same HConnection or not?
If the answer is 'yes', then why do I still need HConnectionManager to create 
a shared connection?
I am talking about 0.98.6.
I googled for days and even tried to read the HBase source code, but I am still 
really confused. I also tried to do some tests, but since I am a newbie, I 
don't know how to verify the difference; I really don't know what an 
HConnection does under the hood. I counted the ZooKeeper client requests and 
found some differences. If the ZooKeeper request count is a valid metric, it 
means to me that two HTables do not share an HConnection even when the same 
'configuration' is used in the constructor. So I got more and more confused....

Could someone kindly help me with this newbie question? Thanks in advance.

Thanks,
Ming

