Sorry Slava, 'RPC Lock' was misinformation on our part. Subsequent
digging turned up that the RPC layer keeps a pool of Connections, one
per remote host. A send and receive on that single Connection is
synchronous, but otherwise the Connection is idle. Primitive testing
suggests there is a benefit to having multiple HTable instances. How
much, we have yet to ascertain. With a large number of HTable
instances, or if the amount of data being passed is large, there would
start to be contention over the single Connection.
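To make the shape of that concrete, here is a rough sketch of "one Connection per remote host, one synchronous call at a time on each". It is a model only: SketchConnection and SketchConnectionPool are made-up names, not the HBase RPC classes, and the real client may well differ in the details.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an RPC connection; NOT the HBase class.
final class SketchConnection {
    final String host;
    private int calls;

    SketchConnection(String host) {
        this.host = host;
    }

    // Only one request/response exchange runs at a time on a connection;
    // other callers wait on the monitor until it is free.
    synchronized String call(String request) {
        calls++;
        return host + ":" + request;
    }

    synchronized int callCount() {
        return calls;
    }
}

// Hypothetical pool holding at most one connection per remote host.
final class SketchConnectionPool {
    private final Map<String, SketchConnection> byHost = new ConcurrentHashMap<>();

    // Returns the existing connection for the host, creating it on first use.
    SketchConnection get(String host) {
        return byHost.computeIfAbsent(host, SketchConnection::new);
    }

    int size() {
        return byHost.size();
    }
}
```

In this model, any number of table objects talking to the same host end up queuing on the same SketchConnection, which is where contention would show up at high instance counts or with large payloads.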
We'll post when we have a better story than the above,
Good stuff,
St.Ack
Slava Gorelik wrote:
As far as I know, HTable itself has a connection pool (HConnectionManager
is a singleton). I think multiple instances of HTable within the same
application will not help you.
You'd better try to use multiple processes instead of multiple threads.
You can search the mailing list archive; I asked almost the same question.
The current HBase client implementation has some RPC Lock, i.e. multi-threading
is not useful.
Best Regards.
On Mon, Dec 15, 2008 at 12:23 PM, Michael Dagaev
<[email protected]> wrote:
Hi, all
Currently, we are using a single instance of HTable in a
multithreaded application. That is, several threads use the same
instance of HTable to insert data in the database. Since the
"commit" method of HTable is synchronized, we are afraid that the single
instance of HTable can be a bottleneck. So, we are going to create a
pool of HTable instances (all instances are created with the same
table name) and use the instances simultaneously (an instance per
thread).
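A minimal sketch of the instance-per-thread idea, assuming a ThreadLocal is an acceptable way to hand each thread its own instance. SketchTable is a hypothetical stand-in with a synchronized commit, not the real HTable, and the table name "mytable" is made up for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for HTable: commit is synchronized per instance.
final class SketchTable {
    final String tableName;
    private int commits;

    SketchTable(String tableName) {
        this.tableName = tableName;
    }

    synchronized void commit(String row) {
        commits++;
    }

    synchronized int commitCount() {
        return commits;
    }
}

// Gives every thread its own SketchTable, all created with the same table name.
final class PerThreadTables {
    private final Set<SketchTable> created = ConcurrentHashMap.newKeySet();
    private final ThreadLocal<SketchTable> local;

    PerThreadTables(String tableName) {
        local = ThreadLocal.withInitial(() -> {
            SketchTable t = new SketchTable(tableName);
            created.add(t);
            return t;
        });
    }

    SketchTable get() {
        return local.get();
    }

    int instanceCount() {
        return created.size();
    }

    int totalCommits() {
        int sum = 0;
        for (SketchTable t : created) {
            sum += t.commitCount();
        }
        return sum;
    }

    // Runs one writer task per thread; returns {instances created, commits made}.
    static int[] run(int threads, int commitsPerThread) {
        PerThreadTables tables = new PerThreadTables("mytable"); // made-up name
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                SketchTable table = tables.get(); // this thread's own instance
                for (int j = 0; j < commitsPerThread; j++) {
                    table.commit("row-" + j);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { tables.instanceCount(), tables.totalCommits() };
    }
}
```

With this arrangement no two threads ever wait on the same instance's commit lock; whether that helps in practice depends on what the instances share underneath.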
Does it make sense?
Thank you for your cooperation,
M.