Hi Nick,
I am using HBase version 0.96. I sent the link from version 0.94 because I
haven't found the Java API docs for 0.96, sorry about that.
I have created the HTable directly from the config object, as follows:
    this.tlConfig = new ThreadLocal<Configuration>() {
        @Override
        protected Configuration initialValue() {
            return HBaseConfiguration.create();
        }
    };
    this.tlTable = new ThreadLocal<HTable>() {
        @Override
        protected HTable initialValue() {
            try {
                return new HTable(tlConfig.get(), "HBaseSerialWritesPOC");
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
    };
I am not sure whether the Configuration object should be one per thread as
well; maybe I could share this one?
So, just to clarify, would I get any advantage from using an HTablePool object
instead of ThreadLocal<HTable> as I did?
-Marcelo
From: [email protected]
Subject: Re: HBase connection pool
Hi Marcelo,
First thing, to be clear, you're working with a 0.94 release? The reason I ask
is we've been doing some work in this area to improve things, so semantics may
be slightly different between 0.94, 0.98, and 1.0.
How are you managing the HConnection object (or are you)? How are you creating
your HTable instances? These will determine how the connection is obtained and
used in relation to HTables.
In general, multiple HTable instances connected to tables in the same cluster
should be sharing the same HConnection instance. This is handled explicitly
when you manage your own HConnection and HTables (i.e., HConnection conn = ...;
HTable t = new HTable(TABLE_NAME, conn);). It's handled implicitly when you
construct via Configuration objects (HTable t = new HTable(conf, TABLE_NAME);).
This implicit option is going away in future versions.
HTable is not safe for concurrent access because of how the write path is
implemented (at least; there may be other portions that I'm not as familiar
with). You should be perfectly fine to have an HTable per thread in a
ThreadLocal.
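To make the arrangement above concrete, here is a rough plain-Java sketch of
"one shared connection, one table handle per thread". The Connection and Table
classes are hypothetical stand-ins I made up for illustration, not HBase
classes; the only thing taken from your code is the table name. Treat it as a
picture of the pattern under those assumptions, not as the client API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedConnectionSketch {

    // Hypothetical stand-in for a heavyweight connection that is safe to share.
    static final class Connection {
        static final AtomicInteger CREATED = new AtomicInteger();

        Connection() {
            CREATED.incrementAndGet(); // count how many connections get built
        }

        Table getTable(String name) {
            return new Table(this, name);
        }
    }

    // Hypothetical stand-in for a lightweight handle that is NOT thread safe.
    static final class Table {
        final Connection connection;
        final String name;

        Table(Connection connection, String name) {
            this.connection = connection;
            this.name = name;
        }
    }

    // One connection for the whole process...
    static final Connection SHARED = new Connection();

    // ...and one table handle per thread, created lazily from that connection.
    static final ThreadLocal<Table> TABLE =
            ThreadLocal.withInitial(() -> SHARED.getTable("HBaseSerialWritesPOC"));

    // Starts `threads` workers and returns how many distinct handles they used.
    static int distinctTables(int threads) throws InterruptedException {
        Set<Table> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                seen.add(TABLE.get());
                seen.add(TABLE.get()); // second call returns the same handle
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return seen.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("table handles: " + distinctTables(4));
        System.out.println("connections:   " + Connection.CREATED.get());
    }
}
```

Each worker thread ends up with its own handle, while all of the handles point
back at the single shared connection.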
-n
On Wed, Feb 25, 2015 at 9:41 AM, Marcelo Valle (BLOOMBERG/ LONDON)
<[email protected]> wrote:
In the HBase API, does 1 HTable object mean 1 connection to each region server
(just for 1 table)?
The docs say
(http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/HTable.html):
"This class is not thread safe for reads nor write."
I got confused, as I saw there is an HTablePool class, but it's only for a
table as well. Can't connections be reused for more than 1 table?
In my Java application, I used ThreadLocal variables (ThreadLocal<HTable>) to
create an HTable variable per thread. If I do several operations on each
thread, I should still use the same connection, right?
[]s