Thanks, Ted.
Util.Connection.close() should be called only once, so it cannot go inside a
map function, which runs once per record:

val result = rdd.map(line => {
  val table = Util.Connection.getTable("user")
  ...
  Util.Connection.close()
})

As you mentioned:

Calling table.close() is the recommended approach.
HConnectionManager does reference counting. When all references to the
underlying connection are gone, connection would be released.

Yes, we should call table.close(), but it won’t remove the HConnection from
HConnectionManager, which is an HConnection pool.
Looking at the HConnectionManager Javadoc, it seems I have to implement
a shutdown hook myself:

 * <p>Cleanup used to be done inside in a shutdown hook.  On startup we'd
 * register a shutdown hook that called {@link #deleteAllConnections()}
 * on its way out but the order in which shutdown hooks run is not defined so
 * were problematic for clients of HConnection that wanted to register their
 * own shutdown hooks so we removed ours though this shifts the onus for
 * cleanup to the client.
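
A minimal sketch of what such a client-side hook might look like. CloseOnExit
is a hypothetical helper, not part of HBase; it only assumes the wrapped object
is a java.io.Closeable (which, as far as I can tell, HConnection is) and uses
Scala's sys.addShutdownHook so that close() runs once when the executor JVM
exits. Shutdown hooks do not run if the JVM is killed forcibly.

```scala
import java.io.Closeable
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical helper (not an HBase API): owns a Closeable and closes it
// at most once, either explicitly or from a JVM shutdown hook.
final class CloseOnExit[A <: Closeable](val resource: A) {
  private val closed = new AtomicBoolean(false)

  // Idempotent: only the first call reaches resource.close().
  def close(): Unit =
    if (closed.compareAndSet(false, true)) resource.close()

  // Registered at construction; the hook body is evaluated at JVM shutdown.
  sys.addShutdownHook(close())
}

// Assumed usage inside the shared object from this thread (needs HBase on
// the classpath, so it is left as a comment here):
//   object Util {
//     val HBaseConf  = HBaseConfiguration.create()
//     val Connection = new CloseOnExit(HConnectionManager.createConnection(HBaseConf))
//   }
//   ... Util.Connection.resource.getTable("user") ...
```

Each task would still call table.close() on the tables it obtains; the hook
only covers the shared connection.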


2014-10-15 22:31 GMT+08:00 Ted Yu <[email protected]>:

> Pardon me - there was a typo in my previous email.
>
> Calling table.close() is the recommended approach.
> HConnectionManager does reference counting. When all references to the
> underlying connection are gone, connection would be released.
>
> Cheers
>
> On Wed, Oct 15, 2014 at 7:13 AM, Ted Yu <[email protected]> wrote:
>
>> Have you tried the following ?
>>
>> val result = rdd.map(line => { val table = Util.Connection.getTable("user")
>> ...
>> Util.Connection.close() })
>>
>> On Wed, Oct 15, 2014 at 6:09 AM, Fengyun RAO <[email protected]>
>> wrote:
>>
>>> In order to share an HBase connection pool, we create an object
>>>
>>> object Util {
>>>     val HBaseConf = HBaseConfiguration.create()
>>>     val Connection = HConnectionManager.createConnection(HBaseConf)
>>> }
>>>
>>> which would be shared among tasks on the same executor. e.g.
>>>
>>> val result = rdd.map(line => {
>>>   val table = Util.Connection.getTable("user")
>>>   ...
>>> })
>>>
>>> However, we don’t know how to close Util.Connection.
>>> If we write Util.Connection.close() in the main function,
>>> it’ll only run on the driver, not on the executors.
>>>
>>> So, how do we make sure every connection is closed before exit?
>>>
>>>
>>
>>
>
