I may have misunderstood your point.

val result = rdd.map(line => {
  val table = Util.Connection.getTable("user")
  ...
  table.close()
})

Did you mean this is enough, and there’s no need to call
Util.Connection.close(),
or HConnectionManager.deleteAllConnections()?

Where is the documentation stating that HConnectionManager releases the
underlying connection automatically?
If that’s true, maybe the Javadoc that recommends a shutdown hook needs
an update.
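
For reference, here is a minimal sketch of the shutdown-hook approach as I
understand it from the Javadoc. It is only an illustration: FakeConnection is
a hypothetical stand-in I made up so the snippet runs without HBase, and in
the real job the marked line would presumably be
HConnectionManager.createConnection(HBaseConf) instead.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for an HBase connection (NOT an HBase class),
// used only so that this sketch is runnable on its own.
final class FakeConnection extends AutoCloseable {
  private val closed = new AtomicBoolean(false)
  def isClosed: Boolean = closed.get()
  // Safe to call more than once; it just sets the flag.
  override def close(): Unit = closed.set(true)
}

object Util {
  // A lazy val in an object is initialized at most once per JVM,
  // so each executor gets exactly one shared connection.
  lazy val Connection: FakeConnection = {
    // Assumption: the real code would create the HBase connection here.
    val conn = new FakeConnection
    // Register the cleanup where the connection is created, so it runs
    // in the executor JVM rather than only on the driver.
    sys.addShutdownHook {
      conn.close()
    }
    conn
  }
}
```

Tasks would keep calling Util.Connection.getTable(...) and table.close() as
before, and only the hook would close the shared connection. Two caveats:
shutdown hooks do not run if the JVM is killed forcibly (e.g. kill -9), and
the order in which several hooks run is not defined, which is the problem the
Javadoc itself mentions.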

2014-10-16 14:20 GMT+08:00 Fengyun RAO <[email protected]>:

> Thanks, Ted.
> Util.Connection.close() should be called only once, so it can NOT be in a
> map function
>
> val result = rdd.map(line => {
>   val table = Util.Connection.getTable("user")
>   ...
>   Util.Connection.close()
> })
>
> As you mentioned:
>
> Calling table.close() is the recommended approach.
> HConnectionManager does reference counting. When all references to the
> underlying connection are gone, connection would be released.
>
> Yes, we should call table.close(), but it won’t remove HConnection in
> HConnectionManager which is a HConnection pool.
> As I look into the HConnectionManager Javadoc, it seems I have to
> implement a shutdown hook:
>
>  * <p>Cleanup used to be done inside in a shutdown hook.  On startup we'd
>  * register a shutdown hook that called {@link #deleteAllConnections()}
>  * on its way out but the order in which shutdown hooks run is not defined so
>  * were problematic for clients of HConnection that wanted to register their
>  * own shutdown hooks so we removed ours though this shifts the onus for
>  * cleanup to the client.
>
>
> 2014-10-15 22:31 GMT+08:00 Ted Yu <[email protected]>:
>
>> Pardon me - there was a typo in the previous email.
>>
>> Calling table.close() is the recommended approach.
>> HConnectionManager does reference counting. When all references to the
>> underlying connection are gone, connection would be released.
>>
>> Cheers
>>
>> On Wed, Oct 15, 2014 at 7:13 AM, Ted Yu <[email protected]> wrote:
>>
>>> Have you tried the following ?
>>>
>>> val result = rdd.map(line => { val table = Util.Connection.getTable("user")
>>> ...
>>> Util.Connection.close() })
>>>
>>> On Wed, Oct 15, 2014 at 6:09 AM, Fengyun RAO <[email protected]>
>>> wrote:
>>>
>>>> In order to share an HBase connection pool, we create an object
>>>>
>>>> object Util {
>>>>     val HBaseConf = HBaseConfiguration.create
>>>>     val Connection = HConnectionManager.createConnection(HBaseConf)
>>>> }
>>>>
>>>> which would be shared among tasks on the same executor. e.g.
>>>>
>>>> val result = rdd.map(line => {
>>>>   val table = Util.Connection.getTable("user")
>>>>   ...
>>>> })
>>>>
>>>> However, we don’t know how to close Util.Connection.
>>>> If we call Util.Connection.close() in the main function,
>>>> it’ll only run on the driver, not on the executors.
>>>>
>>>> So, how can we make sure every connection is closed before exit?
>>>>
>>>
>>>
>>
>
