Of course, I could create a connection in each map call:

val result = rdd.map(line => {
  val conf = HBaseConfiguration.create
  val connection = HConnectionManager.createConnection(conf)
  val table = connection.getTable("user")
  ...
  table.close()
  connection.close()
})

but that would be too slow, which is why I share conf and
connection in the Util object.
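
A possible middle ground, sketched here only as an illustration, is to pay
the setup cost once per partition instead of once per record. FakeTable and
processPartition below are hypothetical stand-ins so the sketch runs without
HBase or Spark on the classpath; in a real job the function would be passed
to rdd.mapPartitions and the table would come from the shared connection.

```scala
import java.io.Closeable

// Hypothetical stand-in for an HBase table handle; counts how often one is opened.
object FakeTable { var opened = 0 }
final class FakeTable extends Closeable {
  FakeTable.opened += 1
  var closed = false
  def lookup(key: String): String = s"value-for-$key"
  override def close(): Unit = { closed = true }
}

// Opens one table for the whole partition instead of one per record.
def processPartition(lines: Iterator[String]): Iterator[String] = {
  val table = new FakeTable
  try {
    // Iterator.map is lazy, so materialize the results before the table is closed.
    lines.map(line => table.lookup(line)).toVector.iterator
  } finally {
    table.close()
  }
}

val out = processPartition(Iterator("a", "b", "c")).toList
```

Materializing the partition keeps the sketch simple but holds all results in
memory; that is a trade-off, not a recommendation.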

Maybe I do need a shutdown hook after all, as the Javadoc says.
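
A minimal sketch of what such a hook might look like, assuming the shared
object is initialized lazily once per executor JVM. FakeConnection and
SharedConnection are hypothetical names standing in for HConnection and the
Util object, so the sketch runs without HBase; the real connection would
come from HConnectionManager.createConnection(conf).

```scala
import java.io.Closeable
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for HConnection so the sketch runs without HBase.
final class FakeConnection extends Closeable {
  private val closed = new AtomicBoolean(false)
  def isClosed: Boolean = closed.get()
  override def close(): Unit = closed.set(true)
}

object SharedConnection {
  // Created on first access, once per JVM (so once per executor).
  lazy val connection: FakeConnection = {
    val c = new FakeConnection
    // Register a hook that closes the connection when the JVM shuts down.
    sys.addShutdownHook(c.close())
    c
  }
}
```

The order in which shutdown hooks run is not defined, as the Javadoc quoted
below points out, and hooks do not run at all if the JVM is killed forcibly,
so this is best-effort cleanup only.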

Thank you!

2014-10-17 12:18 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:

> Looking at Apache 0.98 code, you can follow the example in the class
> javadoc (line 144 of HConnectionManager.java):
>
>  * HTableInterface table = connection.getTable("table1");
>  * try {
>  *   // Use the table as needed, for a single operation and a single thread
>  * } finally {
>  *   table.close();
>  *   connection.close();
>  * }
>
> Cheers
>
> On Thu, Oct 16, 2014 at 9:03 PM, Fengyun RAO <raofeng...@gmail.com> wrote:
>
>> Thanks, Ted,
>>
>> We use CDH 5.1 and the HBase version is 0.98.1-cdh5.1.0, in which the
>> javadoc of HConnectionManager.java still recommends shutdown hook.
>>
>> I looked into val table = Util.Connection.getTable("user") and found that it
>> doesn't invoke
>>
>> public HTable(Configuration conf, final byte[] tableName,
>>       final ExecutorService pool)
>>
>> but
>>
>> public HTable(TableName tableName, final HConnection connection,
>>       final ExecutorService pool) throws IOException {
>>     if (connection == null || connection.isClosed()) {
>>       throw new IllegalArgumentException("Connection is null or closed.");
>>     }
>>     this.tableName = tableName;
>>     this.cleanupPoolOnClose = this.cleanupConnectionOnClose = false;
>>     this.connection = connection;
>>     this.configuration = connection.getConfiguration();
>>     this.pool = pool;
>>
>>     this.finishSetup();
>>   }
>>
>> in which cleanupConnectionOnClose is false
>>
>> 2014-10-16 22:51 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:
>>
>>> Which HBase release are you using?
>>>
>>> Let me refer to the HBase 0.94 code.
>>>
>>> Take a look at the following method
>>> in src/main/java/org/apache/hadoop/hbase/client/HTable.java :
>>>
>>>   public void close() throws IOException {
>>> ...
>>>     if (cleanupConnectionOnClose) {
>>>       if (this.connection != null) {
>>>         this.connection.close();
>>>
>>> When Connection.getTable() is called, the following is invoked:
>>>   public HTable(Configuration conf, final byte[] tableName, final
>>> ExecutorService pool)
>>> which sets cleanupConnectionOnClose to true.
>>>
>>> W.r.t. the javadoc, the paragraph on the shutdown hook is
>>> in HConnectionManager.java of 0.94.
>>> You don't need to use a shutdown hook for 0.94+.
>>>
>>> Cheers
>>>
>>> On Wed, Oct 15, 2014 at 11:41 PM, Fengyun RAO <raofeng...@gmail.com>
>>> wrote:
>>>
>>>> I may have misunderstood your point.
>>>>
>>>> val result = rdd.map(line => {
>>>>   val table = Util.Connection.getTable("user")
>>>>   ...
>>>>   table.close()
>>>> })
>>>>
>>>> Did you mean this is enough, and there’s no need to call
>>>> Util.Connection.close(),
>>>> or HConnectionManager.deleteAllConnections()?
>>>>
>>>> Where is the documentation that states HConnectionManager would release
>>>> the underlying connection automatically?
>>>> If that’s true, maybe the Javadoc, which recommends a shutdown hook,
>>>> needs an update.
>>>>
>>>>
>>>> 2014-10-16 14:20 GMT+08:00 Fengyun RAO <raofeng...@gmail.com>:
>>>>
>>>>> Thanks, Ted.
>>>>> Util.Connection.close() should be called only once, so it can NOT be
>>>>> in a map function
>>>>>
>>>>> val result = rdd.map(line => {
>>>>>   val table = Util.Connection.getTable("user")
>>>>>   ...
>>>>>   Util.Connection.close()
>>>>> })
>>>>>
>>>>> As you mentioned:
>>>>>
>>>>> Calling table.close() is the recommended approach.
>>>>> HConnectionManager does reference counting. When all references to the
>>>>> underlying connection are gone, connection would be released.
>>>>>
>>>>> Yes, we should call table.close(), but it won’t remove the HConnection in
>>>>> HConnectionManager, which is an HConnection pool.
>>>>> Looking at the HConnectionManager Javadoc, it seems I have to
>>>>> implement a shutdown hook:
>>>>>
>>>>>  * <p>Cleanup used to be done inside in a shutdown hook.  On startup we'd
>>>>>  * register a shutdown hook that called {@link #deleteAllConnections()}
>>>>>  * on its way out but the order in which shutdown hooks run is not defined so
>>>>>  * were problematic for clients of HConnection that wanted to register their
>>>>>  * own shutdown hooks so we removed ours though this shifts the onus for
>>>>>  * cleanup to the client.
>>>>>
>>>>>
>>>>> 2014-10-15 22:31 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:
>>>>>
>>>>>> Pardon me - there was a typo in my previous email.
>>>>>>
>>>>>> Calling table.close() is the recommended approach.
>>>>>> HConnectionManager does reference counting. When all references to
>>>>>> the underlying connection are gone, connection would be released.
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> On Wed, Oct 15, 2014 at 7:13 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>>
>>>>>>> Have you tried the following?
>>>>>>>
>>>>>>> val result = rdd.map(line => {
>>>>>>>   val table = Util.Connection.getTable("user")
>>>>>>>   ...
>>>>>>>   Util.Connection.close()
>>>>>>> })
>>>>>>>
>>>>>>> On Wed, Oct 15, 2014 at 6:09 AM, Fengyun RAO <raofeng...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> In order to share an HBase connection pool, we create an object
>>>>>>>>
>>>>>>>> object Util {
>>>>>>>>     val HBaseConf = HBaseConfiguration.create
>>>>>>>>     val Connection = HConnectionManager.createConnection(HBaseConf)
>>>>>>>> }
>>>>>>>>
>>>>>>>> which would be shared among tasks on the same executor. e.g.
>>>>>>>>
>>>>>>>> val result = rdd.map(line => {
>>>>>>>>   val table = Util.Connection.getTable("user")
>>>>>>>>   ...
>>>>>>>> })
>>>>>>>>
>>>>>>>> However, we don’t know how to close the Util.Connection.
>>>>>>>> If we write Util.Connection.close() in the main function,
>>>>>>>> it’ll only run on the driver, not the executor.
>>>>>>>>
>>>>>>>> So, how do we make sure every Connection is closed before exit?
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
