[ 
https://issues.apache.org/jira/browse/FLINK-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189043#comment-17189043
 ] 

Robert Metzger commented on FLINK-19064:
----------------------------------------

In my understanding of the {{InputFormat}} interface, the open() and close() 
methods have a different lifecycle than the createInputSplits() method.

If in the HBase case createInputSplits() needs a connection, then it needs to 
establish and close that connection within the same method call.
Afaik the createInputSplits() method is called on the JobManager once during 
initialization.


The open() and close() methods are called for each split on the 
TaskManagers.

Thus we cannot open the connection once and use it for all purposes. We need 
to establish a connection on the master, and then on the TaskManagers as well.
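
A minimal sketch of the two lifecycles, to make the point concrete. It does 
not use the real Flink or HBase classes: FakeConnection and SketchInputFormat 
are hypothetical stand-ins, and the open-handle counter exists only to show 
that nothing leaks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class InputFormatLifecycleSketch {

    /** Hypothetical stand-in for an HBase connection; counts open handles. */
    static final class FakeConnection implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();

        FakeConnection() {
            OPEN.incrementAndGet();
        }

        /** Pretends to look up table regions; the names are made up. */
        List<String> listRegions() {
            return Arrays.asList("region-a", "region-b");
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    /** Hypothetical format that mirrors the lifecycle described above. */
    static final class SketchInputFormat {
        // Only set between open() and close(), i.e. while a split is read.
        private FakeConnection connection;

        /** Called once on the master: connect, compute splits, disconnect. */
        List<String> createInputSplits() {
            try (FakeConnection c = new FakeConnection()) {
                return new ArrayList<>(c.listRegions());
            }
        }

        /** Called on a worker for each split: open a worker-side connection. */
        void open(String split) {
            connection = new FakeConnection();
        }

        /** Called on a worker after each split: release the connection. */
        void close() {
            if (connection != null) {
                connection.close();
                connection = null;
            }
        }
    }

    public static void main(String[] args) {
        SketchInputFormat format = new SketchInputFormat();

        // "Master" side: no connection survives the call.
        List<String> splits = format.createInputSplits();
        System.out.println("open after createInputSplits: " + FakeConnection.OPEN.get());

        // "Worker" side: one open()/close() pair per split.
        for (String split : splits) {
            format.open(split);
            try {
                System.out.println("reading " + split);
            } finally {
                format.close();
            }
        }
        System.out.println("open after all splits: " + FakeConnection.OPEN.get());
    }
}
```

The try-with-resources block in createInputSplits() keeps the master-side 
connection scoped to that single call, while open() and close() own a separate 
connection on the worker side.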

DataSourceNode#computeOperatorSpecificDefaultEstimates() is irrelevant for the 
HBase format, since the getStatistics() call always returns null.

> HBaseRowDataInputFormat is leaking resources
> --------------------------------------------
>
>                 Key: FLINK-19064
>                 URL: https://issues.apache.org/jira/browse/FLINK-19064
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / HBase
>    Affects Versions: 1.12.0
>            Reporter: Robert Metzger
>            Assignee: Nicholas Jiang
>            Priority: Critical
>              Labels: pull-request-available
>
> {{HBaseRowDataInputFormat.configure()}} is calling {{connectToTable()}}, 
> which creates a connection to HBase that is not closed again.
> A user reported this problem on the user@ list: 
> https://lists.apache.org/thread.html/ra04f6996eb50ee83aabd2ad0d50bec9afb6a924bfbb48ada3269c6d8%40%3Cuser.flink.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
