[
https://issues.apache.org/jira/browse/HIVE-12250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971936#comment-14971936
]
Naveen Gangam commented on HIVE-12250:
--------------------------------------
According to
https://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/HTable.html
, each new instance of HTable that uses a new instance of the Configuration
object will create a new ZK connection. In the HiveHBaseStorageHandler,
HiveHBaseTableInputFormat and HiveHBaseTableOutputFormat, a new instance of
HTable is created each time.
{code}
@Override
public void setConf(Configuration conf) {
  jobConf = conf;
  hbaseConf = HBaseConfiguration.create(conf); // this clones the object
}
{code}
and in preCreateTable:
{code}
...
// ensure the table is online
htable = new HTable(hbaseConf, tableDesc.getName());
...
{code}
We cannot share the HiveConf instances because they are session specific, so I
don't think we can change this code.
There are other potential causes in TableInputFormat
{code}
setHTable(new HTable(HBaseConfiguration.create(jobConf),
    Bytes.toBytes(hbaseTableName)));
String hbaseColumnsMapping = jobConf.get(HBaseSerDe.HBASE_COLUMNS_MAPPING);
boolean doColumnRegexMatching =
    jobConf.getBoolean(HBaseSerDe.HBASE_COLUMNS_REGEX_MATCHING, true);
if (hbaseColumnsMapping == null) {
  // Naveen: we never close the connections associated with the HTable we
  // instantiated above.
  throw new IOException(HBaseSerDe.HBASE_COLUMNS_MAPPING
      + " required for HBase Table.");
}
ColumnMappings columnMappings = null;
try {
  columnMappings = HBaseSerDe.parseColumnsMapping(hbaseColumnsMapping,
      doColumnRegexMatching);
} catch (SerDeException e) {
  // Naveen: we never close the connections associated with the HTable we
  // instantiated a few lines above.
  throw new IOException(e);
}
...
InputSplit[] results = new InputSplit[splits.size()];
for (int i = 0; i < splits.size(); i++) {
  results[i] = new HBaseSplit((TableSplit) splits.get(i), tablePaths[0]);
}
return results;
// Naveen: the method ends without cleaning up the underlying connections.
}
{code}
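One possible shape for a fix to the three leak points flagged in the comments above is to scope the table to the method and close it on every exit path. This is only a minimal sketch, not the actual patch. TrackedTable is a hypothetical stand-in for HTable that counts open instances, and SplitCleanupSketch, getSplits and openTables are illustrative names rather than Hive code. Whether the real HTable can be used in try-with-resources depends on the HBase version; a try/finally with an explicit close() is equivalent.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

class SplitCleanupSketch {

  /** Hypothetical stand-in for HTable: counts instances that are still open. */
  static final class TrackedTable implements Closeable {
    static final AtomicInteger OPEN = new AtomicInteger();
    private boolean closed;

    TrackedTable() {
      OPEN.incrementAndGet();
    }

    @Override
    public void close() {
      // Idempotent close: only the first call releases the resource.
      if (!closed) {
        closed = true;
        OPEN.decrementAndGet();
      }
    }
  }

  /**
   * Mirrors the shape of the method quoted above: open the table, validate
   * the column mapping, build the result. try-with-resources closes the
   * table on the exception path and on the normal return.
   */
  static String[] getSplits(String columnsMapping) throws IOException {
    try (TrackedTable table = new TrackedTable()) {
      if (columnsMapping == null) {
        throw new IOException("columns mapping required for HBase Table.");
      }
      return columnsMapping.split(",");
    }
  }

  /** Number of stand-in tables that were opened but never closed. */
  static int openTables() {
    return TrackedTable.OPEN.get();
  }
}
```

With this structure, openTables() is back at zero after a call to getSplits, whether the call returns normally or throws because the mapping is null.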
> Zookeeper connection leaks in Hive's HBaseHandler.
> --------------------------------------------------
>
> Key: HIVE-12250
> URL: https://issues.apache.org/jira/browse/HIVE-12250
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 1.1.0
> Reporter: Naveen Gangam
> Assignee: Naveen Gangam
>
> HiveServer2 performance regresses severely due to what appears to be a leak
> in the ZooKeeper connections. lsof output on the HS2 process shows about 8000
> TCP connections to the ZK ensemble nodes.
> grep TCP lsof-hive-node11 | grep node11 | grep -E "node03|node04|node05" | wc -l
> 7866
> grep TCP lsof-hive-node11 | grep node11 | grep -E "node03" | wc -l
> 2615
> grep TCP lsof-hive-node11 | grep node11 | grep -E "node04" | wc -l
> 2622
> grep TCP lsof-hive-node11 | grep node11 | grep -E "node05" | wc -l
> 2629
> node11 - HMS node
> node03, node04 and node05 are the hosts for zookeeper ensemble.