DFSClient(DFSInputStream) release the persistent connection with datanode when
no data have been read for a long time
---------------------------------------------------------------------------------------------------------------------
Key: HDFS-1325
URL: https://issues.apache.org/jira/browse/HDFS-1325
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs client
Reporter: jinglong.liujl
Fix For: 0.20.3
Attachments: dfsclient.patch
When running HBase over Hadoop, we found that while scanning a large table
(one with many regions, each with many store files), too many connections are
kept open between the regionserver (acting as a DFSClient) and the datanodes.
Even after a store file has been scanned completely, its connection is not
closed.
In our cluster, these extra connections waste so many system resources that
system CPU on the region server climbs to a high level and eventually brings
the region server down.
After investigating, we found that the number of active connections is very
small and that most connections are idle. We added a timeout checker thread to
DFSClient to close these idle connections.
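The timeout checker idea can be sketched roughly as follows. This is only an
illustration of the general technique, not the contents of dfsclient.patch:
the class IdleConnectionReaper and its methods touch, closeIdle, start and
stop are hypothetical names, and the connection is modelled as a plain
java.io.Closeable rather than any real DFSClient or DFSInputStream type.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical helper (not a Hadoop class): closes connections idle for too long. */
final class IdleConnectionReaper {
    // Connection -> time (ms) of its most recent read.
    private final Map<Closeable, Long> lastUsedMillis = new ConcurrentHashMap<>();
    private final long idleTimeoutMillis;
    private final LongSupplier clock; // injectable so the logic can be tested without sleeping
    private ScheduledExecutorService scheduler;

    IdleConnectionReaper(long idleTimeoutMillis, LongSupplier clock) {
        this.idleTimeoutMillis = idleTimeoutMillis;
        this.clock = clock;
    }

    /** Call whenever data is read from the connection; also registers new connections. */
    void touch(Closeable conn) {
        lastUsedMillis.put(conn, clock.getAsLong());
    }

    /** Closes every connection idle for at least the timeout; returns how many were closed. */
    int closeIdle() {
        long now = clock.getAsLong();
        int closed = 0;
        for (Map.Entry<Closeable, Long> e : lastUsedMillis.entrySet()) {
            boolean idle = now - e.getValue() >= idleTimeoutMillis;
            // remove(key, value) fails if touch() updated the timestamp concurrently,
            // so a connection that just became active again is left open.
            if (idle && lastUsedMillis.remove(e.getKey(), e.getValue())) {
                try {
                    e.getKey().close();
                } catch (IOException ignored) {
                    // Best effort: the connection is being discarded anyway.
                }
                closed++;
            }
        }
        return closed;
    }

    /** Starts a daemon checker thread that runs closeIdle() periodically. */
    synchronized void start(long checkPeriodMillis) {
        if (scheduler != null) {
            return; // already running
        }
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "idle-connection-reaper");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleWithFixedDelay(this::closeIdle,
                checkPeriodMillis, checkPeriodMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the checker thread; tracked connections are left as they are. */
    synchronized void stop() {
        if (scheduler != null) {
            scheduler.shutdownNow();
            scheduler = null;
        }
    }
}
```

In this sketch a client would call touch(conn) on every read and start the
checker once, for example new IdleConnectionReaper(60000L,
System::currentTimeMillis).start(10000L); the 60 second timeout and 10 second
check period are arbitrary example values. A real client would also need to
reopen the connection transparently if the stream is read again after being
closed.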
--
This message is automatically generated by JIRA.