[
https://issues.apache.org/jira/browse/HDFS-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898691#action_12898691
]
jinglong.liujl commented on HDFS-1325:
--------------------------------------
Sorry for the late reply.
>In spite of this, if you feel that this is the way you want to solve the
>problem, please consider adding this check into the application layer code,
>instead of DFSClient.
I don't think the solution should rely only on the application calling close()
on its files.
Certainly, the application layer (such as HBase) must release its resources
(FSDataOutputStream, connections, and so on). On the other hand, even if an
application such as HBase fails to release them, Hadoop, which acts as a cloud
platform, should not go out of service; it should reclaim those resources and
keep serving other applications.
So there should be some last-resort mechanism for releasing idle resources, and
this patch is that mechanism.
> You are right. But the one-new-thread per DFSInputStream solution will really
> kill the system. If we can fix it with a lightweight solution, we could do it.
You're right, I'll revise the patch.
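
To make the "lightweight" alternative concrete, here is a rough sketch of one
possible shape: a single shared daemon thread that periodically scans all
registered streams, instead of one new thread per DFSInputStream. This is not
the attached patch and not existing DFSClient code. The names IdleCloseable,
IdleConnectionReaper and DemoStream are hypothetical, and the sketch assumes a
stream can report its last read time and drop its datanode connection
idempotently (reconnecting lazily on the next read).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical view of a stream that holds a datanode connection. */
interface IdleCloseable {
    long lastReadMillis();      // wall-clock time of the most recent read
    boolean hasConnection();    // true while a connection is held
    void releaseConnection();   // assumed idempotent
}

/** One shared checker for all streams, rather than one thread per stream. */
final class IdleConnectionReaper implements AutoCloseable {
    private final Set<IdleCloseable> tracked = ConcurrentHashMap.newKeySet();
    private final long idleTimeoutMillis;
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "idle-connection-reaper");
            t.setDaemon(true);  // never keeps the JVM alive
            return t;
        });

    IdleConnectionReaper(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    void register(IdleCloseable s)   { tracked.add(s); }
    void unregister(IdleCloseable s) { tracked.remove(s); }

    /** Starts the periodic scan on the single background thread. */
    void start(long scanIntervalMillis) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                sweep(System.currentTimeMillis());
            } catch (RuntimeException e) {
                // Swallow so that one failure does not cancel later scans.
            }
        }, scanIntervalMillis, scanIntervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Releases connections idle for at least the timeout; returns the count. */
    int sweep(long nowMillis) {
        int released = 0;
        for (IdleCloseable s : tracked) {
            if (s.hasConnection()
                    && nowMillis - s.lastReadMillis() >= idleTimeoutMillis) {
                s.releaseConnection();
                released++;
            }
        }
        return released;
    }

    @Override
    public void close() { scheduler.shutdownNow(); }
}

/** Hypothetical in-memory stream, used only to exercise the reaper. */
final class DemoStream implements IdleCloseable {
    private final long lastRead;
    private volatile boolean connected = true;

    DemoStream(long lastRead) { this.lastRead = lastRead; }

    @Override public long lastReadMillis() { return lastRead; }
    @Override public boolean hasConnection() { return connected; }
    @Override public void releaseConnection() { connected = false; }
}
```

With this shape, the cost is one thread per client regardless of how many
store files are open, and a stream that is closed normally by the application
would simply call unregister().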
> DFSClient(DFSInputStream) release the persistent connection with datanode
> when no data have been read for a long time
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-1325
> URL: https://issues.apache.org/jira/browse/HDFS-1325
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs client
> Reporter: jinglong.liujl
> Fix For: 0.20.3
>
> Attachments: dfsclient.patch, toomanyconnction.patch
>
>
> When running HBase over Hadoop, we found that while scanning a large table
> (which has many regions, each with many store files), too many connections
> are kept open between the regionserver (acting as DFSClient) and the
> datanodes. Even after a store file has been completely scanned, its
> connections are not closed.
> In our cluster, the extra connections waste so many system resources that
> system CPU on the region server climbs to a high level and eventually brings
> the region server down.
> After investigating, we found that the number of active connections is very
> small and that most connections are idle. We added a timeout checker thread to
> DFSClient to close these idle connections.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.