[
https://issues.apache.org/jira/browse/HDFS-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509135#comment-13509135
]
Colin Patrick McCabe commented on HDFS-4251:
--------------------------------------------
This is a good find, Sanjay.
One problem with trying to throttle connections automatically is that (at
least on Linux) there is both a system-wide limit on file descriptors and a
per-process limit. Since we can't know what else is running on the system, we
can't really tailor ourselves to respect the system-wide limit.
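To make the two limits concrete, here is a rough sketch of how a Java process could read them on Linux. The class name FdLimits is made up for illustration and is not part of HDFS. The per-process value comes from the JDK's com.sun.management.UnixOperatingSystemMXBean, which not every JVM exposes, and the system-wide value comes from /proc/sys/fs/file-nr, which is Linux-specific.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration only; not an existing HDFS class. */
class FdLimits {
    /**
     * Parses the contents of /proc/sys/fs/file-nr, which holds three numbers:
     * allocated file handles, unused allocated handles, and the system-wide maximum.
     */
    static long[] parseFileNr(String contents) {
        String[] parts = contents.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("unexpected file-nr format: " + contents);
        }
        return new long[] {
            Long.parseLong(parts[0]), Long.parseLong(parts[1]), Long.parseLong(parts[2])
        };
    }

    /** Returns the per-process fd limit, or -1 if this JVM does not expose it. */
    static long perProcessMaxFds() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                .getMaxFileDescriptorCount();
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("per-process max fds: " + perProcessMaxFds());
        Path fileNr = Paths.get("/proc/sys/fs/file-nr");
        if (Files.isReadable(fileNr)) {
            long[] f = parseFileNr(new String(Files.readAllBytes(fileNr)));
            System.out.println("system-wide allocated=" + f[0] + " max=" + f[2]);
        }
    }
}
```

Note that the system-wide numbers reflect every process on the machine, which is why a single daemon cannot budget against them reliably.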
Even if we only consider the per-process limit, there are difficult race
conditions involved. For example, if there are 3 per-process fds left, the
connection throttler can try to limit itself to 3 more connections. But if
another thread then jumps in and opens a file for some reason, we are unable to
open the last socket. Consequently, if we did do this, we'd need a
high "slop factor": a margin of fds that the throttler leaves unused.
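As a rough illustration of the slop factor idea, the arithmetic might look like the sketch below. The class and method names are hypothetical, and the slop value of 64 is an arbitrary example, not a recommendation. Reserving headroom only makes the race less likely; it does not remove it, since other threads can still open more than the reserved number of fds between the check and the accept.

```java
/** Hypothetical helper for illustration only; not an existing HDFS class. */
class FdHeadroom {
    /**
     * Number of new connections an automatic throttler would allow, after
     * reserving `slop` descriptors for files opened by other threads.
     */
    static long allowedNewConnections(long maxFds, long openFds, long slop) {
        return Math.max(0L, maxFds - openFds - slop);
    }

    public static void main(String[] args) {
        // With 3 fds left and no slop, the throttler would hand out all 3,
        // leaving nothing for a thread that opens a file concurrently.
        System.out.println(allowedNewConnections(1024, 1021, 0));
        // With a slop of 64, it stops accepting well before the limit.
        System.out.println(allowedNewConnections(1024, 1021, 64));
    }
}
```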
I'm also a little bit afraid that if we automatically throttle ourselves in the
case of a low per-process FD limit, this will result in users not realizing why
their performance has degraded. At the very least, if we do implement such
throttling, we should prominently log an ERROR message when throttling kicks
in. I have a feeling that doing this kind of thing will hurt us significantly
in benchmarks. I also don't see a clear benefit, since running with a low
number of max fds is almost always a misconfiguration.
So for these reasons, I would advocate a manually configured connection
throttle. The default should be either no limit or something very large.
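A minimal sketch of what such a manually configured throttle could look like, assuming a single integer setting where 0 means "no limit". The class ConnectionThrottle, the NO_LIMIT constant, and the log text are all invented for illustration; this is not existing HDFS code, and a real implementation would read the value from the configuration and use the project's logging framework instead of System.err.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical manual connection throttle; not an existing HDFS class. */
class ConnectionThrottle {
    /** Assumed convention for this sketch: 0 disables the throttle entirely. */
    static final int NO_LIMIT = 0;

    private final Semaphore permits; // null when the throttle is disabled
    private final AtomicBoolean logged = new AtomicBoolean(false);

    ConnectionThrottle(int maxConnections) {
        if (maxConnections < 0) {
            throw new IllegalArgumentException("maxConnections must be >= 0");
        }
        this.permits = (maxConnections == NO_LIMIT) ? null : new Semaphore(maxConnections);
    }

    /** Returns true if a new connection may be accepted. Never blocks. */
    boolean tryAcquire() {
        if (permits == null || permits.tryAcquire()) {
            return true;
        }
        // Log prominently the first time throttling kicks in, so that users
        // can tell why performance has degraded.
        if (logged.compareAndSet(false, true)) {
            System.err.println("ERROR: connection limit reached; rejecting new connections");
        }
        return false;
    }

    /** Must be called exactly once for each successful tryAcquire(). */
    void release() {
        if (permits != null) {
            permits.release();
        }
    }
}
```

The accept loop would call tryAcquire() before taking on a connection and release() when that connection closes. With the default of NO_LIMIT the check always succeeds, so the default configuration behaves the same as having no throttle.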
> NN connections can use up all fds leaving none for rolling journal files
> ------------------------------------------------------------------------
>
> Key: HDFS-4251
> URL: https://issues.apache.org/jira/browse/HDFS-4251
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Sanjay Radia
>