[ https://issues.apache.org/jira/browse/HDFS-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508951#comment-13508951 ]

Kihwal Lee commented on HDFS-4251:
----------------------------------

We've hit another case that affects edit logging: the edit log roll itself was 
successful, but after downloading the checkpoint fsimage from the secondary 
namenode, FileJournalManager#purgeLogsOlderThan() failed because the system's 
opendir() call failed.
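
For context, java.io.File#listFiles(), which the purge path ultimately relies 
on to scan the current directory, signals an opendir() failure only by 
returning null, so fd exhaustion surfaces as an NPE far from the real cause 
unless the caller checks. A minimal defensive sketch (the listFilesOrFail 
helper name is mine, not something in Hadoop):

{code}
import java.io.File;
import java.io.IOException;

public class SafeList {
  // File#listFiles() returns null both for non-directories and when the
  // underlying opendir()/readdir() fails, e.g. with EMFILE when the
  // process is out of file descriptors.
  static File[] listFilesOrFail(File dir) throws IOException {
    File[] files = dir.listFiles();
    if (files == null) {
      throw new IOException("Could not list " + dir
          + " (not a directory, or an I/O error such as fd exhaustion)");
    }
    return files;
  }
}
{code}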

bq. The rpc layer needs to limit the number of tcp connections to be a little 
less than the max fds.

There are also multiple log files, plus connections through servlets besides 
RPC requests. Do we want the NN to figure out the limit automatically, or 
leave it to users to set a config variable? Automatic setting may be a bit 
tricky to do in a platform-independent way.
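
To illustrate the platform issue: a JVM can read its own fd limit through 
com.sun.management.UnixOperatingSystemMXBean, but only on Unix-like platforms 
with a JVM that exposes that interface, so any automatic setting still needs 
a fallback. A rough sketch:

{code}
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdLimits {
  public static void main(String[] args) {
    OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
    // Unix-only and JVM-specific: not available on Windows or on JVMs
    // without the com.sun.management extensions, which is exactly the
    // platform-independence problem.
    if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
      com.sun.management.UnixOperatingSystemMXBean unixOs =
          (com.sun.management.UnixOperatingSystemMXBean) os;
      System.out.println("max fds:  " + unixOs.getMaxFileDescriptorCount());
      System.out.println("open fds: " + unixOs.getOpenFileDescriptorCount());
    } else {
      System.out.println("fd counts not exposed on this platform/JVM");
    }
  }
}
{code}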

The following is a breakdown of the open file descriptors of a running 
namenode, excluding sockets (a sketch of one way to collect such a breakdown 
follows the list).

1 gc log
2 /dev/random, /dev/urandom
2 in-use lock files
2 edit log files
3 stdin, stdout, stderr
3 nn log, auth log, audit log
9 epoll fd
18 pipes
157 jar files
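
A Linux-only sketch of how such a breakdown can be gathered from a live 
process by classifying the symlinks under /proc/self/fd (the classification 
here is coarse; jars, logs, and lock files all land under "file"):

{code}
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

public class FdBreakdown {
  public static void main(String[] args) throws IOException {
    Map<String, Integer> counts = new TreeMap<>();
    // Each entry in /proc/self/fd is a symlink to the underlying object:
    // a path for files, socket:[inode], pipe:[inode], anon_inode:[eventpoll]...
    try (DirectoryStream<Path> fds =
             Files.newDirectoryStream(Paths.get("/proc/self/fd"))) {
      for (Path fd : fds) {
        String target;
        try {
          target = Files.readSymbolicLink(fd).toString();
        } catch (IOException e) {
          continue; // fd was closed while we were iterating
        }
        String kind = target.startsWith("socket:") ? "socket"
            : target.startsWith("pipe:") ? "pipe"
            : target.startsWith("anon_inode:") ? "anon_inode (epoll etc.)"
            : "file";
        counts.merge(kind, 1, Integer::sum);
      }
    }
    counts.forEach((k, v) -> System.out.println(v + "\t" + k));
  }
}
{code}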

Limiting the RPC layer so it does not accept connections beyond a configured 
limit will be straightforward.
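
A minimal sketch of the shape such a limit could take (a generic accept loop, 
not Hadoop's ipc.Server; the maxConnections value stands in for a hypothetical 
config knob):

{code}
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicInteger;

public class LimitedAcceptor {
  private final AtomicInteger open = new AtomicInteger();
  private final int maxConnections;

  LimitedAcceptor(int maxConnections) {
    this.maxConnections = maxConnections;
  }

  void serve(ServerSocket server) throws IOException {
    while (true) {
      Socket s = server.accept();
      // Refuse connections past the limit instead of queueing them, so a
      // flood of clients cannot consume every descriptor; the slack below
      // the process fd limit stays available for journals, logs, etc.
      if (open.incrementAndGet() > maxConnections) {
        open.decrementAndGet();
        s.close();
        continue;
      }
      handle(s); // hand off to a reader; it must decrement open on close
    }
  }

  void handle(Socket s) { /* reader logic lives elsewhere */ }
}
{code}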
                
> NN connections can use up all fds leaving none for rolling journal files
> ------------------------------------------------------------------------
>
>                 Key: HDFS-4251
>                 URL: https://issues.apache.org/jira/browse/HDFS-4251
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Sanjay Radia
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
