[
https://issues.apache.org/jira/browse/HDFS-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903466#comment-16903466
]
Konstantin Shvachko commented on HDFS-14655:
--------------------------------------------
Hey guys, discussed this with Chen. It seems that we need a pool of only 3
threads, which can be reused for each iteration of tailing. Here 3 = number of
Journal Nodes. Creating threads with such high frequency seems to be expensive
in all aspects. How hard would it be to make this change?
> SBN : Namenode crashes if one of The JN is down
> -----------------------------------------------
>
> Key: HDFS-14655
> URL: https://issues.apache.org/jira/browse/HDFS-14655
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Harshakiran Reddy
> Assignee: Ayush Saxena
> Priority: Major
> Attachments: HDFS-14655.poc.patch
>
>
> {noformat}
> 2019-07-04 17:35:54,064 | INFO | Logger channel (from parallel executor) to
> XXXXXXX/XXXXXXX | Retrying connect to server: XXXXXXX/XXXXXXX. Already tried
> 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
> sleepTime=1000 MILLISECONDS) | Client.java:975
> 2019-07-04 17:35:54,087 | FATAL | Edit log tailer | Unknown error encountered
> while tailing edits. Shutting down standby NN. | EditLogTailer.java:474
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:717)
> at
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
> at
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1378)
> at
> com.google.common.util.concurrent.MoreExecutors$ListeningDecorator.execute(MoreExecutors.java:440)
> at
> com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:56)
> at
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.getJournaledEdits(IPCLoggerChannel.java:565)
> at
> org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.getJournaledEdits(AsyncLoggerSet.java:272)
> at
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectRpcInputStreams(QuorumJournalManager.java:533)
> at
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:508)
> at
> org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:275)
> at
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1681)
> at
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1714)
> at
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:307)
> at
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460)
> at
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:410)
> at
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:360)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:483)
> at
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423)
> 2019-07-04 17:35:54,112 | INFO | Edit log tailer | Exiting with status 1:
> java.lang.OutOfMemoryError: unable to create new native thread |
> ExitUtil.java:210
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]