[
https://issues.apache.org/jira/browse/HDFS-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716545#comment-14716545
]
kanaka kumar avvaru commented on HDFS-8973:
-------------------------------------------
AFAIK based on 2.4.1 code base, this IOException is caught in {{Server.java}}
and logged. Hence this exception is not the reason for process exit.
As you can see GC messages in the .out file, GC might have caused the process
exit. Please confirm from the GC logs in .out file.
If true, you check the Memory & GC configurations and increase NN memory
configurations accordingly.
> NameNode exit without any exception log
> ---------------------------------------
>
> Key: HDFS-8973
> URL: https://issues.apache.org/jira/browse/HDFS-8973
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.4.1
> Reporter: He Xiaoqiao
> Priority: Critical
>
> namenode process exit without any useful WARN/ERROR log, and after .log file
> output interrupt .out file continue show about 5 min GC log. when .log file
> intertupt .out file print the follow ERROR, it may hint some info. it seems
> cause by log4j ERROR.
> {code:title=namenode.out|borderStyle=solid}
> log4j:ERROR Failed to flush writer,
> java.io.IOException: 错误的文件描述符
> at java.io.FileOutputStream.writeBytes(Native Method)
> at java.io.FileOutputStream.write(FileOutputStream.java:318)
> at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
> at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
> at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
> at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
> at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
> at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
> at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
> at
> org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
> at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
> at
> org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
> at
> org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
> at org.apache.log4j.Category.callAppenders(Category.java:206)
> at org.apache.log4j.Category.forcedLog(Category.java:391)
> at org.apache.log4j.Category.log(Category.java:856)
> at
> org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176)
> at
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2391)
> at
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:2312)
> at
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:2919)
> at
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:2894)
> at
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:2976)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:5432)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1061)
> at
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209)
> at
> org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28065)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)