[ https://issues.apache.org/jira/browse/HDFS-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718325#comment-14718325 ]

kanaka kumar avvaru commented on HDFS-8973:
-------------------------------------------

Regarding the logs not being printed: by default, log4j reports [only the first error in an appender|http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/helpers/OnlyOnceErrorHandler.html] and does not attempt any recovery of the log file.
So it's recommended to configure {{FallbackErrorHandler}} or some other alternative to ensure logs are not missed.
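As a minimal sketch of what that could look like (assuming an XML-based log4j.xml / {{DOMConfigurator}} setup, since error handlers can't be declared in log4j.properties; the appender names here are hypothetical and the {{console}} appender must be defined elsewhere in the same file):

{code:title=log4j.xml|borderStyle=solid}
<appender name="DRFA" class="org.apache.log4j.DailyRollingFileAppender">
  <!-- On appender failure, FallbackErrorHandler switches the referenced
       loggers over to the fallback appender instead of silently dropping
       all messages after the first error -->
  <errorHandler class="org.apache.log4j.varia.FallbackErrorHandler">
    <root-ref/>
    <appender-ref ref="console"/>
  </errorHandler>
  <param name="File" value="${hadoop.log.dir}/${hadoop.log.file}"/>
  <layout class="org.apache.log4j.PatternLayout">
    <param name="ConversionPattern" value="%d{ISO8601} %p %c: %m%n"/>
  </layout>
</appender>
{code}

With this, a {{Bad file descriptor}} on the file appender would redirect the root logger's output to the console appender rather than losing it.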

Regarding the process exit, we are still missing something about the cause. Even after 
the log4j error, the system functioned well for some time, and the actual reason may 
not be visible since the logs are not present.

{quote}it seems cause by log4j ERROR.{quote}
IMO we can't conclude this is the reason for the process exit, as the NN appears to 
keep functioning for some time after this message as well.

> NameNode exit without any exception log
> ---------------------------------------
>
>                 Key: HDFS-8973
>                 URL: https://issues.apache.org/jira/browse/HDFS-8973
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: He Xiaoqiao
>            Priority: Critical
>
> The namenode process exits without any useful WARN/ERROR log, and after the .log 
> file output is interrupted, the .out file continues to show about 5 min of GC log. 
> When the .log file is interrupted, the .out file prints the following ERROR, which 
> may hint at some info. It seems caused by a log4j ERROR.
> {code:title=namenode.out|borderStyle=solid}
> log4j:ERROR Failed to flush writer,
> java.io.IOException: Bad file descriptor ("错误的文件描述符")
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:318)
>         at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
>         at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
>         at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
>         at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
>         at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
>         at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
>         at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
>         at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
>         at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
>         at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
>         at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
>         at org.apache.log4j.Category.callAppenders(Category.java:206)
>         at org.apache.log4j.Category.forcedLog(Category.java:391)
>         at org.apache.log4j.Category.log(Category.java:856)
>         at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2391)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:2312)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:2919)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:2894)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:2976)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:5432)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1061)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209)
>         at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28065)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
