[ https://issues.apache.org/jira/browse/HDFS-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505894#comment-13505894 ]
Kihwal Lee commented on HDFS-4233:
----------------------------------
The NN log shows the following. The underlying FileJournalManager reported a
FileNotFoundException when the namenode process ran out of file descriptors.
When the previous edit log files were closed, the descriptors they released
were quickly taken by socket connections, etc.
{panel}
2012-xx-yy 00:00:00,000 [IPC Server handler 00 on 0000] INFO org.apache.hadoop.ipc.Server: IPC Server handler 00 on 00000, call: rollEditLog(), rpc version=1, client version=6, methodsFingerPrint=403308677 from 1.1.1.1:12345, error:
java.io.IOException: Unable to start log segment 12345678: no journals successfully started.
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegment(FSEditLog.java:840)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.rollEditLog(FSEditLog.java:802)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.rollEditLog(FSImage.java:911)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:3494)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:644)
        at sun.reflect.GeneratedMethodAccessor111.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:394)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1530)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1526)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1524)
{panel}
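
To make the failure mode concrete, here is a minimal, self-contained sketch of the behavior one might want instead. It is not the actual FSEditLog code: the class RollEditLogSketch, the Journal interface, and the serving flag are all hypothetical names made up for illustration. The sketch assumes that a single failed journal is tolerable, but that a roll in which no journal starts should stop the process from serving rather than let it keep modifying state without logging.

```java
import java.io.IOException;
import java.util.List;

/** Illustrative sketch only; none of these names are Hadoop APIs. */
final class RollEditLogSketch {

    /** Hypothetical stand-in for one edit log destination. */
    interface Journal {
        void startLogSegment(long txId) throws IOException;
    }

    private final List<Journal> journals;
    private boolean serving = true;

    RollEditLogSketch(List<Journal> journals) {
        this.journals = journals;
    }

    boolean isServing() {
        return serving;
    }

    /** Tries every journal; fails only if none of them could be started. */
    int startLogSegment(long txId) throws IOException {
        int started = 0;
        for (Journal journal : journals) {
            try {
                journal.startLogSegment(txId);
                started++;
            } catch (IOException e) {
                // A single failed journal is tolerated here (assumption).
            }
        }
        if (started == 0) {
            throw new IOException("Unable to start log segment " + txId
                    + ": no journals successfully started.");
        }
        return started;
    }

    /**
     * Rolls the edit log. If no journal starts, the sketch stops serving
     * (assumption: the condition is treated as fatal) and rethrows.
     */
    void rollEditLog(long txId) throws IOException {
        try {
            startLogSegment(txId);
        } catch (IOException e) {
            serving = false;
            throw e;
        }
    }
}
```

With a journal list in which every entry throws (for example a FileNotFoundException caused by descriptor exhaustion), rollEditLog() in this sketch rethrows the exception and isServing() returns false afterwards, which is the opposite of what the log above shows.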
> NN keeps serving even after no journals started while rolling edit
> ------------------------------------------------------------------
>
> Key: HDFS-4233
> URL: https://issues.apache.org/jira/browse/HDFS-4233
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.23.5
> Reporter: Kihwal Lee
> Priority: Critical
>
> We've seen the namenode keep serving even after a rollEditLog() failure. Instead
> of taking corrective action or treating this condition as FATAL, it keeps on
> serving and modifying its file system state. No edit logs are written from this
> point on, so if the namenode is restarted, there will be data loss.