[
https://issues.apache.org/jira/browse/HADOOP-5573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695642#action_12695642
]
Konstantin Shvachko commented on HADOOP-5573:
---------------------------------------------
The first two bugs (NPE) are fixed by HADOOP-5119.
The story here is that {{testBackupRegistration()}} starts two backup nodes one
after another. The first one keeps making checkpoints, but the second is just
initializing. During initialization it creates a new {{FSNamesystem}} instance,
whose constructor first sets the static variable {{fsNamesystemObject}} to null.
It takes time for the BackupNode to finish initializing and set
{{fsNamesystemObject = this}}.
In the meantime the first backup node starts a checkpoint, which accesses
{{FSNamesystem}} via {{fsNamesystemObject}}. Since the variable is static, it
holds the value the second node assigned to it, which is null at that moment.
Hence the different NPEs, depending on the timing of the checkpoint.
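The race above can be sketched as follows. This is a minimal stand-alone illustration with hypothetical names ({{FSNamesystemSketch}}, {{finishInit()}}, {{RaceDemo}}), not Hadoop's actual code: the constructor nulls the shared static first, and only the end of initialization repairs it, leaving a window in which a concurrent checkpoint reads null.

```java
// Hypothetical sketch of the static-singleton race; not Hadoop's real code.
class FSNamesystemSketch {
    // Shared static reference, as fsNamesystemObject was before HADOOP-5119.
    static FSNamesystemSketch fsNamesystemObject;

    FSNamesystemSketch() {
        // The constructor begins by clearing the static slot...
        fsNamesystemObject = null;
        // ...and lengthy BackupNode initialization would run here.
    }

    void finishInit() {
        // Only after initialization completes is the static slot set.
        fsNamesystemObject = this;
    }

    static FSNamesystemSketch getFSNamesystem() {
        // A checkpoint from the other backup node reads through this accessor.
        return fsNamesystemObject;
    }
}

public class RaceDemo {
    public static void main(String[] args) {
        FSNamesystemSketch first = new FSNamesystemSketch();
        first.finishInit(); // first backup node is fully up and checkpointing

        // Second backup node is constructed but not yet initialized:
        // its constructor has just nulled the shared static.
        FSNamesystemSketch second = new FSNamesystemSketch();
        if (FSNamesystemSketch.getFSNamesystem() != null) {
            throw new AssertionError("expected null during the init window");
        }
        // Any checkpoint running in this window dereferences null -> NPE.

        second.finishInit();
        if (FSNamesystemSketch.getFSNamesystem() != second) {
            throw new AssertionError("expected second node after init");
        }
        System.out.println("null window observed during second node init");
    }
}
```

Because the field is static, the second node's constructor silently invalidates the reference the first node's checkpoint thread is still using, which is why the stack trace of the NPE varies with when the checkpoint happens to run.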
We should not see these again, since HADOOP-5119 eliminated
{{fsNamesystemObject}}.
The third error is also gone, because {{processIOError()}} was recently changed
by HADOOP-4045.
But I am still looking at it; I am getting some strange asserts there.
> TestBackupNode sometimes fails
> ------------------------------
>
> Key: HADOOP-5573
> URL: https://issues.apache.org/jira/browse/HADOOP-5573
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Reporter: Tsz Wo (Nicholas), SZE
>
> TestBackupNode may fail for different reasons:
> - Unable to open edit log file
> .\build\test\data\dfs\name-backup1\current\edits (FSEditLog.java:open(371))
> - NullPointerException at
> org.apache.hadoop.hdfs.server.namenode.EditLogBackupOutputStream.flushAndSync(EditLogBackupOutputStream.java:163)
> - Fatal Error : All storage directories are inaccessible.
> Will provide more information in the comments.