[ https://issues.apache.org/jira/browse/HDFS-11044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yiqun Lin updated HDFS-11044:
-----------------------------
    Affects Version/s: 3.0.0-alpha1

> TestRollingUpgrade fails intermittently
> ---------------------------------------
>
>                 Key: HDFS-11044
>                 URL: https://issues.apache.org/jira/browse/HDFS-11044
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>         Attachments: HDFS-11044.001.patch
>
>
> The test {{TestRollingUpgrade#testRollback}} fails intermittently in trunk
> (https://builds.apache.org/job/PreCommit-HDFS-Build/17250/testReport/).
> The stack info:
> {code}
> java.lang.AssertionError: Test resulted in an unexpected exit
> 	at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1949)
> 	at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1936)
> 	at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1929)
> 	at org.apache.hadoop.hdfs.TestRollingUpgrade.testRollback(TestRollingUpgrade.java:351)
> {code}
> Looking into this, it seems an IOException happens while writing files to the
> NameNode storage directories (see the Jenkins report). The exception is then
> recorded in {{ExitUtil.firstExitException}} and finally thrown when we do the
> cluster's shutdown operations.
> The exception info:
> {code}
> 2016-10-21 12:54:02,300 [main] FATAL hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1946)) - Test resulted in an unexpected exit
> org.apache.hadoop.util.ExitUtil$ExitException: java.io.IOException: All the storage failed while writing properties to VERSION file
> 	at org.apache.hadoop.hdfs.server.namenode.NNStorage.writeAll(NNStorage.java:1151)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.updateStorageVersion(FSImage.java:999)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:850)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:240)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:149)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:838)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:819)
> 	at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293)
> 	at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427)
> {code}
> The IOException occurs because all the storage directories have been removed.
> IMO, one of the reasons is that a failure while writing properties or the
> transaction ID to a storage directory causes that directory to be removed.
> The test {{TestRollingUpgrade#testRollback}} restarts the NameNode many
> times, so such underlying IO exceptions can happen; I'm not sure whether that
> is normal here. But one fix I am sure of: we can use
> {{checkExitOnShutdown(false)}} to skip the ExitException check, as is already
> done in {{TestRollingUpgrade#testRollingUpgradeWithQJM}}. In addition, since
> the shutdown is the last operation in the test, this does not influence the
> current test logic.
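> A minimal sketch of the proposed change, using
> {{MiniDFSCluster.Builder#checkExitOnShutdown}}; the configuration and
> builder settings shown are placeholders, not the exact code in
> {{testRollback}}:
> {code}
> // Placeholder conf/builder setup; reuse the test's existing settings.
> MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
>     .checkExitOnShutdown(false)  // skip the ExitException check on shutdown
>     .build();
> {code}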
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org