[ https://issues.apache.org/jira/browse/HDFS-12440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16164214#comment-16164214 ]
Weiwei Yang edited comment on HDFS-12440 at 9/13/17 7:13 AM:
-------------------------------------------------------------
It looks like all DNs were registered:
{noformat}
2017-09-12 12:08:02,726 [main] INFO ozone.MiniOzoneCluster (MiniOzoneCluster.java:lambda$waitOzoneReady$0(259)) - Waiting for cluster to be ready. Got 0 of 3 DN Heartbeats.
2017-09-12 12:08:03,726 [main] INFO ozone.MiniOzoneCluster (MiniOzoneCluster.java:lambda$waitOzoneReady$0(259)) - Waiting for cluster to be ready. Got 0 of 3 DN Heartbeats.
2017-09-12 12:08:04,326 [IPC Server handler 18 on 37181] INFO node.SCMNodeManager (SCMNodeManager.java:register(745)) - Data node with ID: c5906234-d717-45d9-bbe8-972bc4dad260 Registered.
2017-09-12 12:08:04,327 [IPC Server handler 10 on 37181] INFO node.SCMNodeManager (SCMNodeManager.java:register(745)) - Data node with ID: 2541b3ac-d953-4650-8214-26aa6fd8601e Registered.
2017-09-12 12:08:04,335 [IPC Server handler 11 on 37181] INFO node.SCMNodeManager (SCMNodeManager.java:register(745)) - Data node with ID: 5dcadc32-d543-4044-9b58-892eeb6880bf Registered.
2017-09-12 12:08:04,727 [main] INFO ozone.MiniOzoneCluster (MiniOzoneCluster.java:lambda$waitOzoneReady$0(259)) - Cluster is ready. Got 3 of 3 DN Heartbeats.
{noformat}
However, the SCM node manager does not seem to have been properly initialized:
{noformat}
org.apache.hadoop.ozone.protocol.StorageContainerLocationProtocol.allocateContainer from 172.17.0.2:53783
java.lang.NullPointerException
at org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
{noformat}
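One way to read the trace above is that a datanode can be registered and counted as ready before any stats are recorded for it, so a stats lookup returns null and the placement check dereferences it. This is only an assumption drawn from the trace, not a confirmed root cause. The sketch below shows a defensive version of such a lookup; the class NodeStatSketch and the methods reportStat and getRemaining are hypothetical and are not the actual SCMNodeManager or SCMCommonPolicy code.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class NodeStatSketch {
    // Hypothetical stand-in for per-datanode stats; not the real SCMNodeManager state.
    private final Map<String, Long> remainingBytes = new ConcurrentHashMap<>();

    // Records the remaining space reported by a datanode.
    // Assumption: a node may be registered before its first report arrives.
    public void reportStat(String datanodeId, long remaining) {
        remainingBytes.put(datanodeId, remaining);
    }

    // Returns an empty Optional, rather than null, when no stats are known for the node.
    public Optional<Long> getRemaining(String datanodeId) {
        return Optional.ofNullable(remainingBytes.get(datanodeId));
    }

    // Treats a node without stats as not having enough space instead of dereferencing null.
    public boolean hasEnoughSpace(String datanodeId, long sizeRequired) {
        return getRemaining(datanodeId)
            .map(remaining -> remaining >= sizeRequired)
            .orElse(false);
    }
}
```

With a guard like this, a node that has no stats yet is simply skipped during placement; whether that is the right behavior for SCM, or whether the test should instead wait until stats are available, is a separate question.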
was (Author: cheersyang):
It looks like the UT was failing because of a conflict over the dfs test dir. I saw
the following messages in the log:
{noformat}
[main] INFO ozone.MiniOzoneCluster (MiniOzoneCluster.java:setConf(125)) - dn2: set dfs.container.ratis.datanode.storage.dir = /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn2_data-1
...
[main] INFO hdfs.MiniDFSCluster (MiniDFSCluster.java:startDataNodes(1596)) - Starting DataNode 2 with dfs.datanode.data.dir: [DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn2_data0,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn2_data1
{noformat}
then ...
{noformat}
2017-09-11 10:54:32,297 [Thread-176] INFO common.Storage (Storage.java:lock(813)) - Cannot lock storage /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn0_data1. The directory is already locked
2017-09-11 10:54:32,301 [Thread-176] WARN common.Storage (DataStorage.java:loadDataStorage(410)) - Failed to add storage directory [DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn0_data1
java.io.IOException: Cannot lock storage /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn0_data1. The directory is already locked
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:814)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:622)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:262)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:399)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:379)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:544)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1731)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1691)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:376)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
at java.lang.Thread.run(Thread.java:748)
{noformat}
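For readers unfamiliar with this kind of "already locked" failure, the following is a minimal sketch of how a second attempt to lock the same directory from within one JVM is rejected. It only illustrates the general mechanism using java.nio file locks; the class DirLockSketch, the method tryLockDir, and the use of a lock file named in_use.lock are assumptions made for the example and are not taken from Storage.java.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirLockSketch {
    // Tries to take an exclusive lock on a lock file inside dir.
    // Returns the lock on success, or null if the directory is already locked.
    static FileLock tryLockDir(Path dir) throws IOException {
        Path lockFile = dir.resolve("in_use.lock"); // illustrative lock file name
        FileChannel channel = FileChannel.open(
            lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        try {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                // Another process holds the lock.
                channel.close();
            }
            return lock;
        } catch (OverlappingFileLockException e) {
            // Another channel in this same JVM already holds the lock.
            channel.close();
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("dn_data");
        FileLock first = tryLockDir(dir);
        FileLock second = tryLockDir(dir);
        System.out.println("first acquired: " + (first != null));
        System.out.println("second acquired: " + (second != null));
        if (first != null) {
            first.release();
            first.channel().close();
        }
    }
}
```

In a mini cluster, where all datanodes run inside one JVM, two components configured with the same storage directory would collide in this way, which is consistent with the log above.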
> Ozone: TestAllocateContainer fails on jenkins
> ---------------------------------------------
>
> Key: HDFS-12440
> URL: https://issues.apache.org/jira/browse/HDFS-12440
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Minor
>
> I am seeing this failure in [this Jenkins
> report|https://builds.apache.org/job/PreCommit-HDFS-Build/21089/testReport/org.apache.hadoop.ozone.scm/TestAllocateContainer/testAllocate/],
> with the following error:
> {noformat}
> Stacktrace
> java.lang.NullPointerException
> at org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
> at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
> at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
> at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
> at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
> at
> {noformat}