[ https://issues.apache.org/jira/browse/HDFS-12440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16164214#comment-16164214 ]

Weiwei Yang edited comment on HDFS-12440 at 9/13/17 7:13 AM:
-------------------------------------------------------------

It looks like all DNs were registered:

{noformat}
2017-09-12 12:08:02,726 [main] INFO  ozone.MiniOzoneCluster (MiniOzoneCluster.java:lambda$waitOzoneReady$0(259))      - Waiting for cluster to be ready. Got 0 of 3 DN Heartbeats.
2017-09-12 12:08:03,726 [main] INFO  ozone.MiniOzoneCluster (MiniOzoneCluster.java:lambda$waitOzoneReady$0(259))      - Waiting for cluster to be ready. Got 0 of 3 DN Heartbeats.
2017-09-12 12:08:04,326 [IPC Server handler 18 on 37181] INFO  node.SCMNodeManager (SCMNodeManager.java:register(745))      - Data node with ID: c5906234-d717-45d9-bbe8-972bc4dad260 Registered.
2017-09-12 12:08:04,327 [IPC Server handler 10 on 37181] INFO  node.SCMNodeManager (SCMNodeManager.java:register(745))      - Data node with ID: 2541b3ac-d953-4650-8214-26aa6fd8601e Registered.
2017-09-12 12:08:04,335 [IPC Server handler 11 on 37181] INFO  node.SCMNodeManager (SCMNodeManager.java:register(745))      - Data node with ID: 5dcadc32-d543-4044-9b58-892eeb6880bf Registered.
2017-09-12 12:08:04,727 [main] INFO  ozone.MiniOzoneCluster (MiniOzoneCluster.java:lambda$waitOzoneReady$0(259))      - Cluster is ready. Got 3 of 3 DN Heartbeats.
{noformat}

However, the SCM node manager does not seem to have been properly initialized:

{noformat}
org.apache.hadoop.ozone.protocol.StorageContainerLocationProtocol.allocateContainer from 172.17.0.2:53783
java.lang.NullPointerException
        at org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
        at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
        at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
{noformat}
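
One way an NPE like this can arise is a per-node statistics lookup that returns null for a node that is registered but has no statistics recorded yet. The sketch below only illustrates that failure mode and a null-safe alternative. It is not the actual SCMNodeManager code: the class {{NodeStatRegistry}}, its fields, and all of its method names are hypothetical, and whether this is the real cause here is an assumption.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; NOT the real SCMNodeManager.
class NodeStatRegistry {
  /** Simplified stand-in for per-datanode statistics. */
  static final class NodeStat {
    private final long remaining;
    NodeStat(long remaining) { this.remaining = remaining; }
    long getRemaining() { return remaining; }
  }

  private final Set<String> registered = ConcurrentHashMap.newKeySet();
  private final Map<String, NodeStat> nodeStats = new ConcurrentHashMap<>();

  /** Assumption: registration alone records no statistics. */
  public void register(String nodeId) {
    registered.add(nodeId);
  }

  /** Assumption: statistics only appear once a report is processed. */
  public void updateStat(String nodeId, long remaining) {
    nodeStats.put(nodeId, new NodeStat(remaining));
  }

  /** Throws NullPointerException if no statistics exist for the node. */
  public boolean hasEnoughSpaceUnchecked(String nodeId, long size) {
    return nodeStats.get(nodeId).getRemaining() >= size;
  }

  /** Null-safe variant: a node without statistics is treated as not usable. */
  public boolean hasEnoughSpace(String nodeId, long size) {
    return Optional.ofNullable(nodeStats.get(nodeId))
        .map(stat -> stat.getRemaining() >= size)
        .orElse(false);
  }

  public static void main(String[] args) {
    NodeStatRegistry registry = new NodeStatRegistry();
    registry.register("dn1");
    // Registered, but no statistics yet: the safe check simply says "no".
    System.out.println(registry.hasEnoughSpace("dn1", 10L));
    try {
      registry.hasEnoughSpaceUnchecked("dn1", 10L);
    } catch (NullPointerException e) {
      System.out.println("NPE for a node without statistics");
    }
    registry.updateStat("dn1", 100L);
    System.out.println(registry.hasEnoughSpace("dn1", 10L));
  }
}
```

In this sketch a node without statistics is skipped during placement rather than failing the whole allocation. Whether that is the right behavior for the real placement policy is a separate design question.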


was (Author: cheersyang):
It looks like the UT was failing because of conflicting DFS test directories. I saw the following messages in the log:

{noformat}
[main] INFO  ozone.MiniOzoneCluster (MiniOzoneCluster.java:setConf(125))      - dn2: set dfs.container.ratis.datanode.storage.dir = /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn2_data-1
...
[main] INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:startDataNodes(1596)) - Starting DataNode 2 with dfs.datanode.data.dir: [DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn2_data0,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn2_data1
{noformat}

Then a DataNode failed to lock its storage directory:

{noformat}
2017-09-11 10:54:32,297 [Thread-176] INFO  common.Storage (Storage.java:lock(813)) - Cannot lock storage /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn0_data1. The directory is already locked
2017-09-11 10:54:32,301 [Thread-176] WARN  common.Storage (DataStorage.java:loadDataStorage(410)) - Failed to add storage directory [DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn0_data1
java.io.IOException: Cannot lock storage /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn0_data1. The directory is already locked
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:814)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:622)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:262)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:399)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:379)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:544)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1731)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1691)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:376)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
        at java.lang.Thread.run(Thread.java:748)
{noformat}
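
The "already locked" message above is what one would expect when two components in the same JVM try to take an exclusive lock on the same directory. The following is a minimal sketch of that behavior using plain java.nio file locks, assuming a lock file named {{in_use.lock}} inside the directory. The class {{StorageDirLockDemo}} and its methods are hypothetical and are not the Hadoop Storage implementation.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;

// Hypothetical illustration only; not the Hadoop Storage class.
class StorageDirLockDemo {
  /**
   * Tries to take an exclusive lock on dir/in_use.lock.
   * Returns null if the file is already locked, either by this JVM
   * or by another process.
   */
  static FileLock tryLock(File dir) throws IOException {
    RandomAccessFile file =
        new RandomAccessFile(new File(dir, "in_use.lock"), "rws");
    FileLock lock = null;
    try {
      // Returns null if another process holds the lock.
      lock = file.getChannel().tryLock();
    } catch (OverlappingFileLockException e) {
      // Thrown when this same JVM already holds a lock on the file.
    } finally {
      if (lock == null) {
        file.close();
      }
    }
    return lock;
  }

  /** Releases the lock and closes the underlying channel. */
  static void unlock(FileLock lock) throws IOException {
    lock.release();
    lock.channel().close();
  }

  /** Locks a temp directory twice, then again after releasing the first lock. */
  static String runDemo() {
    try {
      File dir = Files.createTempDirectory("dn_data").toFile();
      FileLock first = tryLock(dir);
      FileLock second = tryLock(dir);   // same directory, same JVM
      unlock(first);
      FileLock third = tryLock(dir);    // free again after the release
      String result = "first=" + (first != null)
          + " second=" + (second != null)
          + " third=" + (third != null);
      if (third != null) {
        unlock(third);
      }
      return result;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(runDemo());
  }
}
```

If two mini clusters (or two DataNodes) are configured with overlapping data directories, the second attempt fails in the same way as the second {{tryLock}} call in this sketch.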

> Ozone: TestAllocateContainer fails on jenkins
> ---------------------------------------------
>
>                 Key: HDFS-12440
>                 URL: https://issues.apache.org/jira/browse/HDFS-12440
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Minor
>
> I am seeing this failure in [this Jenkins report|https://builds.apache.org/job/PreCommit-HDFS-Build/21089/testReport/org.apache.hadoop.ozone.scm/TestAllocateContainer/testAllocate/], with the following error:
> {noformat}
> Stacktrace
> java.lang.NullPointerException
>  at org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
>  at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
>  at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
>  at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
>  at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
>  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>  at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>  at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>  at 
> {noformat}


