[ 
https://issues.apache.org/jira/browse/HDDS-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941686#comment-16941686
 ] 

Marton Elek commented on HDDS-2214:
-----------------------------------

@Sammi: Can you please review/confirm?

 

Note: I also updated some of the assertions to get better error messages in 
the builds (using assertEquals instead of assertTrue).
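
To illustrate why this helps, here is a minimal sketch. It assumes JUnit 4 behaviour, where assertTrue(boolean) fails with no message (as in the bare java.lang.AssertionError in the stack trace quoted below) and assertEquals(long, long) reports both values. The class and the two helper methods are hypothetical stand-ins, written so the example runs without JUnit on the classpath; the values 3 and 1 are only illustrative.

```java
final class AssertionMessageSketch {

  // Stand-in for org.junit.Assert.assertTrue(boolean): fails without a message.
  static void assertTrue(boolean condition) {
    if (!condition) {
      throw new AssertionError();
    }
  }

  // Stand-in for org.junit.Assert.assertEquals(long, long): reports both values.
  static void assertEquals(long expected, long actual) {
    if (expected != actual) {
      throw new AssertionError(
          "expected:<" + expected + "> but was:<" + actual + ">");
    }
  }

  // Runs a check and returns the failure message (null if the error has none),
  // or "passed" if the check did not fail.
  static String failureMessage(Runnable check) {
    try {
      check.run();
      return "passed";
    } catch (AssertionError e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    int successCount = 1; // illustrative value only

    // No hint about the actual value in the build log.
    System.out.println(failureMessage(() -> assertTrue(successCount == 3)));

    // Expected and actual values are both visible in the build log.
    System.out.println(failureMessage(() -> assertEquals(3, successCount)));
  }
}
```

With the second form, the failing value is visible directly in the nightly build report, without rerunning the test.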

> TestSCMContainerPlacementRackAware has an intermittent failure
> --------------------------------------------------------------
>
>                 Key: HDDS-2214
>                 URL: https://issues.apache.org/jira/browse/HDDS-2214
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: Marton Elek
>            Assignee: Marton Elek
>            Priority: Major
>
> For example from the nightly build:
> {code:java}
>   <testcase name="testNoFallback[8]" classname="org.apache.hadoop.hdds.scm.container.placement.algorithms.TestSCMContainerPlacementRackAware" time="0.014">
>     <failure type="java.lang.AssertionError">java.lang.AssertionError
>       at org.junit.Assert.fail(Assert.java:86)
>       at org.junit.Assert.assertTrue(Assert.java:41)
>       at org.junit.Assert.assertTrue(Assert.java:52)
>       at org.apache.hadoop.hdds.scm.container.placement.algorithms.TestSCMContainerPlacementRackAware.testNoFallback(TestSCMContainerPlacementRackAware.java:276)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  {code}
> The problem is in testNoFallback:
> Let's say we have 11 nodes (from the test parameter) and we would like to 
> choose 5 nodes (hard-coded in the test).
> As the first two replicas are chosen from the same rack and all the others 
> from different racks, this is not possible, so we expect a failure.
> But we also have an assertion that the success count is at least 3. This is 
> true only if the first two replicas are placed on rack1 (5 nodes) or rack2 
> (5 nodes). If the first replica is placed on rack3 (one node), placement 
> fails immediately:
>  
> Lucky case, where the success count is > 3:
> {code:java}
>  rack1 -- node1 
>  rack1 -- node2 -- FIRST replica
>  rack1 -- node3 -- SECOND replica
>  rack1 -- node4
>  rack1 -- node5 
>  rack2 -- node6
>  rack2 -- node7 -- THIRD replica
>  rack2 -- node8
>  rack2 -- node9 
>  rack2 -- node10
>  rack3 -- node11 -- FOURTH replica{code}
> The specific case where the success count == 1, because we can't choose the 
> second replica on rack3 (this is when the test fails):
> {code:java}
>  rack1 -- node1 
>  rack1 -- node2
>  rack1 -- node3
>  rack1 -- node4
>  rack1 -- node5 
>  rack2 -- node6
>  rack2 -- node7
>  rack2 -- node8
>  rack2 -- node9 
>  rack2 -- node10
>  rack3 -- node11 -- FIRST replica{code}
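
The two cases above can be reproduced with a small sketch. It is only a simplified model of the rule as described in this issue (second replica on the same rack as the first, every further replica on a rack not used yet), not the actual SCMContainerPlacementRackAware code; the class and method names are hypothetical.

```java
final class RackAwareSketch {

  /**
   * Counts how many replicas can be placed before placement fails.
   *
   * @param rackSizes number of nodes on each rack
   * @param firstRack index of the rack that receives the first replica
   * @param requested number of replicas requested
   */
  static int successCount(int[] rackSizes, int firstRack, int requested) {
    if (requested < 1 || rackSizes[firstRack] < 1) {
      return 0;
    }
    int placed = 1; // first replica goes to a node on firstRack
    if (placed == requested) {
      return placed;
    }
    // The second replica needs another node on the same rack.
    if (rackSizes[firstRack] < 2) {
      return placed;
    }
    placed++;
    // Every further replica needs a non-empty rack that is not used yet.
    for (int rack = 0; rack < rackSizes.length && placed < requested; rack++) {
      if (rack != firstRack && rackSizes[rack] > 0) {
        placed++;
      }
    }
    return placed;
  }

  public static void main(String[] args) {
    int[] rackSizes = {5, 5, 1}; // rack1, rack2, rack3 as in the diagrams
    for (int first = 0; first < rackSizes.length; first++) {
      System.out.println("first replica on rack" + (first + 1)
          + " -> success count " + successCount(rackSizes, first, 5));
    }
  }
}
```

Under this model, starting on rack1 or rack2 gives a success count of 4, while starting on rack3 gives 1, which is below the "at least 3" that the test asserts.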



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
