[
https://issues.apache.org/jira/browse/HDDS-2365?focusedWorklogId=333903&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333903
]
ASF GitHub Bot logged work on HDDS-2365:
----------------------------------------
Author: ASF GitHub Bot
Created on: 25/Oct/19 05:59
Start Date: 25/Oct/19 05:59
Worklog Time Spent: 10m
Work Description: adoroszlai commented on pull request #84: HDDS-2365.
Fix TestRatisPipelineProvider#testCreatePipelinesDnExclude
URL: https://github.com/apache/hadoop-ozone/pull/84
## What changes were proposed in this pull request?
Fix TestRatisPipelineProvider#testCreatePipelinesDnExclude, which has been
failing intermittently.
The test:
1. creates 12 fake datanodes (8 healthy, 3 unhealthy, 1 healthy)
2. "manually" creates 3 pipelines (2 open, 1 closed) from the first 9 nodes
(as per `getAllNodes()`)
3. tests that when `RatisPipelineProvider` is asked for another pipeline,
it must select at least 1 node from the closed pipeline (it cannot select the 6
nodes used for open pipelines and it cannot select the 3 unhealthy nodes).
The problem is that `getAllNodes()` returns nodes in ["random"
order](https://github.com/apache/hadoop-ozone/blob/d7e2fb19759b7d447bc92558f2ab5504e047f5db/hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/MockNodeManager.java#L194-L195),
so the manually created pipelines may contain unhealthy nodes, too. It may
happen that all 3 unhealthy nodes are used for the existing pipelines and 3
healthy ones are left free.
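To get a feel for how often that unlucky ordering can occur, here is a minimal sketch that enumerates every position the 3 unhealthy nodes could take among the 12 nodes. It assumes every ordering is equally likely, which `getAllNodes()` does not promise. The class and method names are hypothetical and not part of the Ozone code base.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not code from the Ozone test suite. */
final class ExcludeTestOrderingSketch {

  static final int NODE_COUNT = 12;    // fake datanodes in the original test
  static final int PIPELINE_NODES = 9; // first 9 nodes go into the 3 manual pipelines

  /** Enumerates every set of positions the 3 unhealthy nodes can occupy. */
  static List<int[]> unhealthyPlacements() {
    List<int[]> placements = new ArrayList<>();
    for (int a = 0; a < NODE_COUNT; a++) {
      for (int b = a + 1; b < NODE_COUNT; b++) {
        for (int c = b + 1; c < NODE_COUNT; c++) {
          placements.add(new int[] {a, b, c});
        }
      }
    }
    return placements;
  }

  /** True if every node left outside the manual pipelines is healthy. */
  static boolean freeNodesAllHealthy(int[] unhealthyPositions) {
    for (int position : unhealthyPositions) {
      if (position >= PIPELINE_NODES) {
        return false; // an unhealthy node stayed outside the pipelines
      }
    }
    return true;
  }

  /** Counts placements in which all 3 unhealthy nodes land in the manual pipelines. */
  static long countBadPlacements() {
    return unhealthyPlacements().stream()
        .filter(ExcludeTestOrderingSketch::freeNodesAllHealthy)
        .count();
  }
}
```

Under that uniform-ordering assumption, `countBadPlacements()` returns 84 out of 220 placements (about 38%). In those cases 3 healthy nodes are left free, so the provider can build a pipeline without touching the closed one and the assertion can fail.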
Changes to fix the test:
1. use the same node count (8+2 = 10) as all other tests in the class
2. only create 1 open and 1 closed "manual" pipeline
3. use only healthy nodes for "manual" pipelines; this leaves 2 healthy and
2 unhealthy nodes
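The node arithmetic behind the fixed setup can be sketched as follows. This is only an illustration under stated assumptions: nodes are modelled as plain health flags, the shuffle stands in for the "random" order of `getAllNodes()`, and all names are hypothetical rather than taken from the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch; the names here are not from the Ozone code base. */
final class FixedExcludeSetupSketch {

  static final int PIPELINE_SIZE = 3; // nodes per pipeline, as in the test

  /** Returns node health flags (true = healthy) in a shuffled order. */
  static List<Boolean> shuffledNodes(int healthy, int unhealthy, long seed) {
    List<Boolean> nodes = new ArrayList<>();
    for (int i = 0; i < healthy; i++) {
      nodes.add(true);
    }
    for (int i = 0; i < unhealthy; i++) {
      nodes.add(false);
    }
    Collections.shuffle(nodes, new Random(seed));
    return nodes;
  }

  /** Indexes of the healthy nodes, in list order. */
  static List<Integer> healthyIndexes(List<Boolean> nodes) {
    List<Integer> indexes = new ArrayList<>();
    for (int i = 0; i < nodes.size(); i++) {
      if (nodes.get(i)) {
        indexes.add(i);
      }
    }
    return indexes;
  }

  /** Healthy nodes left once one open and one closed pipeline take healthy nodes only. */
  static List<Integer> freeHealthyNodes(List<Boolean> nodes) {
    List<Integer> healthy = healthyIndexes(nodes);
    int used = 2 * PIPELINE_SIZE; // first 3 healthy -> open, next 3 healthy -> closed
    return new ArrayList<>(healthy.subList(used, healthy.size()));
  }

  /** A new pipeline must reuse closed-pipeline nodes if too few healthy nodes are free. */
  static boolean newPipelineMustReuseClosedNodes(List<Boolean> nodes) {
    return freeHealthyNodes(nodes).size() < PIPELINE_SIZE;
  }
}
```

Because the manual pipelines are filled from healthy nodes only, the number of free healthy nodes no longer depends on the node order: with 8 healthy and 2 unhealthy nodes it is always 2, which is too few for a 3-node pipeline.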
Misc changes:
* cleanup (extract some common logic, static imports, etc.)
* `TestRatisPipelineProvider` does not use MiniOzoneCluster or other
integration-test features, and does not take a long time to run, so it can be
moved out of the integration tests and into the real unit tests
https://issues.apache.org/jira/browse/HDDS-2365
## How was this patch tested?
Ran the test 100+ times without failures.
Issue Time Tracking
-------------------
Worklog Id: (was: 333903)
Remaining Estimate: 0h
Time Spent: 10m
> TestRatisPipelineProvider#testCreatePipelinesDnExclude is flaky
> ---------------------------------------------------------------
>
> Key: HDDS-2365
> URL: https://issues.apache.org/jira/browse/HDDS-2365
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: test
> Reporter: Attila Doroszlai
> Assignee: Attila Doroszlai
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> TestRatisPipelineProvider#testCreatePipelinesDnExclude is flaky, failing in
> CI intermittently:
> * https://github.com/elek/ozone-ci-03/blob/master/pr/pr-hdds-2360-9pxww/integration/hadoop-ozone/integration-test/org.apache.hadoop.hdds.scm.pipeline.TestRatisPipelineProvider.txt
> * https://github.com/elek/ozone-ci-03/blob/master/pr/pr-hdds-2352-cxhw9/integration/hadoop-ozone/integration-test/org.apache.hadoop.hdds.scm.pipeline.TestRatisPipelineProvider.txt