[ 
https://issues.apache.org/jira/browse/HDDS-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng updated HDDS-5956:
-----------------------------
    Attachment: HDDS-5956.001.patch

> Speed up TestOzoneRpcClientAbstract#testZReadKeyWithUnhealthyContainerReplia
> ----------------------------------------------------------------------------
>
>                 Key: HDDS-5956
>                 URL: https://issues.apache.org/jira/browse/HDDS-5956
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: test
>            Reporter: Siyao Meng
>            Assignee: Siyao Meng
>            Priority: Major
>         Attachments: HDDS-5956.001.patch
>
>
> When working on HDDS-5891, I found that 
> {{TestOzoneRpcClientAbstract#testZReadKeyWithUnhealthyContainerReplia}} is 
> remarkably slow.
> This abstract test class is extended by three test classes, one of which 
> explicitly disables this test case.
> For instance, when run locally, the entire {{TestOzoneRpcClient}} took 2 min 15 
> sec, of which {{testZReadKeyWithUnhealthyContainerReplia}} alone took 2 min 9 
> sec. I assume it would take even longer on GitHub Actions machines. 
> The other 70+ test cases in this class mostly took tens of milliseconds 
> each.
> In the 2 min 9 sec run, ~90 seconds were spent waiting for the DNs to be stopped:
> {code:java}
> 2021-11-09 18:03:57,217 [Time-limited test] INFO  ozone.MiniOzoneClusterImpl 
> (MiniOzoneClusterImpl.java:lambda$waitForHddsDatanodesStop$3(389)) - Waiting 
> on 3 datanodes out of 2 to be marked unhealthy.
> ...
> 2021-11-09 18:05:28,622 [Time-limited test] INFO  ozone.MiniOzoneClusterImpl 
> (MiniOzoneClusterImpl.java:lambda$waitForHddsDatanodesStop$3(389)) - Waiting 
> on 3 datanodes out of 2 to be marked unhealthy.
> {code}
> 1. Setting "ozone.scm.stale.node.interval" to 10s (TestReconTasks also does 
> this) for this test alone reduced its run time from 129s to 69s, saving 60s 
> (x2=120s for both test classes).
> 2. Moving the extra 5000ms sleep length into {{GenericTestUtils.waitFor()}} 
> saved another 5s.
> 3. Fix the typo in this test case method name. :)
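
Item 2 above can be read as replacing an unconditional 5000ms sleep with a larger timeout on the polling wait, so the test proceeds as soon as the condition holds. The sketch below illustrates that reading in plain Java. It is not the patch itself: WaitForSketch and its waitFor method are hypothetical stand-ins written for this example (they are not the Hadoop GenericTestUtils API), and the 200 ms condition is simulated.

```java
import java.util.function.BooleanSupplier;

// Hypothetical, self-contained stand-in for a polling wait helper.
// It is NOT GenericTestUtils.waitFor(); names and signature are assumptions.
class WaitForSketch {

    /**
     * Polls check every checkEveryMillis until it returns true or
     * waitForMillis has elapsed. Returns true if the condition held
     * before the deadline, false on timeout or interruption.
     */
    public static boolean waitFor(BooleanSupplier check,
                                  long checkEveryMillis,
                                  long waitForMillis) {
        long deadline = System.nanoTime() + waitForMillis * 1_000_000L;
        while (true) {
            if (check.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(checkEveryMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // Simulated condition: becomes true about 200 ms after start.
        BooleanSupplier ready =
            () -> System.nanoTime() - start >= 200_000_000L;

        // Instead of Thread.sleep(5000) followed by a poll, the extra
        // 5000 ms is added to the timeout budget of the poll itself.
        boolean ok = waitFor(ready, 50, 1000 + 5000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
        System.out.println("condition met: " + ok
            + ", elapsed ms: " + elapsedMs);
    }
}
```

With this shape the full 6000 ms is spent only when the condition never becomes true; in the common case the wait ends at the first poll after the condition holds.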



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
