adoroszlai commented on code in PR #3810:
URL: https://github.com/apache/ozone/pull/3810#discussion_r1027139834
##########
hadoop-ozone/dist/src/main/compose/ozonesecure/test.sh:
##########
@@ -27,7 +27,7 @@ export SECURITY_ENABLED=true
: ${OZONE_BUCKET_KEY_NAME:=key1}
-start_docker_env
+start_docker_env 5
Review Comment:
This causes intermittent but frequent failures in the replication test.
https://github.com/apache/ozone/blob/85e7cd1867ec9000df798c74ab0f9cf153936a5d/hadoop-ozone/dist/src/main/compose/ozonesecure/test.sh#L57-L61
The test scales datanodes to 2, waits for container replica count = 2, then
scales to 3, and waits for replica count = 3.
With 3 initial datanodes, the test can assume every node holds a replica of
the container, so after scaling to 2 and then to 3 datanodes the replica
count always matches expectations, provided replication works correctly.
Now with 5 initial datanodes, when the test scales datanodes to 2, the
container may have 0, 1 or 2 healthy replicas left, depending on where the
original 3 replicas were stored. And when datanodes are scaled to 3, the
container may have 1, 2 or 3 replicas.
Thus the test fails frequently, but not in 100% of runs.
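
To make the counting concrete, here is a rough sketch that enumerates every
way to place 3 replicas on the initial datanodes and tallies how many replicas
remain on the nodes that survive the scale-down. The function name
`replica_distribution` is hypothetical, and the sketch assumes that datanodes
1..N are the ones kept and that every placement is possible. It does not model
Ozone's actual placement policy, so the output is a count of placements, not a
failure probability.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Ozone test library.
# Usage: replica_distribution <initial_datanodes> <surviving_datanodes>
# Prints four numbers: how many placements of 3 replicas leave
# 0, 1, 2 and 3 replicas on the surviving datanodes.
# Assumption: the surviving datanodes are numbered 1..<surviving_datanodes>.
replica_distribution() {
  total=$1
  survivors=$2
  n0=0 n1=0 n2=0 n3=0
  a=1
  # Enumerate all node triples a < b < c, one triple per placement.
  while [ "$a" -le "$total" ]; do
    b=$((a + 1))
    while [ "$b" -le "$total" ]; do
      c=$((b + 1))
      while [ "$c" -le "$total" ]; do
        # Count the replicas that sit on surviving nodes.
        left=0
        for node in "$a" "$b" "$c"; do
          if [ "$node" -le "$survivors" ]; then
            left=$((left + 1))
          fi
        done
        case $left in
          0) n0=$((n0 + 1)) ;;
          1) n1=$((n1 + 1)) ;;
          2) n2=$((n2 + 1)) ;;
          3) n3=$((n3 + 1)) ;;
        esac
        c=$((c + 1))
      done
      b=$((b + 1))
    done
    a=$((a + 1))
  done
  echo "$n0 $n1 $n2 $n3"
}

replica_distribution 3 2   # prints: 0 0 1 0
replica_distribution 5 2   # prints: 1 6 3 0
```

Under these assumptions, with 3 initial datanodes the single possible
placement always leaves 2 replicas after scaling to 2, while with 5 initial
datanodes only 3 of the 10 placements do.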
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]