[ 
https://issues.apache.org/jira/browse/HDDS-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17793734#comment-17793734
 ] 

Attila Doroszlai commented on HDDS-7880:
----------------------------------------

Leader election keeps failing with one node stuck in {{STARTING}}:

{code}
2023-11-29 08:58:20,198 [omNode-bootstrap-1@group-0AAC5367B30E-LeaderElection63] INFO  impl.LeaderElection (LeaderElection.java:logAndReturn(89)) - omNode-bootstrap-1@group-0AAC5367B30E-LeaderElection63: PRE_VOTE REJECTED received 0 response(s) and 2 exception(s):
2023-11-29 08:58:20,198 [omNode-bootstrap-1@group-0AAC5367B30E-LeaderElection63] INFO  impl.LeaderElection (LogUtils.java:infoOrTrace(136)) -   Exception 0: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
2023-11-29 08:58:20,198 [omNode-bootstrap-1@group-0AAC5367B30E-LeaderElection63] INFO  impl.LeaderElection (LogUtils.java:infoOrTrace(136)) -   Exception 1: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: omNode-bootstrap-2@group-0AAC5367B30E is not in [RUNNING]: current state is STARTING
{code}

The same failure occurs intermittently in a test that performs SCM failover:

{code:title=https://github.com/adoroszlai/ozone-build-results/blob/master/2023/12/05/27325/it-scm/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.TestSecretKeysApi-output.txt}
2023-12-05 20:38:06,036 [ff269865-5117-4d68-ae84-ea2293ac2bc9@group-BA1D23A26742-LeaderElection61] INFO  impl.LeaderElection (LeaderElection.java:logAndReturn(89)) - ff269865-5117-4d68-ae84-ea2293ac2bc9@group-BA1D23A26742-LeaderElection61: PRE_VOTE REJECTED received 0 response(s) and 2 exception(s):
2023-12-05 20:38:06,036 [ff269865-5117-4d68-ae84-ea2293ac2bc9@group-BA1D23A26742-LeaderElection61] INFO  impl.LeaderElection (LogUtils.java:infoOrTrace(136)) -   Exception 0: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
2023-12-05 20:38:06,036 [ff269865-5117-4d68-ae84-ea2293ac2bc9@group-BA1D23A26742-LeaderElection61] INFO  impl.LeaderElection (LogUtils.java:infoOrTrace(136)) -   Exception 1: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: f4ff81fe-c094-46ab-9b0b-ebf0d3a3c35e@group-BA1D23A26742 is not in [RUNNING]: current state is STARTING
{code}

> Intermittent timeout in TestAddRemoveOzoneManager.testBootstrap
> ---------------------------------------------------------------
>
>                 Key: HDDS-7880
>                 URL: https://issues.apache.org/jira/browse/HDDS-7880
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: test
>            Reporter: Attila Doroszlai
>            Priority: Major
>
> {code}
> org.apache.hadoop.ozone.om.TestAddRemoveOzoneManager.testBootstrap  Time elapsed: 98.852 s  <<< ERROR!
> java.util.concurrent.TimeoutException: 
> ...
>   at org.apache.ozone.test.GenericTestUtils.waitFor(GenericTestUtils.java:224)
>   at org.apache.hadoop.ozone.om.TestAddRemoveOzoneManager.testBootstrap(TestAddRemoveOzoneManager.java:188)
> {code}
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2022/01/13/12510/it-ozone/hadoop-ozone/integration-test/org.apache.hadoop.ozone.om.TestAddRemoveOzoneManager.txt
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2022/02/15/13099/it-ozone/hadoop-ozone/integration-test/org.apache.hadoop.ozone.om.TestAddRemoveOzoneManager.txt
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2022/02/26/13444/it-ozone/hadoop-ozone/integration-test/org.apache.hadoop.ozone.om.TestAddRemoveOzoneManager.txt
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2022/09/12/17162/it-om/hadoop-ozone/integration-test/org.apache.hadoop.ozone.om.TestAddRemoveOzoneManager.txt
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2022/12/12/19028/it-om/hadoop-ozone/integration-test/org.apache.hadoop.ozone.om.TestAddRemoveOzoneManager.txt



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
