[ 
https://issues.apache.org/jira/browse/HDDS-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Doroszlai updated HDDS-6061:
-----------------------------------
    Summary: Peer datanode cannot add group for pipeline in secure env   (was: 
Intermittent failure to write data in secure env due to async pipeline creation)

> Peer datanode cannot add group for pipeline in secure env 
> ----------------------------------------------------------
>
>                 Key: HDDS-6061
>                 URL: https://issues.apache.org/jira/browse/HDDS-6061
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Attila Doroszlai
>            Assignee: Attila Doroszlai
>            Priority: Major
>              Labels: pull-request-available
>
> Extracted from HDDS-3907.
> Secure acceptance tests intermittently fail in test cases that write data.
> See 
> https://github.com/elek/ozone-build-results/tree/master/2021/08/19/9810/acceptance-secure
> for logs.
> {code:title=https://github.com/apache/ozone/runs/3368353893#step:5:126}
> Start freon testing                                                   | FAIL |
> {code}
> {code:title=robot log.html}
> 07:19:23.258  INFO    Running command 'ozone freon randomkeys 
> --num-of-volumes 5 --num-of-buckets 5 --num-of-keys 5 --num-of-threads 1 
> --replication-type RATIS --factor THREE --validate-writes 2>&1'.       
> 07:24:23.225  FAIL    Test timeout 5 minutes exceeded.
> {code}
> {code}
> datanode_3  | 2021-08-19 05:20:09,598 
> [java.util.concurrent.ThreadPoolExecutor$Worker@5f5ccab7[State = -1, empty 
> queue]] WARN server.GrpcLogAppender: 
> 1c7f86b2-ded3-441b-9f20-84ba3ff60d2d@group-74FBCD15D899->25dd9de7-1caa-448d-a35a-2b29afced1cc-GrpcLogAppender:
>   appendEntries Timeout, 
> request=AppendEntriesRequest:cid=8,entriesCount=1,lastEntry=(t:3, i:0)
> ...
> datanode_3  | 2021-08-19 05:23:56,577 [Thread-181] INFO 
> client.GrpcClientProtocolService: Failed 
> RaftClientRequest:client-14C4D4C86555->1c7f86b2-ded3-441b-9f20-84ba3ff60d2d@group-74FBCD15D899,
>  cid=102, seq=0, Watch-ALL_COMMITTED(131), Message:<EMPTY>, 
> reply=RaftClientReply:client-14C4D4C86555->1c7f86b2-ded3-441b-9f20-84ba3ff60d2d@group-74FBCD15D899,
>  cid=102, FAILED org.apache.ratis.protocol.exceptions.NotReplicatedException: 
> Request with call Id 102 and log index 131 is not yet replicated to 
> ALL_COMMITTED, logIndex=131, 
> commits[1c7f86b2-ded3-441b-9f20-84ba3ff60d2d:c132, 
> 64230e6f-d613-4ced-8084-22c404c29d15:c132, 
> 25dd9de7-1caa-448d-a35a-2b29afced1cc:c127]
> {code}
> {code}
> datanode_2  | 2021-08-19 05:18:42,242 [Command processor thread] WARN 
> commandhandler.CreatePipelineCommandHandler: Add group failed for 
> 1c7f86b2-ded3-441b-9f20-84ba3ff60d2d{ip: 172.18.0.9, host: 
> ozonesecure_datanode_3.ozonesecure_default, ports: [REPLICATION=9886, 
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], 
> networkLocation: /default-rack, certSerialId: null, persistedOpState: 
> IN_SERVICE, persistedOpStateExpiryEpochSec: 0}
> datanode_2  | java.io.IOException: 
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: 
> Network closed for unknown reason
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

