Nilotpal Nandi created HDDS-3293:
------------------------------------

             Summary: read operation failing when two container replicas are 
corrupted
                 Key: HDDS-3293
                 URL: https://issues.apache.org/jira/browse/HDDS-3293
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
          Components: Ozone Datanode
            Reporter: Nilotpal Nandi


Steps taken:

1) Mounted the noise injection FUSE on all datanodes.

2) Wrote a key spanning multiple blocks.

3) Selected one of the container IDs and injected errors on 2 of the container 
replicas for that container ID.

4) Ran a GET key operation.

The GET key operation fails intermittently.

Error seen:

-------------

 
{noformat}
20/03/27 18:30:40 WARN impl.MetricsConfig: Cannot locate configuration: tried 
hadoop-metrics2-xceiverclientmetrics.properties,hadoop-metrics2.properties
E 20/03/27 18:30:40 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot 
period at 10 second(s).
E 20/03/27 18:30:40 INFO impl.MetricsSystemImpl: XceiverClientMetrics metrics 
system started
E 20/03/27 18:31:12 ERROR scm.XceiverClientGrpc: Failed to execute command 
cmdType: ReadChunk
E traceID: "f80a51eaec481a1c:cbb8e92869015a53:f80a51eaec481a1c:0"
E containerID: 67
E datanodeUuid: "96101390-2446-40e6-a54e-36e170497e57"
E readChunk {
E blockID {
E containerID: 67
E localID: 103896435892617248
E blockCommitSequenceId: 1010
E }
E chunkData {
E chunkName: "103896435892617248_chunk_28"
E offset: 113246208
E len: 4194304
E checksumData {
E type: CRC32
E bytesPerChecksum: 1048576
E checksums: "\034\376\313\031"
E checksums: ";U\225\037"
E checksums: "\327m\332."
E checksums: "|\307\004E"
E }
E }
E }
E on the pipeline Pipeline[ Id: bce6316c-9690-452b-80e3-0f3590533444, Nodes: 
96101390-2446-40e6-a54e-36e170497e57{ip: 172.27.111.129, host: 
quasar-olrywk-3.quasar-olrywk.root.hwx.site, networkLocation: /default-rack, 
certSerialId: null}3e85204d-2399-43b5-952a-55b837eb4c1d{ip: 172.27.100.0, host: 
quasar-olrywk-1.quasar-olrywk.root.hwx.site, networkLocation: /default-rack, 
certSerialId: null}5af0340a-6fee-4ce8-9f68-37fa35566a5a{ip: 172.27.73.0, host: 
quasar-olrywk-9.quasar-olrywk.root.hwx.site, networkLocation: /default-rack, 
certSerialId: null}, Type:STAND_ALONE, Factor:THREE, State:OPEN, 
leaderId:96101390-2446-40e6-a54e-36e170497e57, 
CreationTimestamp2020-03-27T03:36:51.880Z].
E Unexpected OzoneException: java.io.IOException: 
java.util.concurrent.ExecutionException: 
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: 
deadline exceeded after 84603913ns. [remote_addr=/172.27.73.0:9859]]{noformat}
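
For anyone triaging this, the fields of the failing ReadChunk request can be sanity-checked with a short script. This is only a sketch, not Ozone client code: the constants are copied from the log above, while the function names, the fixed-size 1-based chunk layout, and the big-endian checksum byte order are assumptions made for illustration.

```python
import zlib

# Values copied from the ReadChunk request in the log above.
OFFSET = 113246208
LENGTH = 4194304
BYTES_PER_CHECKSUM = 1048576
CHECKSUMS_IN_REQUEST = 4
DEADLINE_NS = 84603913


def expected_checksum_count(length, bytes_per_checksum):
    """Number of checksum segments needed to cover `length` bytes."""
    return -(-length // bytes_per_checksum)  # ceiling division


def chunk_number(offset, chunk_len):
    """1-based chunk number, ASSUMING fixed-size chunks laid out back to back."""
    return offset // chunk_len + 1


def segment_crc32s(data, bytes_per_checksum):
    """CRC32 of each segment as 4 bytes (big-endian order is an assumption)."""
    return [
        zlib.crc32(data[i:i + bytes_per_checksum]).to_bytes(4, "big")
        for i in range(0, len(data), bytes_per_checksum)
    ]


def corrupt_segments(data, expected, bytes_per_checksum):
    """Indices of segments whose CRC32 differs from the expected checksum."""
    actual = segment_crc32s(data, bytes_per_checksum)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


if __name__ == "__main__":
    print("checksums expected:", expected_checksum_count(LENGTH, BYTES_PER_CHECKSUM))
    print("chunk number from offset:", chunk_number(OFFSET, LENGTH))
    print("deadline in ms:", DEADLINE_NS / 1e6)
```

Under these assumptions the request is internally consistent: a 4194304-byte chunk with 1048576 bytes per checksum needs the 4 checksums listed, and offset 113246208 is exactly 27 chunk lengths, which matches the name chunk_28. The failure reported in the log is the gRPC deadline of roughly 84.6 ms being exceeded, not a checksum mismatch.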