[ https://issues.apache.org/jira/browse/HDDS-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Gui updated HDDS-6890:
---------------------------
    Description: 
We've encountered a potential issue on the read path when we have 
over-replicated EC container replicas after Offline Recovery.

Here is the scenario:
 # 20 DNs, EC 10+4.
 # A background script keeps writing and reading keys (freon ockg & ockv).
 # Another background script randomly kills DNs (fewer than 4 per round), waits for all containers to recover to the expected number of replicas (14), then starts the next round.
 # After 8 DNs have been killed in total, restart all of the previously killed DNs.
 # We now potentially have over-replicated containers.
 # At that point there are concurrent ockv runs and container delete commands in the cluster.
 # During a short window of several minutes, ockv reports a digest mismatch.
 # After all the deletions are done and all containers are back to the expected number of replicas (14), the ockv command succeeds again.

Output of the ockv command:
{code:java}
2022-06-10 13:21:27,923 [main] INFO storage.BlockInputStream: Unable to read 
information for block conID: 12 locID: 109611004723202307 bcsId: 0 from 
pipeline PipelineID=35db70e6-f362-4b3a-ab98-a89509547455: ContainerID 12 does 
not exist
2022-06-10 13:21:28,613 [Thread-4] INFO freon.ProgressBar: Progress: 0.00 % (0 
out of 10)
2022-06-10 13:21:29,614 [Thread-4] INFO freon.ProgressBar: Progress: 0.00 % (0 
out of 10)
2022-06-10 13:21:30,614 [Thread-4] INFO freon.ProgressBar: Progress: 0.00 % (0 
out of 10)
2022-06-10 13:21:31,922 [Thread-4] INFO freon.ProgressBar: Progress: 0.00 % (0 
out of 10)
2022-06-10 13:21:32,922 [Thread-4] INFO freon.ProgressBar: Progress: 0.00 % (0 
out of 10)
2022-06-10 13:21:33,922 [Thread-4] INFO freon.ProgressBar: Progress: 0.00 % (0 
out of 10)
2022-06-10 13:21:33,978 [pool-2-thread-1] ERROR scm.XceiverClientGrpc: Failed 
to execute command cmdType: GetBlock
traceID: ""
containerID: 12
datanodeUuid: "5b542d9f-5e91-4cc2-af48-af3c33b8c472"
getBlock {
  blockID {
    containerID: 12
    localID: 109611004723202307
    blockCommitSequenceId: 0
  }
}
 on the pipeline Pipeline[ Id: 8151bbc7-672c-42d6-9c40-c80ce69d4b3f, Nodes: 
5b542d9f-5e91-4cc2-af48-af3c33b8c472{ip: 9.27.142.97, host: 9.27.142.97, ports: 
[REPLICATION=3096, RATIS=3092, RATIS_ADMIN=3095, RATIS_SERVER=3094, 
STANDALONE=3093], networkLocation: /default-rack, certSerialId: null, 
persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}, 
ReplicationConfig: STANDALONE/ONE, State:CLOSED, leaderId:, 
CreationTimestamp2022-06-10T13:21:33.921+08:00[Asia/Shanghai]].
2022-06-10 13:21:34,014 [pool-2-thread-1] INFO storage.BlockInputStream: Unable 
to read information for block conID: 12 locID: 109611004723202307 bcsId: 0 from 
pipeline PipelineID=8151bbc7-672c-42d6-9c40-c80ce69d4b3f: ContainerID 12 does 
not exist
2022-06-10 13:21:34,142 [pool-2-thread-5] ERROR scm.XceiverClientGrpc: Failed 
to execute command cmdType: GetBlock
traceID: ""
containerID: 12
datanodeUuid: "5b542d9f-5e91-4cc2-af48-af3c33b8c472"
getBlock {
  blockID {
    containerID: 12
    localID: 109611004723202316
    blockCommitSequenceId: 0
  }
}
 on the pipeline Pipeline[ Id: f421fc76-eebb-4474-8c77-c57944144df6, Nodes: 
5b542d9f-5e91-4cc2-af48-af3c33b8c472{ip: 9.27.142.97, host: 9.27.142.97, ports: 
[REPLICATION=3096, RATIS=3092, RATIS_ADMIN=3095, RATIS_SERVER=3094, 
STANDALONE=3093], networkLocation: /default-rack, certSerialId: null, 
persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}, 
ReplicationConfig: STANDALONE/ONE, State:CLOSED, leaderId:, 
CreationTimestamp2022-06-10T13:21:34.074+08:00[Asia/Shanghai]].
2022-06-10 13:21:34,143 [pool-2-thread-5] INFO storage.BlockInputStream: Unable 
to read information for block conID: 12 locID: 109611004723202316 bcsId: 0 from 
pipeline PipelineID=f421fc76-eebb-4474-8c77-c57944144df6: ContainerID 12 does 
not exist
2022-06-10 13:21:34,315 [pool-2-thread-7] ERROR scm.XceiverClientGrpc: Failed 
to execute command cmdType: GetBlock
traceID: ""
containerID: 12
datanodeUuid: "5b542d9f-5e91-4cc2-af48-af3c33b8c472"
getBlock {
  blockID {
    containerID: 12
    localID: 109611004723202310
    blockCommitSequenceId: 0
  }
}
 on the pipeline Pipeline[ Id: 21b59767-0ea9-4214-9f56-61fd31d795f2, Nodes: 
5b542d9f-5e91-4cc2-af48-af3c33b8c472{ip: 9.27.142.97, host: 9.27.142.97, ports: 
[REPLICATION=3096, RATIS=3092, RATIS_ADMIN=3095, RATIS_SERVER=3094, 
STANDALONE=3093], networkLocation: /default-rack, certSerialId: null, 
persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}, 
ReplicationConfig: STANDALONE/ONE, State:CLOSED, leaderId:, 
CreationTimestamp2022-06-10T13:21:34.215+08:00[Asia/Shanghai]].
2022-06-10 13:21:34,315 [pool-2-thread-7] INFO storage.BlockInputStream: Unable 
to read information for block conID: 12 locID: 109611004723202310 bcsId: 0 from 
pipeline PipelineID=21b59767-0ea9-4214-9f56-61fd31d795f2: ContainerID 12 does 
not exist
2022-06-10 13:21:35,402 [Thread-4] INFO freon.ProgressBar: Progress: 0.00 % (0 
out of 10)
2022-06-10 13:21:36,418 [Thread-4] INFO freon.ProgressBar: Progress: 0.00 % (0 
out of 10)
2022-06-10 13:21:37,419 [Thread-4] INFO freon.ProgressBar: Progress: 0.00 % (0 
out of 10)
2022-06-10 13:21:38,001 [pool-2-thread-5] ERROR freon.BaseFreonGenerator: Error 
on executing task 4
java.lang.IllegalStateException: Reference (=first) message digest doesn't 
match with digest of stripe_align_1654804614/4
        at 
org.apache.hadoop.ozone.freon.OzoneClientKeyValidator.validateDigest(OzoneClientKeyValidator.java:169)
        at 
org.apache.hadoop.ozone.freon.OzoneClientKeyValidator.validateKey(OzoneClientKeyValidator.java:133)
        at 
org.apache.hadoop.ozone.freon.BaseFreonGenerator.tryNextTask(BaseFreonGenerator.java:184)
        at 
org.apache.hadoop.ozone.freon.BaseFreonGenerator.taskLoop(BaseFreonGenerator.java:164)
        at 
org.apache.hadoop.ozone.freon.BaseFreonGenerator.lambda$startTaskRunners$1(BaseFreonGenerator.java:147)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
2022-06-10 13:21:38,419 [Thread-4] INFO freon.ProgressBar: Progress: 10.00 % (1 
out of 10)
2022-06-10 13:21:39,419 [Thread-4] INFO freon.ProgressBar: Progress: 100.00 % 
(10 out of 10)
One ore more freon test is failed.
2022-06-10 13:21:39,823 [shutdown-hook-0] INFO metrics: type=TIMER, 
name=key-validate, count=10, min=5918.240321, max=7710.826968, 
mean=6891.752193143025, stddev=481.39096683740087, median=6798.957973, 
p75=7302.544457, p95=7710.826968, p98=7710.826968, p99=7710.826968, 
p999=7710.826968, mean_rate=1.1586365637687632, m1=0.0, m5=0.0, m15=0.0, 
rate_unit=events/second, duration_unit=milliseconds
2022-06-10 13:21:39,824 [shutdown-hook-0] INFO freon.BaseFreonGenerator: Total 
execution time (sec): 22
2022-06-10 13:21:39,824 [shutdown-hook-0] INFO freon.BaseFreonGenerator: 
Failures: 1
2022-06-10 13:21:39,824 [shutdown-hook-0] INFO freon.BaseFreonGenerator: 
Successful executions: 9
 [check_object] size=52428800 prefix=stripe_align_1654804614 fail!!!! {code}
My analysis of the problem follows.

The key clue is the following log line:
{code:java}
2022-06-10 13:21:27,923 [main] INFO storage.BlockInputStream: Unable to read 
information for block conID: 12 locID: 109611004723202307 bcsId: 0 from 
pipeline PipelineID=35db70e6-f362-4b3a-ab98-a89509547455: ContainerID 12 does 
not exist {code}
This is most likely caused by a container delete command that was executed on the DN before the client's ReadChunk request reached it.

`BlockInputStream` then tries to refresh the pipeline and retries the read, see:
{code:java}
// BlockInputStream.java
private void refreshPipeline(IOException cause) throws IOException {
  LOG.info("Unable to read information for block {} from pipeline {}: {}",
      blockID, pipeline.getId(), cause.getMessage());
  if (refreshPipelineFunction != null) {
    LOG.debug("Re-fetching pipeline for block {}", blockID);
    Pipeline newPipeline = refreshPipelineFunction.apply(blockID);
    if (newPipeline == null || newPipeline.sameDatanodes(pipeline)) {
      LOG.warn("No new pipeline for block {}", blockID);
      throw cause;
    } else {
      LOG.debug("New pipeline got for block {}", blockID);
      this.pipeline = newPipeline;
    }
  } else {
    throw cause;
  }
} {code}
However, the pipeline we are refreshing here is a {*}Standalone pipeline{*} (a single-DN read pipeline for one EC replica). When SCM receives the refresh request via the SCM interface `getContainerWithPipelineBatch`, it returns an EC pipeline containing all the DNs of this container, built by `ECPipelineProvider`; since that pipeline holds a different set of datanodes, `sameDatanodes` returns false and the retry goes ahead with it, see:
{code:java}
// ECPipelineProvider.java
@Override public Pipeline createForRead(
    ECReplicationConfig replicationConfig,
    Set<ContainerReplica> replicas) {
  Map<DatanodeDetails, Integer> map = new HashMap<>();
  List<DatanodeDetails> dns = new ArrayList<>(replicas.size());

  for (ContainerReplica r : replicas) {
    map.put(r.getDatanodeDetails(), r.getReplicaIndex());
    dns.add(r.getDatanodeDetails());
  }
  return createPipelineInternal(replicationConfig, dns, map);
}{code}
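Note that `replicas` is a `Set<ContainerReplica>`, so the order of the `dns` list above is essentially arbitrary, and on an over-replicated container it can even contain several datanodes for the same replica index. Below is a minimal, self-contained sketch (plain Java for illustration only, not Ozone code; the datanode names and indexes are made up) of why "first node" then says nothing about the index a read actually needs:
{code:java}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FirstNodeVsReplicaIndex {
  public static void main(String[] args) {
    // Mimics the DatanodeDetails -> replicaIndex map built in createForRead().
    Map<String, Integer> replicaIndex = new LinkedHashMap<>();
    replicaIndex.put("dn-a", 7);  // dn-a holds EC index 7
    replicaIndex.put("dn-b", 1);  // dn-b holds EC index 1, the one this read needs
    replicaIndex.put("dn-c", 1);  // over-replication: a second copy of index 1

    // Mimics the dns list handed to createPipelineInternal().
    List<String> dns = new ArrayList<>(replicaIndex.keySet());

    // "First node" selection, as connect() does below.
    String picked = dns.get(0);
    System.out.println("picked " + picked
        + " holding index " + replicaIndex.get(picked));
    // Prints: picked dn-a holding index 7, although the read expected index 1.
  }
}
{code}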
For the retried read, the client then simply picks the first (or closest) node of that EC pipeline regardless of the replicaIndex it holds, so it may read the wrong chunks and we end up with the digest mismatch above, see:
{code:java}
// XceiverClientGrpc.java
public void connect() throws Exception {
  // connect to the closest node, if closest node doesn't exist, delegate to
  // first node, which is usually the leader in the pipeline.
  DatanodeDetails dn = topologyAwareRead ? this.pipeline.getClosestNode() :
      this.pipeline.getFirstNode();
  // just make a connection to the picked datanode at the beginning
  connectToDatanode(dn);
}{code}
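One possible direction for a fix (just an illustrative sketch, not a patch for these classes; `pickNodeForExpectedIndex`, `expectedReplicaIndex` and `getReplicaIndex` are hypothetical names I am assuming here) would be to make the retried read only target a datanode that actually holds the replica index the stream was created for, and fail otherwise instead of reading arbitrary chunks:
{code:java}
// Illustrative sketch only. It assumes the refreshed pipeline still carries the
// DatanodeDetails -> replicaIndex map populated by ECPipelineProvider and that
// the stream knows which EC index it expects.
private DatanodeDetails pickNodeForExpectedIndex(Pipeline refreshed,
    int expectedReplicaIndex) throws IOException {
  for (DatanodeDetails dn : refreshed.getNodes()) {
    if (refreshed.getReplicaIndex(dn) == expectedReplicaIndex) {
      // Only read from a node that holds the right EC replica index.
      return dn;
    }
  }
  // Surfacing an error is better than silently fetching wrong chunks from
  // whichever node happens to be first in the pipeline.
  throw new IOException("No datanode with replica index " + expectedReplicaIndex
      + " in refreshed pipeline " + refreshed.getId());
}
{code}
With something along these lines, during the over-replicated window above the retry would either hit one of the (possibly multiple) datanodes holding the expected index, or fail cleanly the way the non-refreshed path already does.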
 


> EC: Potential read digest mismatch with over-replicated container.
> ------------------------------------------------------------------
>
>                 Key: HDDS-6890
>                 URL: https://issues.apache.org/jira/browse/HDDS-6890
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Mark Gui
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
