xichen01 commented on code in PR #4988:
URL: https://github.com/apache/ozone/pull/4988#discussion_r1411718829
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DeletedBlockLogImpl.java:
##########
@@ -437,15 +350,26 @@ public DatanodeDeletedBlockTransactions getTransactions(
throws IOException {
lock.lock();
try {
+ // Here we can clean up the Datanode timeout command that no longer
+ // reports heartbeats
+ getSCMDeletedBlockTransactionStatusManager().cleanAllTimeoutSCMCommand(
Review Comment:
@sumitagrawl Thank you for the further review.
Our gap may be in how often `SCMBlockDeletingService` runs. In my
understanding, it is executed every `60s`, and the relevant settings work as
follows:
- `SCMBlockDeletingService timeout` (`OZONE_BLOCK_DELETING_SERVICE_TIMEOUT`):
This value is used to determine whether a run of the service has timed out;
if it has, a warning is logged. It is only used for logging and is not
involved in any other decision.
```java
if (endTime - startTime > serviceTimeoutInNanos) {
  LOG.warn("{} Background task execution took {}ns > {}ns(timeout)",
      serviceName, endTime - startTime, serviceTimeoutInNanos);
}
```
- The `block.deleting.service.interval` determines the period at which the
`SCMBlockDeletingService` runs, which defaults to 60s.
- `scmCommandTimeoutMs` is 300s, i.e. 5 service intervals (300s / 60s). This
means that if a previously sent delete transaction still has no response
after `SCMBlockDeletingService` has run 5 times, SCM will send the delete
again.
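
To make the arithmetic behind the last two points concrete, here is a minimal sketch. The class `ResendTiming` and the method `runsBeforeResend` are hypothetical names used only for illustration, not part of Ozone; the two durations are the values quoted above (60s interval, 300s command timeout).

```java
import java.time.Duration;

// Hypothetical helper, not Ozone code: shows how many service runs fit
// into one command timeout window.
public class ResendTiming {

  // Number of full service intervals that elapse before the command timeout.
  static long runsBeforeResend(Duration serviceInterval,
      Duration commandTimeout) {
    return commandTimeout.toMillis() / serviceInterval.toMillis();
  }

  public static void main(String[] args) {
    // Values as stated in this comment; treat them as assumptions.
    Duration interval = Duration.ofSeconds(60);  // service interval
    Duration timeout = Duration.ofSeconds(300);  // scmCommandTimeoutMs
    System.out.println(runsBeforeResend(interval, timeout)); // prints 5
  }
}
```

With these values the service runs 5 times inside one timeout window, so an unanswered transaction would be eligible for resending on roughly the fifth run after it was first sent.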
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]