Rui Wang created HDDS-4708:
------------------------------
Summary: Optimization: update RetryCount less frequently (update once per ~100)
Key: HDDS-4708
URL: https://issues.apache.org/jira/browse/HDDS-4708
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Reporter: Rui Wang
Assignee: Rui Wang
SCM maintains a DeleteBlockTransaction table [1]. Each transaction record
in this table carries a retry count [2]. The retry count increases every
time SCM retries the delete transaction; once it exceeds the maximum
limit, SCM stops retrying, and an admin can analyze why some blocks failed
to delete.
Because the count is written to the DB on every retry, I want to discuss
whether it is worth an optimization: maintain the retry count as in-memory
state and write to the DB only when the retry count exceeds the limit
(leaving the record behind for further analysis).
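A minimal sketch of the proposed approach (all class, field, and method
names below are hypothetical illustrations, not Ozone APIs; the persisted
map stands in for the RocksDB-backed transaction table):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: keep retry counts in memory and persist only at the limit.
public class RetryCountCache {
  private final Map<Long, Integer> inMemory = new HashMap<>();
  // Stand-in for the DB-backed DeleteBlockTransaction table write.
  final Map<Long, Integer> persisted = new HashMap<>();
  private final int maxRetries;

  public RetryCountCache(int maxRetries) {
    this.maxRetries = maxRetries;
  }

  /** Returns true if the transaction should still be retried. */
  public boolean onRetry(long txId) {
    int count = inMemory.merge(txId, 1, Integer::sum);
    if (count > maxRetries) {
      // Only now do we pay the (Ratis-replicated) DB write,
      // leaving the final count behind for admin analysis.
      persisted.put(txId, count);
      inMemory.remove(txId);
      return false;
    }
    return true; // common path: no DB write at all
  }
}
```

With a limit of 3, the first three retries of a transaction touch only the
in-memory map; the fourth crosses the limit, triggers the single DB write,
and stops further retries.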
The motivation is that in SCM HA we replicate DB changes over Ratis, so
persisting the retry count on every increment will cost roughly 3x what it
does now.
The drawback of updating the retry count only at the limit is that if SCM
restarts, the in-memory retry counts are cleared and counting starts over.
[1]:
https://github.com/apache/ozone/blob/master/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMMetadataStore.java#L70
[2]:
https://github.com/apache/ozone/blob/master/hadoop-hdds/interface-server/src/main/proto/ScmServerDatanodeHeartbeatProtocol.proto#L331
--
This message was sent by Atlassian Jira
(v8.3.4#803005)