Ethan Rose created HDDS-13245:
---------------------------------
Summary: Container scanner needs to account for deleted blocks
when building the merkle tree
Key: HDDS-13245
URL: https://issues.apache.org/jira/browse/HDDS-13245
Project: Apache Ozone
Issue Type: Sub-task
Reporter: Aswin Shakil
Assignee: Ethan Rose
Currently the container scanner rebuilds the merkle tree based only on what it
sees on disk. This will cause the data checksum to change when blocks are
deleted, which is not desirable. The scanner should also consult the list of
deleted blocks already stored in the checksum file and add those to the tree it
generates.
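
To illustrate the intent, here is a minimal sketch of a scanner folding persisted deleted-block checksums into the tree it builds, so the container-level checksum stays the same before and after a delete. All names here ({{ScannerTreeSketch}}, {{mergeBlocks}}, {{containerChecksum}}) are hypothetical and are not Ozone's actual classes. CRC32 stands in for whatever checksum the real tree uses, and letting the persisted deleted entry win over an on-disk entry is an assumption of this sketch.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical sketch only; not Ozone's scanner or merkle tree implementation.
final class ScannerTreeSketch {
  private ScannerTreeSketch() { }

  /**
   * Merges block checksums the scanner computed from disk with the block
   * checksums persisted for deleted blocks (block ID -> block checksum).
   * Assumption: if an ID is in both maps, the persisted deleted entry wins.
   */
  static SortedMap<Long, Long> mergeBlocks(Map<Long, Long> onDisk,
      Map<Long, Long> deleted) {
    SortedMap<Long, Long> merged = new TreeMap<>(onDisk);
    merged.putAll(deleted);
    return merged;
  }

  /** Container-level checksum over (blockId, blockChecksum) pairs in ID order. */
  static long containerChecksum(SortedMap<Long, Long> blocks) {
    CRC32 crc = new CRC32();
    ByteBuffer buf = ByteBuffer.allocate(2 * Long.BYTES);
    for (Map.Entry<Long, Long> e : blocks.entrySet()) {
      buf.clear();
      buf.putLong(e.getKey()).putLong(e.getValue());
      crc.update(buf.array(), 0, buf.position());
    }
    return crc.getValue();
  }
}
```

With this shape, a container holding blocks 1 and 2 and a container where block 2 has moved to the deleted list produce the same merged map and therefore the same checksum.
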
There are a few parts to this change:
* Change the persisted list of deleted blocks from a list of IDs to a list of
{{BlockMerkleTree}} protos so that we also have the checksum there to
reference, instead of trying to find it in the previous tree.
** The block checksum used here should be built from the chunk checksums in
RocksDB, not from the block data itself. Otherwise it might diverge between
replicas when a corrupted block is deleted.
** We can clear out the chunk children from the {{BlockMerkleTree}} before
adding it to this list to save space. They will never be read.
* When reconciling, if a peer has marked a block as deleted and we have not,
but we also don't have the block, we need to add that block and its checksum to
our deleted block list.
** This allows checksums to converge for containers that had blocks deleted
before upgrading to reconciliation.
* When reconciling, if we have a checksum mismatch with a peer's deleted
block, log a warning.
** If our copy is still present but corrupted, we expect it to be deleted
eventually, and that will resolve the mismatch.
** If our copy is already deleted and the checksum doesn't match, we cannot
resolve the issue.
*** If the checksum recorded at the time of delete doesn't match, it means the
checksum in RocksDB didn't match either, which is already unrecoverable.
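
The reconciliation rules above could be sketched roughly as follows. This is an illustration under assumptions, not the actual Ozone reconciliation code: {{DeletedBlockReconcileSketch}}, {{Action}}, {{blockChecksum}} and {{onPeerDeletedBlock}} are hypothetical names, blocks are modeled as plain maps of block ID to block checksum, and CRC32 stands in for the real checksum. The deleted-block entry keeps only the block-level checksum derived from the chunk checksums read from the DB, and the chunk children are dropped.

```java
import java.nio.ByteBuffer;
import java.util.List;
import java.util.Map;
import java.util.zip.CRC32;

// Hypothetical sketch only; not Ozone's protos or reconciliation code.
final class DeletedBlockReconcileSketch {
  enum Action { NONE, ADD_TO_DELETED_LIST, WARN_MISMATCH }

  private DeletedBlockReconcileSketch() { }

  /**
   * Block checksum derived only from the chunk checksums recorded in the DB,
   * never from the block data on disk. Only this value is kept for a
   * deleted block; the per-chunk children are not stored.
   */
  static long blockChecksum(List<Long> chunkChecksumsFromDb) {
    CRC32 crc = new CRC32();
    ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
    for (long chunkChecksum : chunkChecksumsFromDb) {
      buf.clear();
      buf.putLong(chunkChecksum);
      crc.update(buf.array(), 0, Long.BYTES);
    }
    return crc.getValue();
  }

  /** Decides what to do when a peer reports a block as deleted. */
  static Action onPeerDeletedBlock(long blockId, long peerChecksum,
      Map<Long, Long> ourLiveBlocks, Map<Long, Long> ourDeletedBlocks) {
    Long ours = ourDeletedBlocks.get(blockId);
    if (ours == null) {
      ours = ourLiveBlocks.get(blockId);
    }
    if (ours == null) {
      // We never recorded the delete and no longer have the block:
      // adopt the peer's entry so the checksums can converge.
      ourDeletedBlocks.put(blockId, peerChecksum);
      return Action.ADD_TO_DELETED_LIST;
    }
    // A mismatch is only logged. A live corrupted copy is expected to be
    // deleted later; a deleted copy with a different checksum is unrecoverable.
    return ours == peerChecksum ? Action.NONE : Action.WARN_MISMATCH;
  }
}
```

In this sketch a mismatch never modifies local state. It only signals that a warning should be logged, which matches the cases listed above.
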
--
This message was sent by Atlassian Jira
(v8.20.10#820010)