Yashaswini G A created HDDS-14936:
-------------------------------------
Summary: Container data checksum reverted by
BackgroundContainerDataScanner after successful reconciliation
Key: HDDS-14936
URL: https://issues.apache.org/jira/browse/HDDS-14936
Project: Apache Ozone
Issue Type: Bug
Components: Ozone Datanode
Reporter: Yashaswini G A
After corrupting a different block file on each of the three Ratis replicas of
a CLOSED container, {{ozone admin container reconcile}} was run. Reconciliation
against peers completed on one datanode, which reported corrupt chunks repaired
and an updated data checksum (76950a80 -> 914f24e4). Shortly afterward,
BackgroundContainerDataScanner on the same datanode logged CORRUPT_CHUNK for a
.block file and an OzoneChecksumException on read, then updated the container
data checksum again (914f24e4 -> 76950a80). Later scans flipped the checksum
again. {{ozone admin container reconcile --status}} showed replicasMatch=false,
with one replica still on the older checksum.
h2. Steps to reproduce (high level)
# Close the container, corrupt a different block file on each of the three
replicas, and note the per-replica dataChecksum (three-way mismatch after
corruption).
# Run {{ozone admin container reconcile <containerID>}}.
# Observe the DN logs: ReconcileContainerTask / KeyValueHandler report a
successful reconcile and a checksum update.
# Within seconds to minutes, observe ContainerDataScanner logs on the same DN:
CORRUPT_CHUNK, and the checksum updated in the opposite direction.
# Optionally poll {{ozone admin container reconcile <containerID> --status}} and
observe replicasMatch=false and lingering checksum divergence on one replica.
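
When collecting evidence from the DN logs, it can help to record each checksum update as a small timeline and flag the scanner updates that undo a reconcile. The following is only a minimal sketch: the {{ChecksumUpdate}} type, the "reconcile"/"scanner" labels and the {{find_reverts}} helper are hypothetical names for illustration, not part of Ozone, and the entries are typed in by hand from the two transitions reported above rather than parsed from any real log format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecksumUpdate:
    """One observed container data checksum change (hypothetical record type)."""
    source: str  # hypothetical label: "reconcile" or "scanner"
    old: str     # checksum before the update
    new: str     # checksum after the update


def find_reverts(updates):
    """Return scanner updates that restore a value an earlier reconcile replaced."""
    replaced_by_reconcile = set()
    reverts = []
    for update in updates:
        if update.source == "reconcile":
            # Remember the pre-reconcile value so a later return to it is flagged.
            replaced_by_reconcile.add(update.old)
        elif update.source == "scanner" and update.new in replaced_by_reconcile:
            reverts.append(update)
    return reverts


# The two transitions reported in this issue, entered manually.
timeline = [
    ChecksumUpdate("reconcile", "76950a80", "914f24e4"),
    ChecksumUpdate("scanner", "914f24e4", "76950a80"),
]
reverts = find_reverts(timeline)
for update in reverts:
    print(f"scanner reverted reconcile: {update.old} -> {update.new}")
```

With the expected behavior, the list returned for a replica's timeline would be empty after a successful reconcile.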
h2. Expected behavior
After a successful reconcile that reports corrupt chunks repaired and a stable
data checksum aligned with peers, the background data scan should not report
the same chunk as corrupt and should not revert the container data checksum
unless there is a documented second source of truth.
h2. Actual behavior
Reconcile reports DONE and a checksum aligned to peers;
BackgroundContainerDataScanner then reports CORRUPT_CHUNK and updates the data
checksum away from the post-reconcile value.