errose28 commented on code in PR #4337:
URL: https://github.com/apache/ozone/pull/4337#discussion_r1134857707
##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainerIntegrityChecks.java:
##########
@@ -75,6 +75,7 @@ public class TestKeyValueContainerIntegrityChecks {
public TestKeyValueContainerIntegrityChecks(
ContainerTestVersionInfo versionInfo) {
+ LOG.info("new {} for {}", getClass().getSimpleName(), versionInfo);
Review Comment:
Is this supposed to be committed or was this added while debugging?
##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainerMetadataInspector.java:
##########
@@ -172,7 +269,8 @@ public void inspectThenRepairOnCorrectContainer(
*/
public void inspectThenRepairOnIncorrectContainer(
KeyValueContainerData containerData, int createdBlocks, int setBlocks,
- int setBytes) throws Exception {
+ int setBytes, int deleteCount, long numDeletedLocalIds,
+ boolean shouldRepair) throws Exception {
Review Comment:
New parameters should be added to javadoc
##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainerMetadataInspector.java:
##########
@@ -392,6 +412,38 @@ private boolean checkAndRepair(JsonObject parent,
errors.add(usedBytesError);
}
+ // check and repair if db delete count mismatches delete transaction count.
+ final JsonElement dbDeleteCountJson = dBMetadata.get(
+ OzoneConsts.PENDING_DELETE_BLOCK_COUNT);
+ final long dbDeleteCount = jsonToLong(dbDeleteCountJson);
+ final JsonElement deleteTxCountJson = aggregates.get(PendingDelete.COUNT);
+ final long deleteTransactionCount = jsonToLong(deleteTxCountJson);
+ if (dbDeleteCount != deleteTransactionCount) {
+ passed = false;
+
+ final BooleanSupplier deleteCountRepairAction = () -> {
+ if (deleteTransactionCount == 0) {
+ // repair only when delete transaction count is 0
Review Comment:
Why only repair when the count is zero? This will give confusing output when
the inspector is run in repair mode because it will flag the mismatched value
in all cases but only repair it in some of them.
##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainerMetadataInspector.java:
##########
@@ -212,6 +310,7 @@ public void inspectThenRepairOnIncorrectContainer(
private void checkJsonReportForIncorrectContainer(JsonObject inspectJson,
String expectedContainerState, long createdBlocks,
long setBlocks, long createdBytes, long setBytes, long createdFiles,
+ long deleteCount, long deleteTransactions,
Review Comment:
This would be easier to follow if we use the same naming convention as the
other parameters, where artificially set counter values are prefixed by `set`
and the number of individual entries are prefixed by `created`.
`deleteCount` -> `setPendingDeleteCount`
`deleteTransactions` -> `createdPendingDeleteCount`
##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainerMetadataInspector.java:
##########
@@ -172,7 +269,8 @@ public void inspectThenRepairOnCorrectContainer(
*/
public void inspectThenRepairOnIncorrectContainer(
KeyValueContainerData containerData, int createdBlocks, int setBlocks,
- int setBytes) throws Exception {
+ int setBytes, int deleteCount, long numDeletedLocalIds,
+ boolean shouldRepair) throws Exception {
Review Comment:
Also the `shouldRepair` parameter is unnecessary. This method always tests
both inspect and repair mode. Passing this as false will run the same test
twice.
##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainerMetadataInspector.java:
##########
@@ -243,11 +342,24 @@ private void
checkJsonReportForIncorrectContainer(JsonObject inspectJson,
// Check errors.
checkJsonErrorsReport(inspectJson, "dBMetadata.#BLOCKCOUNT",
- new JsonPrimitive(createdBlocks), new JsonPrimitive(setBlocks),
- shouldRepair);
+ createdBlocks, setBlocks, shouldRepair);
checkJsonErrorsReport(inspectJson, "dBMetadata.#BYTESUSED",
- new JsonPrimitive(createdBytes), new JsonPrimitive(setBytes),
- shouldRepair);
+ createdBytes, setBytes, shouldRepair);
+ checkJsonErrorsReport(inspectJson, "dBMetadata.#PENDINGDELETEBLOCKCOUNT",
+ deleteCount, deleteTransactions, shouldRepair);
+ }
+
+ private void checkJsonErrorsReport(
+ JsonObject jsonReport, String propertyValue,
+ long correctExpected, long correctActual,
+ boolean correctRepair) {
+ if (correctExpected == correctActual) {
+ return;
+ }
Review Comment:
Also this is just comparing the two expected values. The two actual values
in the report may be different than this and the test would not catch that.
##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainerMetadataInspector.java:
##########
@@ -392,6 +412,38 @@ private boolean checkAndRepair(JsonObject parent,
errors.add(usedBytesError);
}
+ // check and repair if db delete count mismatches delete transaction count.
+ final JsonElement dbDeleteCountJson = dBMetadata.get(
+ OzoneConsts.PENDING_DELETE_BLOCK_COUNT);
+ final long dbDeleteCount = jsonToLong(dbDeleteCountJson);
+ final JsonElement deleteTxCountJson = aggregates.get(PendingDelete.COUNT);
+ final long deleteTransactionCount = jsonToLong(deleteTxCountJson);
+ if (dbDeleteCount != deleteTransactionCount) {
+ passed = false;
+
+ final BooleanSupplier deleteCountRepairAction = () -> {
+ if (deleteTransactionCount == 0) {
+ // repair only when delete transaction count is 0
+ final String key = containerData.getPendingDeleteBlockCountKey();
+ try {
+ // reset delete block count to 0 in metadata table
+ metadataTable.put(key, 0L);
+ return true;
+ } catch (IOException ex) {
+ LOG.error("Failed to reset {} for container {}.",
+ key, containerData.getContainerID(), ex);
+ }
+ }
+ return false;
+ };
+
+ final JsonObject deleteCountError = buildErrorAndRepair(
+ "dBMetadata." + OzoneConsts.PENDING_DELETE_BLOCK_COUNT,
+ dbDeleteCountJson, deleteTxCountJson,
Review Comment:
Can you look into why the test did not catch this?
##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainerMetadataInspector.java:
##########
@@ -290,11 +402,53 @@ private void checkJsonErrorsReport(JsonObject jsonReport,
public void setDBBlockAndByteCounts(KeyValueContainerData containerData,
long blockCount, long byteCount) throws Exception {
+ setDB(containerData, blockCount, byteCount,
+ 0, Collections.emptyList());
+ }
+
+ public void setDB(KeyValueContainerData containerData,
+ long blockCount, long byteCount,
+ long dbDeleteCount, List<DeletedBlocksTransaction> deleteTransactions)
+ throws Exception {
try (DBHandle db = BlockUtils.getDB(containerData, getConf())) {
Table<String, Long> metadataTable = db.getStore().getMetadataTable();
// Don't care about in memory state. Just change the DB values.
metadataTable.put(containerData.getBlockCountKey(), blockCount);
metadataTable.put(containerData.getBytesUsedKey(), byteCount);
+ metadataTable.put(containerData.getPendingDeleteBlockCountKey(),
+ dbDeleteCount);
+
+ final DatanodeStore store = db.getStore();
+ LOG.info("store {}", store.getClass().getSimpleName());
+ if (store instanceof DatanodeStoreSchemaTwoImpl) {
Review Comment:
Why do we need a split between schema v2 and v3 here? Looks like they do the
same thing. `containerData.getDeleteTxnKey(t.getTxID())` should work for both
schemas.
##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainerMetadataInspector.java:
##########
@@ -392,6 +412,38 @@ private boolean checkAndRepair(JsonObject parent,
errors.add(usedBytesError);
}
+ // check and repair if db delete count mismatches delete transaction count.
+ final JsonElement dbDeleteCountJson = dBMetadata.get(
+ OzoneConsts.PENDING_DELETE_BLOCK_COUNT);
+ final long dbDeleteCount = jsonToLong(dbDeleteCountJson);
+ final JsonElement deleteTxCountJson = aggregates.get(PendingDelete.COUNT);
+ final long deleteTransactionCount = jsonToLong(deleteTxCountJson);
+ if (dbDeleteCount != deleteTransactionCount) {
+ passed = false;
+
+ final BooleanSupplier deleteCountRepairAction = () -> {
+ if (deleteTransactionCount == 0) {
+ // repair only when delete transaction count is 0
+ final String key = containerData.getPendingDeleteBlockCountKey();
+ try {
+ // reset delete block count to 0 in metadata table
+ metadataTable.put(key, 0L);
+ return true;
+ } catch (IOException ex) {
+ LOG.error("Failed to reset {} for container {}.",
+ key, containerData.getContainerID(), ex);
+ }
+ }
+ return false;
+ };
+
+ final JsonObject deleteCountError = buildErrorAndRepair(
+ "dBMetadata." + OzoneConsts.PENDING_DELETE_BLOCK_COUNT,
+ dbDeleteCountJson, deleteTxCountJson,
Review Comment:
These parameters are in the wrong order. It should be `expected` (the
aggregate value obtained by iterating the DB) then `actual` (the stored value
in the counter), like a junit `assert` call. This would be easier to see by
keeping the naming conventions used for other fields for these ones as well:
`dbDeleteCountJson` -> `pendingDeleteCountDB`
`deleteTxCountJson` -> `pendingDeleteCountAggregate`
Where "aggregate" implies it was obtained by aggregating individual entries
in RocksDB and is therefor the expected value of the counter.
##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainerMetadataInspector.java:
##########
@@ -243,11 +342,24 @@ private void
checkJsonReportForIncorrectContainer(JsonObject inspectJson,
// Check errors.
checkJsonErrorsReport(inspectJson, "dBMetadata.#BLOCKCOUNT",
- new JsonPrimitive(createdBlocks), new JsonPrimitive(setBlocks),
- shouldRepair);
+ createdBlocks, setBlocks, shouldRepair);
checkJsonErrorsReport(inspectJson, "dBMetadata.#BYTESUSED",
- new JsonPrimitive(createdBytes), new JsonPrimitive(setBytes),
- shouldRepair);
+ createdBytes, setBytes, shouldRepair);
+ checkJsonErrorsReport(inspectJson, "dBMetadata.#PENDINGDELETEBLOCKCOUNT",
+ deleteCount, deleteTransactions, shouldRepair);
+ }
+
+ private void checkJsonErrorsReport(
+ JsonObject jsonReport, String propertyValue,
+ long correctExpected, long correctActual,
+ boolean correctRepair) {
+ if (correctExpected == correctActual) {
+ return;
+ }
Review Comment:
This method is to check the `errors` array in the json report. This case
would not be hit for anything showing up in the `errors` array, because it
indicates there is no error. I think we can remove this wrapper method.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]