[ 
https://issues.apache.org/jira/browse/HDDS-15204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDDS-15204:
-----------------------------------
    Description: 
h1. Problem Statement

The Ozone quota repair tool (ozone repair om quota) considers only the Active 
Object Store (AOS). It does not update pendingDeleteSnapshotBytes or 
pendingDeleteSnapshotNamespace.

The client-side quota usedBytes is computed as server-side 
usedBytes + pendingDeleteSnapshotBytes, and usedNamespace is computed as 
server-side usedNamespace + pendingDeleteSnapshotNamespace. As a result, 
running the quota repair tool against a cluster with Ozone Snapshots may not 
fix all quota calculation problems.
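
To make the arithmetic concrete, here is a minimal sketch of the two sums described above. The class and method names are hypothetical and are not part of Ozone; only the four counter names come from this issue.

```java
// Hypothetical helper, not an Ozone class: it only restates the two sums
// from the problem statement.
final class ClientQuotaViewSketch {

    // Client-visible usedBytes = server-side usedBytes + pendingDeleteSnapshotBytes.
    static long clientUsedBytes(long usedBytes, long pendingDeleteSnapshotBytes) {
        return usedBytes + pendingDeleteSnapshotBytes;
    }

    // Client-visible usedNamespace = server-side usedNamespace + pendingDeleteSnapshotNamespace.
    static long clientUsedNamespace(long usedNamespace, long pendingDeleteSnapshotNamespace) {
        return usedNamespace + pendingDeleteSnapshotNamespace;
    }
}
```

With made-up numbers: if the correct values are usedBytes = 100 and pendingDeleteSnapshotBytes = 50, but the stored values have drifted to 130 and 70, repairing only usedBytes leaves the client seeing 100 + 70 = 170 rather than 150.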
h1. Proposed Implementation Strategy

You can extend the existing QuotaRepairTask.java by adding two new tasks to its 
executor:

// Example of how to extend repairCount in QuotaRepairTask.java
tasks.add(executor.submit(() -> recalculateSnapshotUsages(
    metadataManager.getDeletedTable(),
    snapshotUsedMap, "Snapshot Key usages")));

tasks.add(executor.submit(() -> recalculateSnapshotUsages(
    metadataManager.getDeletedDirectoryTable(),
    snapshotDirUsedMap, "Snapshot Directory usages")));
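
The fan-out pattern in the snippet above can be tried in isolation with plain JDK classes. The following is only a sketch: the class name, the string table names, and the placeholder scan body are hypothetical, and the real QuotaRepairTask may manage its executor and maps differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class RepairTaskFanOutSketch {

    // Hypothetical placeholder for a per-table scan; it only records that the scan ran.
    static void recalculateSnapshotUsages(String tableName, Map<String, Long> usedMap) {
        usedMap.merge(tableName, 1L, Long::sum);
    }

    // Submits one scan per table, waits for both, and returns the combined results.
    static Map<String, Long> runAll() {
        Map<String, Long> snapshotUsedMap = new ConcurrentHashMap<>();
        Map<String, Long> snapshotDirUsedMap = new ConcurrentHashMap<>();
        ExecutorService executor = Executors.newFixedThreadPool(2);
        List<Future<?>> tasks = new ArrayList<>();
        try {
            tasks.add(executor.submit(
                () -> recalculateSnapshotUsages("deletedTable", snapshotUsedMap)));
            tasks.add(executor.submit(
                () -> recalculateSnapshotUsages("deletedDirectoryTable", snapshotDirUsedMap)));
            for (Future<?> task : tasks) {
                task.get(); // blocks until the scan finishes; rethrows scan failures
            }
        } catch (Exception e) {
            throw new RuntimeException("snapshot usage scan failed", e);
        } finally {
            executor.shutdown();
        }
        Map<String, Long> merged = new ConcurrentHashMap<>(snapshotUsedMap);
        merged.putAll(snapshotDirUsedMap);
        return merged;
    }
}
```

Waiting on every Future before reading the maps ensures that both scans have completed (or failed loudly) before any totals are used.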

Detailed recalculateSnapshotUsages logic:
1. Iterate through deletedTable (and, for directory entries, 
deletedDirectoryTable).
2. Extract Bucket ID: For deletedTable, the value is RepeatedOmKeyInfo, which 
has getBucketId().
3. Match: If the bucketId matches the bucket being repaired:
 * snapshotUsedBytes += repeatedKeyInfo.getTotalSize().getRight(); (replicated 
size)
 * snapshotUsedNamespace += repeatedKeyInfo.getOmKeyInfoList().size();
4. Submit Diffs: As with usedBytes, send these totals back to the OM via the 
QuotaRepairRequest to update the OmBucketInfo in the BucketTable.
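
The steps above can be sketched with in-memory stand-ins. Everything in this block is hypothetical scaffolding: DeletedEntry stands in for a deletedTable value (RepeatedOmKeyInfo in the issue), its list of sizes stands in for the replicated sizes of the keys it holds, and no real Ozone API is called. It assumes totals are grouped per bucketId, so the filter in step 3 becomes a map lookup.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SnapshotUsageSketch {

    /** Hypothetical stand-in for one deletedTable value. */
    static final class DeletedEntry {
        final long bucketId;
        final List<Long> replicatedSizes; // one replicated size per key in the entry

        DeletedEntry(long bucketId, List<Long> replicatedSizes) {
            this.bucketId = bucketId;
            this.replicatedSizes = replicatedSizes;
        }
    }

    /** Recomputed pending-delete counters for one bucket. */
    static final class Usage {
        long bytes;
        long namespace;
    }

    /** Steps 1 to 3: scan every entry and accumulate totals per bucketId. */
    static Map<Long, Usage> recalculate(Iterable<DeletedEntry> deletedTable) {
        Map<Long, Usage> usageByBucket = new HashMap<>();
        for (DeletedEntry entry : deletedTable) {
            Usage usage = usageByBucket.computeIfAbsent(entry.bucketId, id -> new Usage());
            for (long size : entry.replicatedSizes) {
                usage.bytes += size;
            }
            usage.namespace += entry.replicatedSizes.size();
        }
        return usageByBucket;
    }

    /** Step 4: the correction to submit is the recomputed total minus the stored one. */
    static long diff(long recomputed, long stored) {
        return recomputed - stored;
    }
}
```

A negative diff means the stored counter overstates the pending-delete usage; a positive one means it understates it.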

Summary of Source of Truth
┌────────────────────────────────┬──────────────────────────────────────┬───────────────────┐
│ Field                          │ Source Table in OM DB                │ Filter            │
├────────────────────────────────┼──────────────────────────────────────┼───────────────────┤
│ pendingDeleteSnapshotBytes     │ deletedTable                         │ Match by bucketId │
│ pendingDeleteSnapshotNamespace │ deletedTable + deletedDirectoryTable │ Match by bucketId │
└────────────────────────────────┴──────────────────────────────────────┴───────────────────┘

  was:
h1. Problem Statement

The Ozone quota repair tool (ozone repair om quota) considers only the Active 
Object Store (AOS). It does not update pendingDeleteSnapshotBytes or 
pendingDeleteSnapshotNamespace.

The client-side quota usedBytes is computed as server-side 
usedBytes + pendingDeleteSnapshotBytes, and usedNamespace is computed as 
server-side usedNamespace + pendingDeleteSnapshotNamespace. As a result, 
running the quota repair tool against a cluster with Ozone Snapshots may not 
fix all quota calculation problems.
h1. Proposed Implementation Strategy

You can extend the existing QuotaRepairTask.java by adding two new tasks to its 
executor:

// Example of how to extend repairCount in QuotaRepairTask.java
tasks.add(executor.submit(() -> recalculateSnapshotUsages(
    metadataManager.getDeletedTable(),
    snapshotUsedMap, "Snapshot Key usages")));

tasks.add(executor.submit(() -> recalculateSnapshotUsages(
    metadataManager.getDeletedDirectoryTable(),
    snapshotDirUsedMap, "Snapshot Directory usages")));

Detailed recalculateSnapshotUsages logic:
1. Iterate through deletedTable (and, for directory entries, 
deletedDirectoryTable).
2. Extract Bucket ID: For deletedTable, the value is RepeatedOmKeyInfo, which 
has getBucketId().
3. Match: If the bucketId matches the bucket being repaired:
 * snapshotUsedBytes += repeatedKeyInfo.getTotalSize().getRight(); (replicated 
size)
 * snapshotUsedNamespace += repeatedKeyInfo.getOmKeyInfoList().size();
4. Submit Diffs: As with usedBytes, send these totals back to the OM via the 
QuotaRepairRequest to update the OmBucketInfo in the BucketTable.

Summary of Source of Truth
┌────────────────────────────────┬──────────────────────────────────────┬───────────────────┐
│ Field                          │ Source Table in OM DB                │ Filter            │
├────────────────────────────────┼──────────────────────────────────────┼───────────────────┤
│ pendingDeleteSnapshotBytes     │ deletedTable                         │ Match by bucketId │
│ pendingDeleteSnapshotNamespace │ deletedTable + deletedDirectoryTable │ Match by bucketId │
└────────────────────────────────┴──────────────────────────────────────┴───────────────────┘

Note: This recomputation assumes the active deletedTable is the primary record. 
For absolute precision in complex edge cases (like nested snapshot
deletions), one might need to scan the SnapshotInfoTable and verify keys within 
individual snapshot checkpoints, but scanning the active deletedTable
covers the standard operational requirements for quota repair.


> Ozone quota repair tool to support Ozone Snapshot
> -------------------------------------------------
>
>                 Key: HDDS-15204
>                 URL: https://issues.apache.org/jira/browse/HDDS-15204
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Wei-Chiu Chuang
>            Priority: Major


