[ 
https://issues.apache.org/jira/browse/HDDS-15650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDDS-15650.
------------------------------------
    Fix Version/s: 2.3.0
       Resolution: Fixed

> Fix snapshotUsedNamespace underflow when FSO directory is deleted and purged
> ----------------------------------------------------------------------------
>
>                 Key: HDDS-15650
>                 URL: https://issues.apache.org/jira/browse/HDDS-15650
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Ryan Blough
>            Assignee: Wei-Chiu Chuang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.3.0
>
>
> Ryan posted in HDDS-14435:
>  
> I found what look like some problems that are actually on the 
> snapshotUsedNamespace side. There appear to be two problem areas:
>  1. In OMKeyDeleteRequestWithFSO.java, it looks like we put a tombstone on the 
> directory, but the corresponding snapshotUsedNamespace update is skipped 
> by the empty-key check:
> {code:java}
>       long quotaReleased = sumBlockLengths(omKeyInfo);
>       // Empty entries won't be added to deleted table so this key shouldn't
>       // get added to snapshotUsed space.
>       boolean isKeyNonEmpty = !OmKeyInfo.isKeyEmpty(omKeyInfo);
>       omBucketInfo.decrUsedBytes(quotaReleased, isKeyNonEmpty);
> ->    omBucketInfo.decrUsedNamespace(1L, isKeyNonEmpty);{code}
> [https://github.com/apache/ozone/blob/cb29f193ea1d1945a8b908fbf8ee1a39bc00e5e4/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyDeleteRequestWithFSO.java#L161-L165]
> This is the only namespace update in the file, so it looks to me like any 
> time a user deletes a directory, the deletion isn't reflected in 
> snapshotUsedNamespace, because directories don't have blocks.
> 2. In OMDirectoriesPurgeRequestWithFSO.java, it looks like we always 
> decrement snapshotUsedNamespace on purge, with no attendant checks of whether 
> there even is a snapshot:
> {code:java}
>         if (path.hasDeletedDir()) {
>           deletedDirNames.add(path.getDeletedDir());
>           BucketNameInfo bucketNameInfo = volumeBucketIdMap.get(
>               new VolumeBucketId(path.getVolumeId(), path.getBucketId()));
>           OmBucketInfo omBucketInfo = getBucketInfo(omMetadataManager,
>               bucketNameInfo.getVolumeName(), bucketNameInfo.getBucketName());
>           if (omBucketInfo != null && omBucketInfo.getObjectID() == path.getBucketId()) {
> -->         omBucketInfo.purgeSnapshotUsedNamespace(1);
>             volBucketInfoMap.put(
>                 Pair.of(omBucketInfo.getVolumeName(), omBucketInfo.getBucketName()),
>                 omBucketInfo);
>           }
>           }
>           numDirsDeleted++;
>         } {code}
> [https://github.com/apache/ozone/blob/cb29f193ea1d1945a8b908fbf8ee1a39bc00e5e4/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMDirectoriesPurgeRequestWithFSO.java#L202]
> As a consequence, it's common for snapshotUsedNamespace to be hugely negative 
> on buckets with no snapshots at all. For example, I know of a large cluster 
> with ~200 DNs where 68 buckets have negative snapshotUsedNamespace values, 
> even though only 6 specific buckets in the cluster have snapshots.
> These wrong snapshotUsedNamespace counts are reflected in the bucket info 
> output, because the value it displays is the sum of AOS usedNamespace + 
> snapshotUsedNamespace. So this problem wrecks namespace quotas even on FSO 
> buckets without snapshots:
> {code:java}
> public long getTotalBucketNamespace() {
>   return usedNamespace + snapshotUsedNamespace;
> } {code}
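
The accounting described in the report can be traced with a small standalone model. This is only a sketch under stated assumptions: BucketCounters and NamespaceUnderflowSketch are hypothetical names, not Ozone classes, and the behavior of decrUsedNamespace(long, boolean) is inferred from the report's description (the entry is moved into snapshotUsedNamespace only when the flag is true) rather than copied from the Ozone source.

```java
/** Hypothetical stand-in for the two namespace counters on a bucket; not the real OmBucketInfo. */
final class BucketCounters {
  private long usedNamespace;
  private long snapshotUsedNamespace;

  void incrUsedNamespace(long n) {
    usedNamespace += n;
  }

  // Assumed semantics, per the report: the entry is counted toward
  // snapshotUsedNamespace only when the flag is true.
  void decrUsedNamespace(long n, boolean movedToDeletedTable) {
    usedNamespace -= n;
    if (movedToDeletedTable) {
      snapshotUsedNamespace += n;
    }
  }

  // Unconditional decrement, as in the quoted purge path.
  void purgeSnapshotUsedNamespace(long n) {
    snapshotUsedNamespace -= n;
  }

  long getUsedNamespace() {
    return usedNamespace;
  }

  long getSnapshotUsedNamespace() {
    return snapshotUsedNamespace;
  }

  // Same sum as the quoted getTotalBucketNamespace().
  long getTotalBucketNamespace() {
    return usedNamespace + snapshotUsedNamespace;
  }
}

final class NamespaceUnderflowSketch {
  /** Creates one directory, deletes it, then purges it, on a bucket with no snapshots. */
  static BucketCounters deleteAndPurgeDirectory() {
    BucketCounters bucket = new BucketCounters();
    bucket.incrUsedNamespace(1L);                 // directory created

    boolean isKeyNonEmpty = false;                // directories have no blocks
    bucket.decrUsedNamespace(1L, isKeyNonEmpty);  // delete: snapshot counter not incremented

    bucket.purgeSnapshotUsedNamespace(1L);        // purge: snapshot counter decremented anyway
    return bucket;
  }

  public static void main(String[] args) {
    BucketCounters bucket = deleteAndPurgeDirectory();
    System.out.println("usedNamespace=" + bucket.getUsedNamespace());                 // 0
    System.out.println("snapshotUsedNamespace=" + bucket.getSnapshotUsedNamespace()); // -1
    System.out.println("total=" + bucket.getTotalBucketNamespace());                  // -1
  }
}
```

In this model, each directory delete-and-purge cycle drives snapshotUsedNamespace down by one, so the reported total drifts negative in proportion to the number of directories purged, whether or not the bucket has snapshots.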



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
