Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2026-04-08 Thread via GitHub


sadanand48 commented on PR #9079:
URL: https://github.com/apache/ozone/pull/9079#issuecomment-4204910760

   @jojochuang This is not needed now, already handled in other jiras. Closing


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2026-04-08 Thread via GitHub


sadanand48 closed pull request #9079: HDDS-13007. Snapshot diff computation 
changes after snapshots have been defragged
URL: https://github.com/apache/ozone/pull/9079


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2026-04-07 Thread via GitHub


Copilot commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r3049102758


##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1180,6 +1199,37 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 // TODO: [SNAPSHOT] Refactor the parameter list
 Optional> deltaFiles = Optional.empty();
 
+OmSnapshotLocalData fromSnapshotLocalData = getSnapshotLocalData(fsInfo);
+int fromSnapshotVersion = fromSnapshotLocalData.getVersion();
+OmSnapshotLocalData toSnapshotLocalData = getSnapshotLocalData(tsInfo);
+int toSnapshotVersion = toSnapshotLocalData.getVersion();
+
+try {
+  if (fromSnapshotVersion > 0 && toSnapshotVersion > 0) {
+// both snapshots are defragmented, To calculate snap-diff, we can 
simply compare the
+// SST files contained in OmSnapshotLocalData instances of both these 
and get the delta files
+OmSnapshotLocalData.VersionMeta toSnapVersionMeta =
+
toSnapshotLocalData.getVersionSstFileInfos().get(toSnapshotVersion);
+// get the source snapshot SST files (versionMeta)  corresponding to 
target snapshot
+OmSnapshotLocalData.VersionMeta fromSnapVersionMeta =
+resolveBaseVersionMeta(toSnapshotLocalData, 
fromSnapshot.getSnapshotID());
+// Calculate diff files using helper method
+if (toSnapVersionMeta == null) {
+  String errMsg =
+  "Cannot find corresponding version of from snapshot " + 
fromSnapshotVersion + " from " + tsInfo;
+  LOG.error(errMsg);
+  throw new IOException(errMsg);
+}
+List diffFiles = calculateDiffFiles(fromSnapVersionMeta, 
toSnapVersionMeta);
+return OmSnapshotUtils.getSSTDiffListWithFullPath(diffFiles,
+
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(), 
fsInfo).toString(),
+
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(), 
tsInfo).toString(), diffDir);

Review Comment:
   `getDeltaFiles` now unconditionally reads snapshot-local YAML via 
`getSnapshotLocalData(fsInfo/tsInfo)` before entering the try/catch and before 
checking `fsInfo/tsInfo` for null. If either SnapshotInfo is null (which this 
method already guards for later) or if YAML loading throws, the exception will 
propagate and prevent the intended fallback to DAG/full-diff paths. Consider 
moving these calls inside the try block and guarding them with `fsInfo != null 
&& tsInfo != null` so failures fall back cleanly.
   ```suggestion
   try {
 if (fsInfo != null && tsInfo != null) {
   OmSnapshotLocalData fromSnapshotLocalData = 
getSnapshotLocalData(fsInfo);
   int fromSnapshotVersion = fromSnapshotLocalData.getVersion();
   OmSnapshotLocalData toSnapshotLocalData = 
getSnapshotLocalData(tsInfo);
   int toSnapshotVersion = toSnapshotLocalData.getVersion();
   
   if (fromSnapshotVersion > 0 && toSnapshotVersion > 0) {
 // both snapshots are defragmented, To calculate snap-diff, we can 
simply compare the
 // SST files contained in OmSnapshotLocalData instances of both 
these and get the delta files
 OmSnapshotLocalData.VersionMeta toSnapVersionMeta =
 
toSnapshotLocalData.getVersionSstFileInfos().get(toSnapshotVersion);
 // get the source snapshot SST files (versionMeta)  corresponding 
to target snapshot
 OmSnapshotLocalData.VersionMeta fromSnapVersionMeta =
 resolveBaseVersionMeta(toSnapshotLocalData, 
fromSnapshot.getSnapshotID());
 // Calculate diff files using helper method
 if (toSnapVersionMeta == null) {
   String errMsg =
   "Cannot find corresponding version of from snapshot " + 
fromSnapshotVersion + " from " + tsInfo;
   LOG.error(errMsg);
   throw new IOException(errMsg);
 }
 List diffFiles = calculateDiffFiles(fromSnapVersionMeta, 
toSnapVersionMeta);
 return OmSnapshotUtils.getSSTDiffListWithFullPath(diffFiles,
 
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(), 
fsInfo).toString(),
 
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(), 
tsInfo).toString(), diffDir);
   }
   ```



##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 toSnapshot.getSnapshotTableKey()));
   }
 
+  /**
+   * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+   * that the given snapshot (toSnapshot) was built on.
+   * Traverses the snapshot chain backwards using prevSnapId.
+   *
+   * @param toSnapshot the target snapshot.
+   * @param fromSnapshotId UUID of the source snapshot.
+   * @return the r

[PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2026-03-30 Thread via GitHub


sadanand48 opened a new pull request, #9079:
URL: https://github.com/apache/ozone/pull/9079

   ## What changes were proposed in this pull request?
   If both snapshots are defragmented, the full diff will be used. If the 
snapshot versions don't match then traverse from the target snapshot to reach 
the source snapshot version from which it was built upon and determine the 
delta files.
   If any error occurs during this phase such as inability to find the sst 
files in the backup directory , it will fallback to existing methods.
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-13007
   
   ## How was this patch tested?
   Unit test


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-11-24 Thread via GitHub


SaketaChalamchala commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2558399573


##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 toSnapshot.getSnapshotTableKey()));
   }
 
+  /**
+   * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+   * that the given snapshot (toSnapshot) was built on.
+   * Traverses the snapshot chain backwards using prevSnapId.
+   *
+   * @param toSnapshot the target snapshot.
+   * @param fromSnapshotId UUID of the source snapshot.
+   * @return the resolved VersionMeta of the source that was used to build the 
target.
+   */
+  private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+  OmSnapshotLocalData toSnapshot,
+  UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+int numSnapshotsTraversed = 0;
+while (numSnapshotsTraversed <= maxSnapshotLimit && 
!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+  UUID parentId = child.getPreviousSnapshotId();
+  // Load the parent snapshot in the chain
+  child = getSnapshotLocalData(
+  getSnapshotInfo(ozoneManager,
+  metadataManager.getSnapshotChainManager(),
+  parentId));
+  numSnapshotsTraversed++;
+}
+if (numSnapshotsTraversed > maxSnapshotLimit) {
+  LOG.error("Exceeded the traversal limit of {} while finding the 
VersionMeta of the fromSnapshot : {}" +
+  " that the toSnapshot was built on.", maxSnapshotLimit, 
fromSnapshotId);
+  return null;
+}
+SnapshotInfo snapshotInfo =
+getSnapshotInfo(ozoneManager, 
metadataManager.getSnapshotChainManager(), fromSnapshotId);
+OmSnapshotLocalData fromSnapshot = getSnapshotLocalData(snapshotInfo);
+// Get the version that the child was built from
+OmSnapshotLocalData.VersionMeta childVersionMeta = 
child.getVersionSstFileInfos().get(child.getVersion());
+int versionUsedWhenBuildingChild = 
childVersionMeta.getPreviousSnapshotVersion();
+return 
fromSnapshot.getVersionSstFileInfos().get(versionUsedWhenBuildingChild);
+  }
+
+  /**
+   * Calculates the symmetric difference of SST files between two version 
metadata.
+   * Returns a list of SST file names that are present in one snapshot but not 
in the other,
+   * i.e., the symmetric difference between the two sets of SST files.
+   *
+   * @param fromSnapVersionMeta Version metadata from the first snapshot; 
provides the set of SST files for comparison.
+   * @param toSnapVersionMeta Version metadata from the second snapshot; 
provides the set of SST files for comparison.
+   * @return List of SST file names that are present in only one of the two 
snapshots (symmetric difference).
+   */
+  private List calculateDiffFiles(OmSnapshotLocalData.VersionMeta 
fromSnapVersionMeta,
+  OmSnapshotLocalData.VersionMeta 
toSnapVersionMeta) {
+// Extract SST file names from both snapshots
+Set fromSnapSstFileNames = 
fromSnapVersionMeta.getSstFiles().stream()
+.map(SstFileInfo::getFileName)
+.collect(Collectors.toSet());
+
+Set toSnapSstFileNames = toSnapVersionMeta.getSstFiles().stream()
+.map(SstFileInfo::getFileName)
+.collect(Collectors.toSet());
+
+List diffFiles = new ArrayList<>();
+
+// Files in fromSnapshot but NOT in toSnapshot
+diffFiles.addAll(fromSnapSstFileNames.stream()
+.filter(file -> !toSnapSstFileNames.contains(file))
+.collect(Collectors.toList()));

Review Comment:
   Why would we need this filter if `toSnap` is always built from `fromSnap`. 
Wouldn't `toSnap = fromSnap + diff`. In that case, `fromSnap` should never 
contain SST files that are not in `toSNap` making this filter unnecessary?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-11-24 Thread via GitHub


SaketaChalamchala commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2558384961


##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 toSnapshot.getSnapshotTableKey()));
   }
 
+  /**
+   * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+   * that the given snapshot (toSnapshot) was built on.
+   * Traverses the snapshot chain backwards using prevSnapId.
+   *
+   * @param toSnapshot the target snapshot.
+   * @param fromSnapshotId UUID of the source snapshot.
+   * @return the resolved VersionMeta of the source that was used to build the 
target.
+   */
+  private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+  OmSnapshotLocalData toSnapshot,
+  UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+int numSnapshotsTraversed = 0;
+while (numSnapshotsTraversed <= maxSnapshotLimit && 
!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+  UUID parentId = child.getPreviousSnapshotId();
+  // Load the parent snapshot in the chain
+  child = getSnapshotLocalData(
+  getSnapshotInfo(ozoneManager,
+  metadataManager.getSnapshotChainManager(),
+  parentId));
+  numSnapshotsTraversed++;
+}
+if (numSnapshotsTraversed > maxSnapshotLimit) {
+  LOG.error("Exceeded the traversal limit of {} while finding the 
VersionMeta of the fromSnapshot : {}" +
+  " that the toSnapshot was built on.", maxSnapshotLimit, 
fromSnapshotId);
+  return null;
+}
+SnapshotInfo snapshotInfo =
+getSnapshotInfo(ozoneManager, 
metadataManager.getSnapshotChainManager(), fromSnapshotId);
+OmSnapshotLocalData fromSnapshot = getSnapshotLocalData(snapshotInfo);
+// Get the version that the child was built from
+OmSnapshotLocalData.VersionMeta childVersionMeta = 
child.getVersionSstFileInfos().get(child.getVersion());
+int versionUsedWhenBuildingChild = 
childVersionMeta.getPreviousSnapshotVersion();

Review Comment:
   In this chain `fromSnapshot -> fs-1 ->  -> toSnapshot` doesn't this 
logic still return the VersionMeta of `fs-1`? Is there a test case covering 
this?
   
   Would it be possible to make the naming conventions in this loop more clear 
for readability? And would a `do..while` loop make more sense here? for ex.,
   ```
   parent = child = toSnapshotData;
   parentVer = childVer = toSnapshotData.getVersion();
   numSnapshotsTraversed = 0
   
   do {
 if (numSnapshotsTraversed >= maxSnapshotLimit) { // log error and return 
null}
 child = parent;
 childVer = parentVer;
 parent = getSnapshotLocalData(...child.getPreviousSnapID());
 parentVer = child.getVersionMeta(childVer).getPreviousSnapVersion();
 numSnapshotsTraversed++;
   } while (parent.getSnapshotId() != fromSnapshotId);
   
   return parent.getVersionMeta(parentVer);
   
   ```
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-11-24 Thread via GitHub


SaketaChalamchala commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2558384961


##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 toSnapshot.getSnapshotTableKey()));
   }
 
+  /**
+   * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+   * that the given snapshot (toSnapshot) was built on.
+   * Traverses the snapshot chain backwards using prevSnapId.
+   *
+   * @param toSnapshot the target snapshot.
+   * @param fromSnapshotId UUID of the source snapshot.
+   * @return the resolved VersionMeta of the source that was used to build the 
target.
+   */
+  private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+  OmSnapshotLocalData toSnapshot,
+  UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+int numSnapshotsTraversed = 0;
+while (numSnapshotsTraversed <= maxSnapshotLimit && 
!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+  UUID parentId = child.getPreviousSnapshotId();
+  // Load the parent snapshot in the chain
+  child = getSnapshotLocalData(
+  getSnapshotInfo(ozoneManager,
+  metadataManager.getSnapshotChainManager(),
+  parentId));
+  numSnapshotsTraversed++;
+}
+if (numSnapshotsTraversed > maxSnapshotLimit) {
+  LOG.error("Exceeded the traversal limit of {} while finding the 
VersionMeta of the fromSnapshot : {}" +
+  " that the toSnapshot was built on.", maxSnapshotLimit, 
fromSnapshotId);
+  return null;
+}
+SnapshotInfo snapshotInfo =
+getSnapshotInfo(ozoneManager, 
metadataManager.getSnapshotChainManager(), fromSnapshotId);
+OmSnapshotLocalData fromSnapshot = getSnapshotLocalData(snapshotInfo);
+// Get the version that the child was built from
+OmSnapshotLocalData.VersionMeta childVersionMeta = 
child.getVersionSstFileInfos().get(child.getVersion());
+int versionUsedWhenBuildingChild = 
childVersionMeta.getPreviousSnapshotVersion();

Review Comment:
   In this chain `fromSnapshot -> fs-1 ->  -> toSnapshot` doesn't this 
logic still return the VersionMeta of `fs-1`? Is there a test case covering 
this?
   
   Would it be possible to make the naming conventions in this loop more clear 
for readability? And would a `do..while` loop make more sense here? for ex.,
   ```
   parent = child = toSnapshotData 
   parentVer = childVer = toSnapshotData.getVersion()
   numSnapshotsTraversed = 0
   
   do {
 if (numSnapshotsTraversed >= maxSnapshotLimit) { // log error and return 
null}
 child = parent;
 childVer = parentVer;
 parent = getSnapshotLocalData(...child.getPreviousSnapID());
 parentVer = child.getVersionMeta(childVer).getPreviousSnapVersion();
 numSnapshotsTraversed++;
   } while (parent.getSnapshotId() != fromSnapshotId);
   
   return parent.getVersionMeta(parentVer);
   
   ```
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-11-24 Thread via GitHub


github-actions[bot] commented on PR #9079:
URL: https://github.com/apache/ozone/pull/9079#issuecomment-3573218106

   Thank you for your contribution. This PR is being closed due to inactivity. 
If needed, feel free to reopen it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-11-24 Thread via GitHub


github-actions[bot] closed pull request #9079: HDDS-13007. Snapshot diff 
computation changes after snapshots have been defragged
URL: https://github.com/apache/ozone/pull/9079


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-11-10 Thread via GitHub


github-actions[bot] commented on PR #9079:
URL: https://github.com/apache/ozone/pull/9079#issuecomment-3514416809

   This PR has been marked as stale due to 21 days of inactivity. Please 
comment or remove the stale label to keep it open. Otherwise, it will be 
automatically closed in 7 days.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-10-18 Thread via GitHub


swamirishi commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2413442520


##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 toSnapshot.getSnapshotTableKey()));
   }
 
+  /**
+   * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+   * that the given snapshot (toSnapshot) was built on.
+   * Traverses the snapshot chain backwards using prevSnapId.
+   *
+   * @param toSnapshot the target snapshot.
+   * @param fromSnapshotId UUID of the source snapshot.
+   * @return the resolved VersionMeta of the source that was used to build the 
target.
+   */
+  private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+  OmSnapshotLocalData toSnapshot,
+  UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+int numSnapshotsTraversed = 0;
+while (numSnapshotsTraversed <= maxSnapshotLimit && 
!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+  UUID parentId = child.getPreviousSnapshotId();
+  // Load the parent snapshot in the chain
+  child = getSnapshotLocalData(
+  getSnapshotInfo(ozoneManager,
+  metadataManager.getSnapshotChainManager(),
+  parentId));
+  numSnapshotsTraversed++;
+}
+if (numSnapshotsTraversed > maxSnapshotLimit) {
+  LOG.error("Exceeded the traversal limit of {} while finding the 
VersionMeta of the fromSnapshot : {}" +
+  " that the toSnapshot was built on.", maxSnapshotLimit, 
fromSnapshotId);
+  return null;
+}
+SnapshotInfo snapshotInfo =
+getSnapshotInfo(ozoneManager, 
metadataManager.getSnapshotChainManager(), fromSnapshotId);
+OmSnapshotLocalData fromSnapshot = getSnapshotLocalData(snapshotInfo);
+// Get the version that the child was built from
+OmSnapshotLocalData.VersionMeta childVersionMeta = 
child.getVersionSstFileInfos().get(child.getVersion());

Review Comment:
   This logic of finding childVersionMeta is wrong. We need to update the 
version inside the loop.
   



##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1180,6 +1199,37 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 // TODO: [SNAPSHOT] Refactor the parameter list
 Optional> deltaFiles = Optional.empty();
 
+OmSnapshotLocalData fromSnapshotLocalData = getSnapshotLocalData(fsInfo);
+int fromSnapshotVersion = fromSnapshotLocalData.getVersion();
+OmSnapshotLocalData toSnapshotLocalData = getSnapshotLocalData(tsInfo);
+int toSnapshotVersion = toSnapshotLocalData.getVersion();
+
+try {
+  if (fromSnapshotVersion > 0 && toSnapshotVersion > 0) {

Review Comment:
   We should not check both. Just checking toSnapshotVersion should be good 
enough as the defrag service would always run from the beginning of the chain



##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1180,6 +1199,37 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 // TODO: [SNAPSHOT] Refactor the parameter list
 Optional> deltaFiles = Optional.empty();
 
+OmSnapshotLocalData fromSnapshotLocalData = getSnapshotLocalData(fsInfo);
+int fromSnapshotVersion = fromSnapshotLocalData.getVersion();
+OmSnapshotLocalData toSnapshotLocalData = getSnapshotLocalData(tsInfo);
+int toSnapshotVersion = toSnapshotLocalData.getVersion();
+
+try {
+  if (fromSnapshotVersion > 0 && toSnapshotVersion > 0) {
+// both snapshots are defragmented, To calculate snap-diff, we can 
simply compare the
+// SST files contained in OmSnapshotLocalData instances of both these 
and get the delta files
+OmSnapshotLocalData.VersionMeta toSnapVersionMeta =
+
toSnapshotLocalData.getVersionSstFileInfos().get(toSnapshotVersion);

Review Comment:
   Let's create a function for this in OmSnapshotLocalData



##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 toSnapshot.getSnapshotTableKey()));
   }
 
+  /**
+   * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+   * that the given snapshot (toSnapshot) was built on.
+   * Traverses the snapshot chain backwards using prevSnapId.
+   *
+   * @param toSnapshot the target snapshot.
+   * @param fromSnapshotId UUID of the source snapshot.
+   * @return the resolved VersionMeta of the source that was used to build the 
target.
+   */
+  private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+  OmSnaps

Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-10-18 Thread via GitHub


sadanand48 commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2409693717


##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1265,70 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 toSnapshot.getSnapshotTableKey()));
   }
 
+  /**
+   * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+   * that the given snapshot (toSnapshot) was built on.
+   *
+   * Traverses the snapshot chain backwards using prevSnapId.
+   */
+  private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+  OmSnapshotLocalData toSnapshot,
+  UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+while (!child.getPreviousSnapshotId().equals(fromSnapshotId)) {

Review Comment:
   I can add a max  number snapshots to traverse count here. What is a good 
value to have for it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-10-18 Thread via GitHub


sadanand48 commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2413900470


##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1180,6 +1199,37 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 // TODO: [SNAPSHOT] Refactor the parameter list
 Optional> deltaFiles = Optional.empty();
 
+OmSnapshotLocalData fromSnapshotLocalData = getSnapshotLocalData(fsInfo);
+int fromSnapshotVersion = fromSnapshotLocalData.getVersion();
+OmSnapshotLocalData toSnapshotLocalData = getSnapshotLocalData(tsInfo);
+int toSnapshotVersion = toSnapshotLocalData.getVersion();
+
+try {
+  if (fromSnapshotVersion > 0 && toSnapshotVersion > 0) {
+// both snapshots are defragmented, To calculate snap-diff, we can 
simply compare the
+// SST files contained in OmSnapshotLocalData instances of both these 
and get the delta files
+OmSnapshotLocalData.VersionMeta toSnapVersionMeta =
+
toSnapshotLocalData.getVersionSstFileInfos().get(toSnapshotVersion);
+// get the source snapshot SST files (versionMeta)  corresponding to 
target snapshot
+OmSnapshotLocalData.VersionMeta fromSnapVersionMeta =
+resolveBaseVersionMeta(toSnapshotLocalData, 
fromSnapshot.getSnapshotID());
+// Calculate diff files using helper method
+if (toSnapVersionMeta == null) {
+  String errMsg =
+  "Cannot find corresponding version of from snapshot " + 
fromSnapshotVersion + " from " + tsInfo;
+  LOG.error(errMsg);
+  throw new IOException(errMsg);
+}
+List diffFiles = calculateDiffFiles(fromSnapVersionMeta, 
toSnapVersionMeta);
+return OmSnapshotUtils.getSSTDiffListWithFullPath(diffFiles,
+
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(), 
fsInfo).toString(),
+
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(), 
tsInfo).toString(), diffDir);
+  }
+} catch (Exception e) {
+  LOG.error("Failed to calculate snap-diff between fromSnapshot : {} , 
toSnapshot: {} via optimal method," +
+  "Falling back to other methods", fsInfo, tsInfo, e);
+}
+
 // Check if compaction DAG is available, use that if so
 if (differ != null && fsInfo != null && tsInfo != null && !useFullDiff) {

Review Comment:
   do we need to change DAG based diff code ? I think it already works on 
version=0 files. If first if statement fails , this is the exact flow we have 
today too. First DAG based diff is tried, if it fails it fallsback to full diff



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-10-18 Thread via GitHub


sadanand48 commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2409703845


##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1265,70 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 toSnapshot.getSnapshotTableKey()));
   }
 
+  /**
+   * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+   * that the given snapshot (toSnapshot) was built on.
+   *
+   * Traverses the snapshot chain backwards using prevSnapId.
+   */
+  private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+  OmSnapshotLocalData toSnapshot,
+  UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+while (!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+  UUID parentId = child.getPreviousSnapshotId();
+  // Load the parent snapshot in the chain
+  child = getSnapshotLocalData(
+  getSnapshotInfo(ozoneManager,
+  metadataManager.getSnapshotChainManager(),
+  parentId));
+}
+SnapshotInfo snapshotInfo =
+getSnapshotInfo(ozoneManager, 
metadataManager.getSnapshotChainManager(), fromSnapshotId);
+OmSnapshotLocalData fromSnapshot = getSnapshotLocalData(snapshotInfo);
+// Get the version that the child was built from
+OmSnapshotLocalData.VersionMeta childVersionMeta = 
child.getVersionSstFileInfos().get(child.getVersion());
+int versionUsedWhenBuildingChild = 
childVersionMeta.getPreviousSnapshotVersion();
+return 
fromSnapshot.getVersionSstFileInfos().get(versionUsedWhenBuildingChild);

Review Comment:
   generally shouldn't happen. If it does, The method would return null in that 
case and it would fall back to use other methods to find the diff 
   ```java
if (toSnapVersionMeta == null) {
 String errMsg =
 "Cannot find corresponding version of from snapshot " + 
fromSnapshotVersion + " from " + tsInfo;
 LOG.error(errMsg);
 throw new IOException(errMsg);
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-10-17 Thread via GitHub


sadanand48 commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2413904596


##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 toSnapshot.getSnapshotTableKey()));
   }
 
+  /**
+   * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+   * that the given snapshot (toSnapshot) was built on.
+   * Traverses the snapshot chain backwards using prevSnapId.
+   *
+   * @param toSnapshot the target snapshot.
+   * @param fromSnapshotId UUID of the source snapshot.
+   * @return the resolved VersionMeta of the source that was used to build the 
target.
+   */
+  private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+  OmSnapshotLocalData toSnapshot,
+  UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+int numSnapshotsTraversed = 0;
+while (numSnapshotsTraversed <= maxSnapshotLimit && 
!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+  UUID parentId = child.getPreviousSnapshotId();
+  // Load the parent snapshot in the chain
+  child = getSnapshotLocalData(
+  getSnapshotInfo(ozoneManager,
+  metadataManager.getSnapshotChainManager(),
+  parentId));
+  numSnapshotsTraversed++;
+}
+if (numSnapshotsTraversed > maxSnapshotLimit) {

Review Comment:
   https://github.com/apache/ozone/pull/9079#discussion_r2408854902 added this 
based on this comment. We don't know if there could be bugs in code. it could 
be a safety condition



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-10-17 Thread via GitHub


swamirishi commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2413462516


##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1180,6 +1199,37 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 // TODO: [SNAPSHOT] Refactor the parameter list
 Optional> deltaFiles = Optional.empty();
 
+OmSnapshotLocalData fromSnapshotLocalData = getSnapshotLocalData(fsInfo);
+int fromSnapshotVersion = fromSnapshotLocalData.getVersion();
+OmSnapshotLocalData toSnapshotLocalData = getSnapshotLocalData(tsInfo);
+int toSnapshotVersion = toSnapshotLocalData.getVersion();
+
+try {
+  if (fromSnapshotVersion > 0 && toSnapshotVersion > 0) {
+// both snapshots are defragmented, To calculate snap-diff, we can 
simply compare the
+// SST files contained in OmSnapshotLocalData instances of both these 
and get the delta files
+OmSnapshotLocalData.VersionMeta toSnapVersionMeta =
+
toSnapshotLocalData.getVersionSstFileInfos().get(toSnapshotVersion);
+// get the source snapshot SST files (versionMeta)  corresponding to 
target snapshot
+OmSnapshotLocalData.VersionMeta fromSnapVersionMeta =
+resolveBaseVersionMeta(toSnapshotLocalData, 
fromSnapshot.getSnapshotID());
+// Calculate diff files using helper method
+if (toSnapVersionMeta == null) {
+  String errMsg =
+  "Cannot find corresponding version of from snapshot " + 
fromSnapshotVersion + " from " + tsInfo;
+  LOG.error(errMsg);
+  throw new IOException(errMsg);
+}
+List diffFiles = calculateDiffFiles(fromSnapVersionMeta, 
toSnapVersionMeta);
+return OmSnapshotUtils.getSSTDiffListWithFullPath(diffFiles,
+
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(), 
fsInfo).toString(),
+
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(), 
tsInfo).toString(), diffDir);
+  }
+} catch (Exception e) {
+  LOG.error("Failed to calculate snap-diff between fromSnapshot : {} , 
toSnapshot: {} via optimal method," +
+  "Falling back to other methods", fsInfo, tsInfo, e);
+}
+
 // Check if compaction DAG is available, use that if so
 if (differ != null && fsInfo != null && tsInfo != null && !useFullDiff) {

Review Comment:
   We need rocksdb checkpoint differ to also use the SnapshotLocalMetadata 
instead of the rocksdb instance itself.
   There are 3 possible conditions:
   
   1. fromSnapshot is defragged and toSNapshot Defragged
   2. fromSnapshot is not defragged and toSnapshot not defragged
   3. fromSnapshot is Defragged and toSnapshot not Defragged
   4. fromSnapshot is not Defragged and toSnapshot Defragged(This is not 
possible.)
   This patch covers only the first condition so far. 2 and 3 condition should 
fall back to dag based diff by using both version 0's sst files to compute diff.
   Effectively logic the logic for snapshot diff should be like this :
   ```
   if (toSnapshot.getVersion() > 0) {
  perfromFullDiff(fromSnapshot(version=correspoding version from traversing 
linked list), toSnapshot(version=toSnapshotVersion));
   } else {
if (Dag available) {
 performDagDiff(fromSnapshot(version=0), toSnapshot(version=0));
   } else {
peformFullDiff(fromSnapshot(version=0), toSnapshot(version=0))
   }
   }
   }
   
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-10-17 Thread via GitHub


sadanand48 commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2409815939


##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1265,70 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 toSnapshot.getSnapshotTableKey()));
   }
 
+  /**
+   * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+   * that the given snapshot (toSnapshot) was built on.
+   *
+   * Traverses the snapshot chain backwards using prevSnapId.
+   */
+  private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+  OmSnapshotLocalData toSnapshot,
+  UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+while (!child.getPreviousSnapshotId().equals(fromSnapshotId)) {

Review Comment:
   used the existin default config here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]

2025-10-08 Thread via GitHub


sadanand48 commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2413893088


##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
 toSnapshot.getSnapshotTableKey()));
   }
 
+  /**
+   * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+   * that the given snapshot (toSnapshot) was built on.
+   * Traverses the snapshot chain backwards using prevSnapId.
+   *
+   * @param toSnapshot the target snapshot.
+   * @param fromSnapshotId UUID of the source snapshot.
+   * @return the resolved VersionMeta of the source that was used to build the 
target.
+   */
+  private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+  OmSnapshotLocalData toSnapshot,
+  UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+int numSnapshotsTraversed = 0;
+while (numSnapshotsTraversed <= maxSnapshotLimit && 
!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+  UUID parentId = child.getPreviousSnapshotId();
+  // Load the parent snapshot in the chain
+  child = getSnapshotLocalData(
+  getSnapshotInfo(ozoneManager,
+  metadataManager.getSnapshotChainManager(),
+  parentId));
+  numSnapshotsTraversed++;
+}
+if (numSnapshotsTraversed > maxSnapshotLimit) {
+  LOG.error("Exceeded the traversal limit of {} while finding the 
VersionMeta of the fromSnapshot : {}" +
+  " that the toSnapshot was built on.", maxSnapshotLimit, 
fromSnapshotId);
+  return null;
+}
+SnapshotInfo snapshotInfo =
+getSnapshotInfo(ozoneManager, 
metadataManager.getSnapshotChainManager(), fromSnapshotId);
+OmSnapshotLocalData fromSnapshot = getSnapshotLocalData(snapshotInfo);
+// Get the version that the child was built from
+OmSnapshotLocalData.VersionMeta childVersionMeta = 
child.getVersionSstFileInfos().get(child.getVersion());
+int versionUsedWhenBuildingChild = 
childVersionMeta.getPreviousSnapshotVersion();

Review Comment:
   Yeah this looks is simpler but think it's the same but just getting 
calculated in loop now and doesn't need extra handlimg outside the loop.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]