Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
sadanand48 commented on PR #9079: URL: https://github.com/apache/ozone/pull/9079#issuecomment-4204910760 @jojochuang This is not needed now, already handled in other jiras. Closing -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
sadanand48 closed pull request #9079: HDDS-13007. Snapshot diff computation changes after snapshots have been defragged URL: https://github.com/apache/ozone/pull/9079 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
Copilot commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r3049102758
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1180,6 +1199,37 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
// TODO: [SNAPSHOT] Refactor the parameter list
Optional> deltaFiles = Optional.empty();
+OmSnapshotLocalData fromSnapshotLocalData = getSnapshotLocalData(fsInfo);
+int fromSnapshotVersion = fromSnapshotLocalData.getVersion();
+OmSnapshotLocalData toSnapshotLocalData = getSnapshotLocalData(tsInfo);
+int toSnapshotVersion = toSnapshotLocalData.getVersion();
+
+try {
+ if (fromSnapshotVersion > 0 && toSnapshotVersion > 0) {
+// both snapshots are defragmented, To calculate snap-diff, we can
simply compare the
+// SST files contained in OmSnapshotLocalData instances of both these
and get the delta files
+OmSnapshotLocalData.VersionMeta toSnapVersionMeta =
+
toSnapshotLocalData.getVersionSstFileInfos().get(toSnapshotVersion);
+// get the source snapshot SST files (versionMeta) corresponding to
target snapshot
+OmSnapshotLocalData.VersionMeta fromSnapVersionMeta =
+resolveBaseVersionMeta(toSnapshotLocalData,
fromSnapshot.getSnapshotID());
+// Calculate diff files using helper method
+if (toSnapVersionMeta == null) {
+ String errMsg =
+ "Cannot find corresponding version of from snapshot " +
fromSnapshotVersion + " from " + tsInfo;
+ LOG.error(errMsg);
+ throw new IOException(errMsg);
+}
+List diffFiles = calculateDiffFiles(fromSnapVersionMeta,
toSnapVersionMeta);
+return OmSnapshotUtils.getSSTDiffListWithFullPath(diffFiles,
+
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(),
fsInfo).toString(),
+
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(),
tsInfo).toString(), diffDir);
Review Comment:
`getDeltaFiles` now unconditionally reads snapshot-local YAML via
`getSnapshotLocalData(fsInfo/tsInfo)` before entering the try/catch and before
checking `fsInfo/tsInfo` for null. If either SnapshotInfo is null (which this
method already guards for later) or if YAML loading throws, the exception will
propagate and prevent the intended fallback to DAG/full-diff paths. Consider
moving these calls inside the try block and guarding them with `fsInfo != null
&& tsInfo != null` so failures fall back cleanly.
```suggestion
try {
if (fsInfo != null && tsInfo != null) {
OmSnapshotLocalData fromSnapshotLocalData =
getSnapshotLocalData(fsInfo);
int fromSnapshotVersion = fromSnapshotLocalData.getVersion();
OmSnapshotLocalData toSnapshotLocalData =
getSnapshotLocalData(tsInfo);
int toSnapshotVersion = toSnapshotLocalData.getVersion();
if (fromSnapshotVersion > 0 && toSnapshotVersion > 0) {
// both snapshots are defragmented, To calculate snap-diff, we can
simply compare the
// SST files contained in OmSnapshotLocalData instances of both
these and get the delta files
OmSnapshotLocalData.VersionMeta toSnapVersionMeta =
toSnapshotLocalData.getVersionSstFileInfos().get(toSnapshotVersion);
// get the source snapshot SST files (versionMeta) corresponding
to target snapshot
OmSnapshotLocalData.VersionMeta fromSnapVersionMeta =
resolveBaseVersionMeta(toSnapshotLocalData,
fromSnapshot.getSnapshotID());
// Calculate diff files using helper method
if (toSnapVersionMeta == null) {
String errMsg =
"Cannot find corresponding version of from snapshot " +
fromSnapshotVersion + " from " + tsInfo;
LOG.error(errMsg);
throw new IOException(errMsg);
}
List diffFiles = calculateDiffFiles(fromSnapVersionMeta,
toSnapVersionMeta);
return OmSnapshotUtils.getSSTDiffListWithFullPath(diffFiles,
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(),
fsInfo).toString(),
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(),
tsInfo).toString(), diffDir);
}
```
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
toSnapshot.getSnapshotTableKey()));
}
+ /**
+ * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+ * that the given snapshot (toSnapshot) was built on.
+ * Traverses the snapshot chain backwards using prevSnapId.
+ *
+ * @param toSnapshot the target snapshot.
+ * @param fromSnapshotId UUID of the source snapshot.
+ * @return the r
[PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
sadanand48 opened a new pull request, #9079: URL: https://github.com/apache/ozone/pull/9079 ## What changes were proposed in this pull request? If both snapshots are defragmented, the full diff will be used. If the snapshot versions don't match then traverse from the target snapshot to reach the source snapshot version from which it was built upon and determine the delta files. If any error occurs during this phase such as inability to find the sst files in the backup directory , it will fallback to existing methods. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-13007 ## How was this patch tested? Unit test -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
SaketaChalamchala commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2558399573
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
toSnapshot.getSnapshotTableKey()));
}
+ /**
+ * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+ * that the given snapshot (toSnapshot) was built on.
+ * Traverses the snapshot chain backwards using prevSnapId.
+ *
+ * @param toSnapshot the target snapshot.
+ * @param fromSnapshotId UUID of the source snapshot.
+ * @return the resolved VersionMeta of the source that was used to build the
target.
+ */
+ private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+ OmSnapshotLocalData toSnapshot,
+ UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+int numSnapshotsTraversed = 0;
+while (numSnapshotsTraversed <= maxSnapshotLimit &&
!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+ UUID parentId = child.getPreviousSnapshotId();
+ // Load the parent snapshot in the chain
+ child = getSnapshotLocalData(
+ getSnapshotInfo(ozoneManager,
+ metadataManager.getSnapshotChainManager(),
+ parentId));
+ numSnapshotsTraversed++;
+}
+if (numSnapshotsTraversed > maxSnapshotLimit) {
+ LOG.error("Exceeded the traversal limit of {} while finding the
VersionMeta of the fromSnapshot : {}" +
+ " that the toSnapshot was built on.", maxSnapshotLimit,
fromSnapshotId);
+ return null;
+}
+SnapshotInfo snapshotInfo =
+getSnapshotInfo(ozoneManager,
metadataManager.getSnapshotChainManager(), fromSnapshotId);
+OmSnapshotLocalData fromSnapshot = getSnapshotLocalData(snapshotInfo);
+// Get the version that the child was built from
+OmSnapshotLocalData.VersionMeta childVersionMeta =
child.getVersionSstFileInfos().get(child.getVersion());
+int versionUsedWhenBuildingChild =
childVersionMeta.getPreviousSnapshotVersion();
+return
fromSnapshot.getVersionSstFileInfos().get(versionUsedWhenBuildingChild);
+ }
+
+ /**
+ * Calculates the symmetric difference of SST files between two version
metadata.
+ * Returns a list of SST file names that are present in one snapshot but not
in the other,
+ * i.e., the symmetric difference between the two sets of SST files.
+ *
+ * @param fromSnapVersionMeta Version metadata from the first snapshot;
provides the set of SST files for comparison.
+ * @param toSnapVersionMeta Version metadata from the second snapshot;
provides the set of SST files for comparison.
+ * @return List of SST file names that are present in only one of the two
snapshots (symmetric difference).
+ */
+ private List calculateDiffFiles(OmSnapshotLocalData.VersionMeta
fromSnapVersionMeta,
+ OmSnapshotLocalData.VersionMeta
toSnapVersionMeta) {
+// Extract SST file names from both snapshots
+Set fromSnapSstFileNames =
fromSnapVersionMeta.getSstFiles().stream()
+.map(SstFileInfo::getFileName)
+.collect(Collectors.toSet());
+
+Set toSnapSstFileNames = toSnapVersionMeta.getSstFiles().stream()
+.map(SstFileInfo::getFileName)
+.collect(Collectors.toSet());
+
+List diffFiles = new ArrayList<>();
+
+// Files in fromSnapshot but NOT in toSnapshot
+diffFiles.addAll(fromSnapSstFileNames.stream()
+.filter(file -> !toSnapSstFileNames.contains(file))
+.collect(Collectors.toList()));
Review Comment:
Why would we need this filter if `toSnap` is always built from `fromSnap`.
Wouldn't `toSnap = fromSnap + diff`. In that case, `fromSnap` should never
contain SST files that are not in `toSNap` making this filter unnecessary?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
SaketaChalamchala commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2558384961
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
toSnapshot.getSnapshotTableKey()));
}
+ /**
+ * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+ * that the given snapshot (toSnapshot) was built on.
+ * Traverses the snapshot chain backwards using prevSnapId.
+ *
+ * @param toSnapshot the target snapshot.
+ * @param fromSnapshotId UUID of the source snapshot.
+ * @return the resolved VersionMeta of the source that was used to build the
target.
+ */
+ private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+ OmSnapshotLocalData toSnapshot,
+ UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+int numSnapshotsTraversed = 0;
+while (numSnapshotsTraversed <= maxSnapshotLimit &&
!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+ UUID parentId = child.getPreviousSnapshotId();
+ // Load the parent snapshot in the chain
+ child = getSnapshotLocalData(
+ getSnapshotInfo(ozoneManager,
+ metadataManager.getSnapshotChainManager(),
+ parentId));
+ numSnapshotsTraversed++;
+}
+if (numSnapshotsTraversed > maxSnapshotLimit) {
+ LOG.error("Exceeded the traversal limit of {} while finding the
VersionMeta of the fromSnapshot : {}" +
+ " that the toSnapshot was built on.", maxSnapshotLimit,
fromSnapshotId);
+ return null;
+}
+SnapshotInfo snapshotInfo =
+getSnapshotInfo(ozoneManager,
metadataManager.getSnapshotChainManager(), fromSnapshotId);
+OmSnapshotLocalData fromSnapshot = getSnapshotLocalData(snapshotInfo);
+// Get the version that the child was built from
+OmSnapshotLocalData.VersionMeta childVersionMeta =
child.getVersionSstFileInfos().get(child.getVersion());
+int versionUsedWhenBuildingChild =
childVersionMeta.getPreviousSnapshotVersion();
Review Comment:
In this chain `fromSnapshot -> fs-1 -> -> toSnapshot` doesn't this
logic still return the VersionMeta of `fs-1`? Is there a test case covering
this?
Would it be possible to make the naming conventions in this loop more clear
for readability? And would a `do..while` loop make more sense here? for ex.,
```
parent = child = toSnapshotData;
parentVer = childVer = toSnapshotData.getVersion();
numSnapshotsTraversed = 0
do {
if (numSnapshotsTraversed >= maxSnapshotLimit) { // log error and return
null}
child = parent;
childVer = parentVer;
parent = getSnapshotLocalData(...child.getPreviousSnapID());
parentVer = child.getVersionMeta(childVer).getPreviousSnapVersion();
numSnapshotsTraversed++;
} while (parent.getSnapshotId() != fromSnapshotId);
return parent.getVersionMeta(parentVer);
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
SaketaChalamchala commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2558384961
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
toSnapshot.getSnapshotTableKey()));
}
+ /**
+ * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+ * that the given snapshot (toSnapshot) was built on.
+ * Traverses the snapshot chain backwards using prevSnapId.
+ *
+ * @param toSnapshot the target snapshot.
+ * @param fromSnapshotId UUID of the source snapshot.
+ * @return the resolved VersionMeta of the source that was used to build the
target.
+ */
+ private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+ OmSnapshotLocalData toSnapshot,
+ UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+int numSnapshotsTraversed = 0;
+while (numSnapshotsTraversed <= maxSnapshotLimit &&
!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+ UUID parentId = child.getPreviousSnapshotId();
+ // Load the parent snapshot in the chain
+ child = getSnapshotLocalData(
+ getSnapshotInfo(ozoneManager,
+ metadataManager.getSnapshotChainManager(),
+ parentId));
+ numSnapshotsTraversed++;
+}
+if (numSnapshotsTraversed > maxSnapshotLimit) {
+ LOG.error("Exceeded the traversal limit of {} while finding the
VersionMeta of the fromSnapshot : {}" +
+ " that the toSnapshot was built on.", maxSnapshotLimit,
fromSnapshotId);
+ return null;
+}
+SnapshotInfo snapshotInfo =
+getSnapshotInfo(ozoneManager,
metadataManager.getSnapshotChainManager(), fromSnapshotId);
+OmSnapshotLocalData fromSnapshot = getSnapshotLocalData(snapshotInfo);
+// Get the version that the child was built from
+OmSnapshotLocalData.VersionMeta childVersionMeta =
child.getVersionSstFileInfos().get(child.getVersion());
+int versionUsedWhenBuildingChild =
childVersionMeta.getPreviousSnapshotVersion();
Review Comment:
In this chain `fromSnapshot -> fs-1 -> -> toSnapshot` doesn't this
logic still return the VersionMeta of `fs-1`? Is there a test case covering
this?
Would it be possible to make the naming conventions in this loop more clear
for readability? And would a `do..while` loop make more sense here? for ex.,
```
parent = child = toSnapshotData
parentVer = childVer = toSnapshotData.getVersion()
numSnapshotsTraversed = 0
do {
if (numSnapshotsTraversed >= maxSnapshotLimit) { // log error and return
null}
child = parent;
childVer = parentVer;
parent = getSnapshotLocalData(...child.getPreviousSnapID());
parentVer = child.getVersionMeta(childVer).getPreviousSnapVersion();
numSnapshotsTraversed++;
} while (parent.getSnapshotId() != fromSnapshotId);
return parent.getVersionMeta(parentVer);
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
github-actions[bot] commented on PR #9079: URL: https://github.com/apache/ozone/pull/9079#issuecomment-3573218106 Thank you for your contribution. This PR is being closed due to inactivity. If needed, feel free to reopen it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
github-actions[bot] closed pull request #9079: HDDS-13007. Snapshot diff computation changes after snapshots have been defragged URL: https://github.com/apache/ozone/pull/9079 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
github-actions[bot] commented on PR #9079: URL: https://github.com/apache/ozone/pull/9079#issuecomment-3514416809 This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
swamirishi commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2413442520
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
toSnapshot.getSnapshotTableKey()));
}
+ /**
+ * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+ * that the given snapshot (toSnapshot) was built on.
+ * Traverses the snapshot chain backwards using prevSnapId.
+ *
+ * @param toSnapshot the target snapshot.
+ * @param fromSnapshotId UUID of the source snapshot.
+ * @return the resolved VersionMeta of the source that was used to build the
target.
+ */
+ private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+ OmSnapshotLocalData toSnapshot,
+ UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+int numSnapshotsTraversed = 0;
+while (numSnapshotsTraversed <= maxSnapshotLimit &&
!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+ UUID parentId = child.getPreviousSnapshotId();
+ // Load the parent snapshot in the chain
+ child = getSnapshotLocalData(
+ getSnapshotInfo(ozoneManager,
+ metadataManager.getSnapshotChainManager(),
+ parentId));
+ numSnapshotsTraversed++;
+}
+if (numSnapshotsTraversed > maxSnapshotLimit) {
+ LOG.error("Exceeded the traversal limit of {} while finding the
VersionMeta of the fromSnapshot : {}" +
+ " that the toSnapshot was built on.", maxSnapshotLimit,
fromSnapshotId);
+ return null;
+}
+SnapshotInfo snapshotInfo =
+getSnapshotInfo(ozoneManager,
metadataManager.getSnapshotChainManager(), fromSnapshotId);
+OmSnapshotLocalData fromSnapshot = getSnapshotLocalData(snapshotInfo);
+// Get the version that the child was built from
+OmSnapshotLocalData.VersionMeta childVersionMeta =
child.getVersionSstFileInfos().get(child.getVersion());
Review Comment:
This logic of finding childVersionMeta is wrong. We need to update the
version inside the loop.
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1180,6 +1199,37 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
// TODO: [SNAPSHOT] Refactor the parameter list
Optional> deltaFiles = Optional.empty();
+OmSnapshotLocalData fromSnapshotLocalData = getSnapshotLocalData(fsInfo);
+int fromSnapshotVersion = fromSnapshotLocalData.getVersion();
+OmSnapshotLocalData toSnapshotLocalData = getSnapshotLocalData(tsInfo);
+int toSnapshotVersion = toSnapshotLocalData.getVersion();
+
+try {
+ if (fromSnapshotVersion > 0 && toSnapshotVersion > 0) {
Review Comment:
We should not check both. Just checking toSnapshotVersion should be good
enough as the defrag service would always run from the beginning of the chain
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1180,6 +1199,37 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
// TODO: [SNAPSHOT] Refactor the parameter list
Optional> deltaFiles = Optional.empty();
+OmSnapshotLocalData fromSnapshotLocalData = getSnapshotLocalData(fsInfo);
+int fromSnapshotVersion = fromSnapshotLocalData.getVersion();
+OmSnapshotLocalData toSnapshotLocalData = getSnapshotLocalData(tsInfo);
+int toSnapshotVersion = toSnapshotLocalData.getVersion();
+
+try {
+ if (fromSnapshotVersion > 0 && toSnapshotVersion > 0) {
+// both snapshots are defragmented, To calculate snap-diff, we can
simply compare the
+// SST files contained in OmSnapshotLocalData instances of both these
and get the delta files
+OmSnapshotLocalData.VersionMeta toSnapVersionMeta =
+
toSnapshotLocalData.getVersionSstFileInfos().get(toSnapshotVersion);
Review Comment:
Let's create a function for this in OmSnapshotLocalData
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
toSnapshot.getSnapshotTableKey()));
}
+ /**
+ * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+ * that the given snapshot (toSnapshot) was built on.
+ * Traverses the snapshot chain backwards using prevSnapId.
+ *
+ * @param toSnapshot the target snapshot.
+ * @param fromSnapshotId UUID of the source snapshot.
+ * @return the resolved VersionMeta of the source that was used to build the
target.
+ */
+ private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+ OmSnaps
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
sadanand48 commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2409693717
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1265,70 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
toSnapshot.getSnapshotTableKey()));
}
+ /**
+ * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+ * that the given snapshot (toSnapshot) was built on.
+ *
+ * Traverses the snapshot chain backwards using prevSnapId.
+ */
+ private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+ OmSnapshotLocalData toSnapshot,
+ UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+while (!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
Review Comment:
I can add a max number snapshots to traverse count here. What is a good
value to have for it?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
sadanand48 commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2413900470
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1180,6 +1199,37 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
// TODO: [SNAPSHOT] Refactor the parameter list
Optional> deltaFiles = Optional.empty();
+OmSnapshotLocalData fromSnapshotLocalData = getSnapshotLocalData(fsInfo);
+int fromSnapshotVersion = fromSnapshotLocalData.getVersion();
+OmSnapshotLocalData toSnapshotLocalData = getSnapshotLocalData(tsInfo);
+int toSnapshotVersion = toSnapshotLocalData.getVersion();
+
+try {
+ if (fromSnapshotVersion > 0 && toSnapshotVersion > 0) {
+// both snapshots are defragmented, To calculate snap-diff, we can
simply compare the
+// SST files contained in OmSnapshotLocalData instances of both these
and get the delta files
+OmSnapshotLocalData.VersionMeta toSnapVersionMeta =
+
toSnapshotLocalData.getVersionSstFileInfos().get(toSnapshotVersion);
+// get the source snapshot SST files (versionMeta) corresponding to
target snapshot
+OmSnapshotLocalData.VersionMeta fromSnapVersionMeta =
+resolveBaseVersionMeta(toSnapshotLocalData,
fromSnapshot.getSnapshotID());
+// Calculate diff files using helper method
+if (toSnapVersionMeta == null) {
+ String errMsg =
+ "Cannot find corresponding version of from snapshot " +
fromSnapshotVersion + " from " + tsInfo;
+ LOG.error(errMsg);
+ throw new IOException(errMsg);
+}
+List diffFiles = calculateDiffFiles(fromSnapVersionMeta,
toSnapVersionMeta);
+return OmSnapshotUtils.getSSTDiffListWithFullPath(diffFiles,
+
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(),
fsInfo).toString(),
+
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(),
tsInfo).toString(), diffDir);
+ }
+} catch (Exception e) {
+ LOG.error("Failed to calculate snap-diff between fromSnapshot : {} ,
toSnapshot: {} via optimal method," +
+ "Falling back to other methods", fsInfo, tsInfo, e);
+}
+
// Check if compaction DAG is available, use that if so
if (differ != null && fsInfo != null && tsInfo != null && !useFullDiff) {
Review Comment:
do we need to change DAG based diff code ? I think it already works on
version=0 files. If first if statement fails , this is the exact flow we have
today too. First DAG based diff is tried, if it fails it fallsback to full diff
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
sadanand48 commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2409703845
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1265,70 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
toSnapshot.getSnapshotTableKey()));
}
+ /**
+ * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+ * that the given snapshot (toSnapshot) was built on.
+ *
+ * Traverses the snapshot chain backwards using prevSnapId.
+ */
+ private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+ OmSnapshotLocalData toSnapshot,
+ UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+while (!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+ UUID parentId = child.getPreviousSnapshotId();
+ // Load the parent snapshot in the chain
+ child = getSnapshotLocalData(
+ getSnapshotInfo(ozoneManager,
+ metadataManager.getSnapshotChainManager(),
+ parentId));
+}
+SnapshotInfo snapshotInfo =
+getSnapshotInfo(ozoneManager,
metadataManager.getSnapshotChainManager(), fromSnapshotId);
+OmSnapshotLocalData fromSnapshot = getSnapshotLocalData(snapshotInfo);
+// Get the version that the child was built from
+OmSnapshotLocalData.VersionMeta childVersionMeta =
child.getVersionSstFileInfos().get(child.getVersion());
+int versionUsedWhenBuildingChild =
childVersionMeta.getPreviousSnapshotVersion();
+return
fromSnapshot.getVersionSstFileInfos().get(versionUsedWhenBuildingChild);
Review Comment:
generally shouldn't happen. If it does, The method would return null in that
case and it would fall back to use other methods to find the diff
```java
if (toSnapVersionMeta == null) {
String errMsg =
"Cannot find corresponding version of from snapshot " +
fromSnapshotVersion + " from " + tsInfo;
LOG.error(errMsg);
throw new IOException(errMsg);
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
sadanand48 commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2413904596
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
toSnapshot.getSnapshotTableKey()));
}
+ /**
+ * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+ * that the given snapshot (toSnapshot) was built on.
+ * Traverses the snapshot chain backwards using prevSnapId.
+ *
+ * @param toSnapshot the target snapshot.
+ * @param fromSnapshotId UUID of the source snapshot.
+ * @return the resolved VersionMeta of the source that was used to build the
target.
+ */
+ private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+ OmSnapshotLocalData toSnapshot,
+ UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+int numSnapshotsTraversed = 0;
+while (numSnapshotsTraversed <= maxSnapshotLimit &&
!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+ UUID parentId = child.getPreviousSnapshotId();
+ // Load the parent snapshot in the chain
+ child = getSnapshotLocalData(
+ getSnapshotInfo(ozoneManager,
+ metadataManager.getSnapshotChainManager(),
+ parentId));
+ numSnapshotsTraversed++;
+}
+if (numSnapshotsTraversed > maxSnapshotLimit) {
Review Comment:
https://github.com/apache/ozone/pull/9079#discussion_r2408854902 added this
based on this comment. We don't know if there could be bugs in code. it could
be a safety condition
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
swamirishi commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2413462516
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1180,6 +1199,37 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
// TODO: [SNAPSHOT] Refactor the parameter list
Optional> deltaFiles = Optional.empty();
+OmSnapshotLocalData fromSnapshotLocalData = getSnapshotLocalData(fsInfo);
+int fromSnapshotVersion = fromSnapshotLocalData.getVersion();
+OmSnapshotLocalData toSnapshotLocalData = getSnapshotLocalData(tsInfo);
+int toSnapshotVersion = toSnapshotLocalData.getVersion();
+
+try {
+ if (fromSnapshotVersion > 0 && toSnapshotVersion > 0) {
+// both snapshots are defragmented, To calculate snap-diff, we can
simply compare the
+// SST files contained in OmSnapshotLocalData instances of both these
and get the delta files
+OmSnapshotLocalData.VersionMeta toSnapVersionMeta =
+
toSnapshotLocalData.getVersionSstFileInfos().get(toSnapshotVersion);
+// get the source snapshot SST files (versionMeta) corresponding to
target snapshot
+OmSnapshotLocalData.VersionMeta fromSnapVersionMeta =
+resolveBaseVersionMeta(toSnapshotLocalData,
fromSnapshot.getSnapshotID());
+// Calculate diff files using helper method
+if (toSnapVersionMeta == null) {
+ String errMsg =
+ "Cannot find corresponding version of from snapshot " +
fromSnapshotVersion + " from " + tsInfo;
+ LOG.error(errMsg);
+ throw new IOException(errMsg);
+}
+List diffFiles = calculateDiffFiles(fromSnapVersionMeta,
toSnapVersionMeta);
+return OmSnapshotUtils.getSSTDiffListWithFullPath(diffFiles,
+
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(),
fsInfo).toString(),
+
OmSnapshotManager.getSnapshotPath(ozoneManager.getMetadataManager(),
tsInfo).toString(), diffDir);
+ }
+} catch (Exception e) {
+ LOG.error("Failed to calculate snap-diff between fromSnapshot : {} ,
toSnapshot: {} via optimal method," +
+ "Falling back to other methods", fsInfo, tsInfo, e);
+}
+
// Check if compaction DAG is available, use that if so
if (differ != null && fsInfo != null && tsInfo != null && !useFullDiff) {
Review Comment:
We need rocksdb checkpoint differ to also use the SnapshotLocalMetadata
instead of the rocksdb instance itself.
There are 3 possible conditions:
1. fromSnapshot is defragged and toSNapshot Defragged
2. fromSnapshot is not defragged and toSnapshot not defragged
3. fromSnapshot is Defragged and toSnapshot not Defragged
4. fromSnapshot is not Defragged and toSnapshot Defragged(This is not
possible.)
This patch covers only the first condition so far. 2 and 3 condition should
fall back to dag based diff by using both version 0's sst files to compute diff.
Effectively logic the logic for snapshot diff should be like this :
```
if (toSnapshot.getVersion() > 0) {
perfromFullDiff(fromSnapshot(version=correspoding version from traversing
linked list), toSnapshot(version=toSnapshotVersion));
} else {
if (Dag available) {
performDagDiff(fromSnapshot(version=0), toSnapshot(version=0));
} else {
peformFullDiff(fromSnapshot(version=0), toSnapshot(version=0))
}
}
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
sadanand48 commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2409815939
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1265,70 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
toSnapshot.getSnapshotTableKey()));
}
+ /**
+ * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+ * that the given snapshot (toSnapshot) was built on.
+ *
+ * Traverses the snapshot chain backwards using prevSnapId.
+ */
+ private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+ OmSnapshotLocalData toSnapshot,
+ UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+while (!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
Review Comment:
used the existin default config here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] HDDS-13007. Snapshot diff computation changes after snapshots have been defragged [ozone]
sadanand48 commented on code in PR #9079:
URL: https://github.com/apache/ozone/pull/9079#discussion_r2413893088
##
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##
@@ -1223,6 +1273,81 @@ Set getDeltaFiles(OmSnapshot fromSnapshot,
toSnapshot.getSnapshotTableKey()));
}
+ /**
+ * Resolve the VersionMeta of the ancestor snapshot (fromSnapshotId)
+ * that the given snapshot (toSnapshot) was built on.
+ * Traverses the snapshot chain backwards using prevSnapId.
+ *
+ * @param toSnapshot the target snapshot.
+ * @param fromSnapshotId UUID of the source snapshot.
+ * @return the resolved VersionMeta of the source that was used to build the
target.
+ */
+ private OmSnapshotLocalData.VersionMeta resolveBaseVersionMeta(
+ OmSnapshotLocalData toSnapshot,
+ UUID fromSnapshotId) throws IOException {
+OmMetadataManagerImpl metadataManager =
+(OmMetadataManagerImpl) ozoneManager.getMetadataManager();
+// Start walking back from the child snapshot
+OmSnapshotLocalData child = toSnapshot;
+int numSnapshotsTraversed = 0;
+while (numSnapshotsTraversed <= maxSnapshotLimit &&
!child.getPreviousSnapshotId().equals(fromSnapshotId)) {
+ UUID parentId = child.getPreviousSnapshotId();
+ // Load the parent snapshot in the chain
+ child = getSnapshotLocalData(
+ getSnapshotInfo(ozoneManager,
+ metadataManager.getSnapshotChainManager(),
+ parentId));
+ numSnapshotsTraversed++;
+}
+if (numSnapshotsTraversed > maxSnapshotLimit) {
+ LOG.error("Exceeded the traversal limit of {} while finding the
VersionMeta of the fromSnapshot : {}" +
+ " that the toSnapshot was built on.", maxSnapshotLimit,
fromSnapshotId);
+ return null;
+}
+SnapshotInfo snapshotInfo =
+getSnapshotInfo(ozoneManager,
metadataManager.getSnapshotChainManager(), fromSnapshotId);
+OmSnapshotLocalData fromSnapshot = getSnapshotLocalData(snapshotInfo);
+// Get the version that the child was built from
+OmSnapshotLocalData.VersionMeta childVersionMeta =
child.getVersionSstFileInfos().get(child.getVersion());
+int versionUsedWhenBuildingChild =
childVersionMeta.getPreviousSnapshotVersion();
Review Comment:
Yeah this looks is simpler but think it's the same but just getting
calculated in loop now and doesn't need extra handlimg outside the loop.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
