Copilot commented on code in PR #9985:
URL: https://github.com/apache/ozone/pull/9985#discussion_r3060825082
##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##########
@@ -549,6 +552,132 @@ public SnapshotDiffResponse getSnapshotDiffReport(
}
}
+ public SnapshotDiffResponse getSnapshotDiffReport(
+ final String volumeName,
+ final String bucketName,
+ final String fromSnapshotName,
+ final String toSnapshotName,
+ final String index,
+ final int pageSize
+ ) throws IOException {
+
+ SnapshotInfo fsInfo = getSnapshotInfo(ozoneManager,
+ volumeName, bucketName, fromSnapshotName);
+ SnapshotInfo tsInfo = getSnapshotInfo(ozoneManager,
+ volumeName, bucketName, toSnapshotName);
+
+ String snapDiffJobKey = generateSnapDiffJobKey.apply(fsInfo, tsInfo);
+ SnapshotDiffJob snapDiffJob = snapDiffJobTable.get(snapDiffJobKey);
+ OFSPath snapshotRoot = getSnapshotRootPath(volumeName, bucketName);
+
+ if (snapDiffJob == null) {
+ return new SnapshotDiffResponse(
+ new SnapshotDiffReportOzone(snapshotRoot.toString(), volumeName,
bucketName,
+ fromSnapshotName, toSnapshotName,
+ new ArrayList<>(), null), NOT_FOUND, defaultWaitTime);
+ }
+
+ switch (snapDiffJob.getStatus()) {
+ case QUEUED:
+ case REJECTED:
+ return new SnapshotDiffResponse(
+ new SnapshotDiffReportOzone(snapshotRoot.toString(), volumeName,
+ bucketName, fromSnapshotName, toSnapshotName, new ArrayList<>(),
+ null),
+ snapDiffJob.getStatus(), defaultWaitTime, true);
+ case IN_PROGRESS:
+ SnapshotDiffResponse response = new SnapshotDiffResponse(
+ new SnapshotDiffReportOzone(snapshotRoot.toString(), volumeName,
bucketName,
+ fromSnapshotName, toSnapshotName,
+ new ArrayList<>(), null), IN_PROGRESS, defaultWaitTime, true);
+ response.setSubStatus(snapDiffJob.getSubStatus());
+ response.setProgressPercent(snapDiffJob.getKeysProcessedPct());
+ return response;
+ case FAILED:
+ return new SnapshotDiffResponse(
+ new SnapshotDiffReportOzone(snapshotRoot.toString(), volumeName,
+ bucketName, fromSnapshotName, toSnapshotName, new ArrayList<>(),
+ null), FAILED,
+ // waitTime is equal to clean up internal. After that job will be
+ // removed and client can retry.
+ ozoneManager.getOmSnapshotManager().getDiffCleanupServiceInterval(),
+ snapDiffJob.getReason(), true);
+ case DONE:
+ SnapshotDiffReportOzone report = createPageResponse(snapDiffJob,
+ volumeName, bucketName, fromSnapshotName, toSnapshotName, index,
+ pageSize);
+ return new SnapshotDiffResponse(report, DONE, 0L, true);
+ case CANCELLED:
+ return new SnapshotDiffResponse(
+ new SnapshotDiffReportOzone(snapshotRoot.toString(), volumeName,
+ bucketName, fromSnapshotName, toSnapshotName, new ArrayList<>(),
+ null),
+ CANCELLED, 0L, true);
+ default:
+ throw new IllegalStateException("Unknown snapshot job status: " +
+ snapDiffJob.getStatus());
+ }
+ }
+
+ public synchronized SubmitSnapshotDiffResponse submitSnapshotDiff(
+ final String volumeName,
+ final String bucketName,
+ final String fromSnapshotName,
+ final String toSnapshotName,
+ final boolean forceFullDiff,
+ final boolean disableNativeDiff
+ ) throws IOException {
+ SnapshotInfo fsInfo = getSnapshotInfo(ozoneManager,
+ volumeName, bucketName, fromSnapshotName);
+ SnapshotInfo tsInfo = getSnapshotInfo(ozoneManager,
+ volumeName, bucketName, toSnapshotName);
+
+ String snapDiffJobKey = generateSnapDiffJobKey.apply(fsInfo, tsInfo);
+
+ SnapshotDiffJob snapDiffJob = getSnapDiffReportStatus(snapDiffJobKey,
+ volumeName, bucketName, fromSnapshotName, toSnapshotName,
+ forceFullDiff, disableNativeDiff);
+
+ JobStatus prevStatus = snapDiffJob.getStatus();
+ String prevReason = snapDiffJob.getReason();
+
+ switch (prevStatus) {
+ case QUEUED:
+ case FAILED:
+ case REJECTED:
+ case CANCELLED:
+ LOG.info("Submitting snapshot diff generation request for" +
+ " volume: {}, bucket: {}, fromSnapshot: {} and toSnapshot: {}",
+ volumeName, bucketName, fromSnapshotName, toSnapshotName);
+ try {
+ updateJobStatus(snapDiffJobKey, prevStatus, IN_PROGRESS);
+ snapDiffExecutor.execute(() ->
generateSnapshotDiffReport(snapDiffJobKey, snapDiffJob.getJobId(),
+ volumeName, bucketName, fromSnapshotName, toSnapshotName,
+ forceFullDiff, disableNativeDiff));
+ if (prevStatus == QUEUED) {
+ return new SubmitSnapshotDiffResponse(defaultWaitTime, null, null);
+ } else {
+ return new SubmitSnapshotDiffResponse(defaultWaitTime, prevStatus,
prevReason);
+ }
Review Comment:
When resubmitting after a FAILED/REJECTED/CANCELLED job, this reuses the
existing SnapshotDiffJob (including its jobId) and starts a new generation run
with the same jobId. Since diff report keys and temp paths are jobId-based,
stale report entries/temp state from the previous attempt can be mixed into the
new run (eg, leftover keys beyond the new total), producing incorrect reports.
Consider creating a fresh SnapshotDiffJob with a new jobId (and resetting
totals/subStatus/progress/reason) on these resubmits, or explicitly deleting
all prior report entries for the previous jobId before starting the new run.
```suggestion
LOG.info("Submitting snapshot diff generation request for" +
" volume: {}, bucket: {}, fromSnapshot: {} and toSnapshot: {}",
volumeName, bucketName, fromSnapshotName, toSnapshotName);
try {
updateJobStatus(snapDiffJobKey, prevStatus, IN_PROGRESS);
snapDiffExecutor.execute(() -> generateSnapshotDiffReport(
snapDiffJobKey, snapDiffJob.getJobId(),
volumeName, bucketName, fromSnapshotName, toSnapshotName,
forceFullDiff, disableNativeDiff));
return new SubmitSnapshotDiffResponse(defaultWaitTime, null, null);
} catch (RejectedExecutionException exception) {
updateJobStatus(snapDiffJobKey, IN_PROGRESS, REJECTED);
LOG.warn("Exceeded the snapDiff parallel requests processing " +
"limit. Rejecting the snapDiff job: {}.", snapDiffJobKey);
return new SubmitSnapshotDiffResponse("Snapshot diff job was
rejected because parallel processing of " +
"snap diff jobs exceeded the limit. Please try again in " +
defaultWaitTime + " ms.");
} catch (Exception exception) {
updateJobStatusToFailed(snapDiffJobKey, exception.getMessage());
LOG.error("Snapshot diff job: {} failed.", snapDiffJobKey,
exception);
return new SubmitSnapshotDiffResponse("Snapshot diff job submission
failed. Reason: " + exception.getMessage());
}
case FAILED:
case REJECTED:
case CANCELLED:
LOG.info("Resubmitting snapshot diff generation request for" +
" volume: {}, bucket: {}, fromSnapshot: {} and toSnapshot: {}",
volumeName, bucketName, fromSnapshotName, toSnapshotName);
try {
updateJobStatus(snapDiffJobKey, prevStatus, IN_PROGRESS);
final String resubmittedJobId = UUID.randomUUID().toString();
snapDiffExecutor.execute(() -> generateSnapshotDiffReport(
snapDiffJobKey, resubmittedJobId,
volumeName, bucketName, fromSnapshotName, toSnapshotName,
forceFullDiff, disableNativeDiff));
return new SubmitSnapshotDiffResponse(defaultWaitTime, prevStatus,
prevReason);
```
##########
hadoop-ozone/cli-shell/src/main/java/org/apache/hadoop/ozone/shell/snapshot/SnapshotDiffHandler.java:
##########
@@ -82,6 +82,12 @@ public class SnapshotDiffHandler extends Handler {
defaultValue = "false")
private boolean cancel;
+ @CommandLine.Option(names = {"-r", "--get-report"},
+ description = "Get the snapshot diff report is available; " +
Review Comment:
The --get-report option description has a grammar issue ("Get the snapshot
diff report is available"). Consider changing it to "Get the snapshot diff
report if available" or similar to avoid confusing CLI help output.
```suggestion
description = "Get the snapshot diff report if available; " +
```
##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java:
##########
@@ -855,17 +857,56 @@ public SnapshotDiffResponse getSnapshotDiffReport(final
String volume,
fromSnapshot, toSnapshot, index, pageSize, forceFullDiff,
disableNativeDiff);
- // Check again to make sure that from and to snapshots are still active and
- // were not deleted in between response generation.
- // Ideally, snapshot diff and snapshot deletion should take an explicit
lock
- // to achieve the synchronization, but it would be complex and expensive.
- // To avoid the complexity, we just check that snapshots are active
- // before returning the response. It is like an optimistic lock to achieve
- // similar behaviour and make sure client gets consistent response.
validateSnapshotsExistAndActive(volume, bucket, fromSnapshot, toSnapshot);
return snapshotDiffReport;
}
+ public SnapshotDiffResponse getSnapshotDiffResponse(final String volume,
+ final String bucket,
+ final String
fromSnapshot,
+ final String toSnapshot,
+ final String token,
Review Comment:
The new report-only path (getSnapshotDiffResponse) no longer validates that
both snapshots exist and are ACTIVE (the deprecated getSnapshotDiffReport() did
optimistic validation before and after generating the response). Without this,
a report could be returned for deleted/inactive snapshots or while snapshots
are being deleted concurrently. Consider re-adding
validateSnapshotsExistAndActive(volume, bucket, fromSnapshot, toSnapshot)
before and after calling snapshotDiffManager.getSnapshotDiffReport().
##########
hadoop-ozone/cli-shell/src/main/java/org/apache/hadoop/ozone/shell/snapshot/SnapshotDiffHandler.java:
##########
@@ -109,15 +115,25 @@ protected void execute(OzoneClient client, OzoneAddress
address)
if (cancel) {
cancelSnapshotDiff(client.getObjectStore(), volumeName, bucketName);
- } else {
+ } else if (getReport) {
getSnapshotDiff(client.getObjectStore(), volumeName, bucketName);
+ } else {
+ submitSnapshotDiff(client.getObjectStore(), volumeName, bucketName);
Review Comment:
When --get-report is used, this code path ignores
forceFullDiff/disableNativeDiff; and when submitting (no --get-report),
token/page-size are ignored. Since these options are still accepted on the CLI,
it’s easy for users to think they’re taking effect when they aren’t. Consider
validating and rejecting incompatible option combinations (or emitting an
explicit warning).
##########
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocol/OzoneManagerProtocol.java:
##########
@@ -793,8 +794,11 @@ default ListSnapshotResponse listSnapshot(
* @param pageSize maximum entries returned to the report.
* @param forceFullDiff request to force full diff, skipping DAG optimization
* @return the difference report between two snapshots
+ * @deprecated Use {@link #snapshotDiff(String, String, String, String,
String, int, boolean, boolean, Boolean)}
+ * instead
Review Comment:
Javadoc `@deprecated` link references a snapshotDiff(...) signature that
does not exist (extra boolean/Boolean params). Update the `@link` to point to
the actual replacement APIs (eg, submitSnapshotDiff(...) for submission and
snapshotDiff(..., token, pageSize) for report retrieval) so Javadoc generation
and published docs are correct.
```suggestion
* @deprecated Use {@link #submitSnapshotDiff(String, String, String,
String, boolean, boolean)}
* to submit a snapshot diff job and
* {@link #snapshotDiff(String, String, String, String, String, int)} to
* retrieve the diff report instead.
```
##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/protocol/ClientProtocol.java:
##########
@@ -1247,8 +1248,11 @@ ListSnapshotResponse listSnapshot(
* @param pageSize maximum entries returned to the report.
* @param forceFullDiff request to force full diff, skipping DAG optimization
* @return the difference report between two snapshots
+ * @deprecated Use {@link #snapshotDiff(String, String, String, String,
String, int, boolean, boolean, boolean)}
+ * instead to submit a new snapshot diff job or just return the existing job
result.
Review Comment:
The `@deprecated` Javadoc link points to a non-existent snapshotDiff(...)
overload (extra boolean parameter). Update the `@link` to the actual
replacement APIs (submitSnapshotDiff for submission and the report-only
snapshotDiff overload) to avoid broken/misleading generated docs.
```suggestion
* @deprecated Use {@link #submitSnapshotDiff(String, String, String,
String, boolean, boolean)}
* to submit a new snapshot diff job, or
* {@link #snapshotDiff(String, String, String, String, String, int)} to
* return an existing job result.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]