mahsoodebrahim opened a new pull request, #18696:
URL: https://github.com/apache/hudi/pull/18696
Adds a Spark SQL stored procedure that performs a point-in-time table restore
to any instant on the active timeline, with an optional post-restore
file-existence audit. Unlike `rollback_to_savepoint`, no savepoint is required
at the target instant.

Centralizes the MDT pre-check that was inlined in `restoreToSavepoint` into a
new `BaseHoodieWriteClient.shouldDeleteMdtBeforeRestore` helper, and extends
`restoreToInstant` to invoke it. The helper also catches the
penultimate-compaction case (target at or before the second-most-recent MDT
compaction), which the previous `restoreToSavepoint` inline check missed.
IO/permission failures now surface as `HoodieException` instead of being
silently swallowed.

The audit returns a tri-state result (`PASSED` / `FAILED` / `INCONCLUSIVE`) so
transient cloud-storage timeouts are distinguishable from real audit failures;
an `audit_only` mode lets users re-audit a previously completed restore by
passing its `restore_instant_time`.
### Describe the issue this Pull Request addresses
OSS Hudi today has no CLI or SQL surface for arbitrary point-in-time table
restore. The existing options are:
- **`rollback_to_savepoint`** Spark SQL procedure / `savepoint rollback` CLI
command — restores to a savepoint, but requires that a savepoint already exist
at the target instant.
- **`rollback_to_instant_time`** Spark SQL procedure — rolls back a single
commit, not a full point-in-time restore.
- **`hudi-cli`'s `RestoresCommand`** — read-only inspection (`show restores`
/ `show restore`) of past restore operations.
- **`SparkRDDWriteClient.restoreToInstant(...)`** Java API — exists but has
no user-facing surface; users must write custom Java/Scala code to invoke it.
Restoring to an arbitrary commit on the active timeline therefore requires
either creating a savepoint first (often impossible — the issue is usually
discovered after the fact) or writing custom code against the write-client API.
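For context, the status-quo workaround via the write-client API looks roughly
like the sketch below. The table path, table name, and instant time are
placeholders, and it assumes a live `SparkSession` named `spark`; this is a
sketch of the existing Java API usage, not code from this PR.

```scala
// Rough sketch of the pre-existing workaround: calling the write-client API
// directly from Scala. Paths, table name and instant time are placeholders.
import org.apache.hudi.client.SparkRDDWriteClient
import org.apache.hudi.client.common.HoodieSparkEngineContext
import org.apache.hudi.config.HoodieWriteConfig
import org.apache.spark.api.java.JavaSparkContext

val engineContext = new HoodieSparkEngineContext(new JavaSparkContext(spark.sparkContext))
val writeConfig = HoodieWriteConfig.newBuilder()
  .withPath("s3://bucket/warehouse/my_table")
  .forTable("my_table")
  .build()
val client = new SparkRDDWriteClient(engineContext, writeConfig)
try {
  // Restore to the target instant; the flag asks the client to (re)initialize
  // the metadata table if necessary after the restore.
  client.restoreToInstant("20240101093000000", true)
} finally {
  client.close()
}
```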
This PR adds a `restore_to_instant` Spark SQL stored procedure so users can
perform point-in-time restores directly from SQL: `CALL
restore_to_instant(table => '...', instant_time => '...')`. It also includes an
optional post-restore audit that verifies all rolled-back files are absent from
storage — useful as a confidence check after restores against object stores
where eventual consistency could mask problems.
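A rough usage sketch of the three call shapes follows; the table name and
instant values are placeholders, and the parameter names mirror the changelog
below.

```scala
// Sketch only: placeholders for table name and instant times; assumes a
// SparkSession `spark` with the Hudi SQL extensions enabled.

// Restore only (default).
spark.sql("CALL restore_to_instant(table => 'my_table', instant_time => '20240101093000000')").show(false)

// Restore + audit: verify the rolled-back files are absent from storage afterwards.
spark.sql("CALL restore_to_instant(table => 'my_table', instant_time => '20240101093000000', audit_post_restore => true)").show(false)

// Audit only: re-audit a previously completed restore by its restore instant.
spark.sql("CALL restore_to_instant(table => 'my_table', audit_only => true, restore_instant_time => '20240105120000000')").show(false)
```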
Along the way, this PR also fixes a latent correctness bug in
`BaseHoodieWriteClient.restoreToSavepoint`: the existing inline MDT pre-check
only triggered an MDT delete when the savepoint was at or before the *oldest*
completed MDT compaction. It missed the case where the target falls after the
oldest compaction but at or before the penultimate (second-most-recent) one,
which would leave the MDT inconsistent during `finishRestore`. Both
`restoreToSavepoint` and the new `restoreToInstant` flow now share a single
pre-check helper that handles this correctly.
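To make the fixed rule concrete, it can be expressed as a small pure function
over instant timestamps. This is an illustration of the rule only, with
parameter names invented for the example; it is not the PR's actual
implementation, which is described under item 2 of the changelog below.

```scala
// Illustration of the pre-check rule, not the PR's implementation.
// `mdtCompactionTimes` stands for the MDT's completed compaction instants in
// ascending order; `mdtTimelineStart` is the first instant on the MDT timeline.
// Hudi instant times are fixed-width timestamps, so string order is chronological.
def shouldDeleteMdtBeforeRestoreSketch(targetInstant: String,
                                       mdtCompactionTimes: Seq[String],
                                       mdtTimelineStart: Option[String]): Boolean = {
  val beforeTimelineStart   = mdtTimelineStart.exists(targetInstant < _)
  val atOrBeforeOldest      = mdtCompactionTimes.headOption.exists(targetInstant <= _)
  // Penultimate = second-most-recent completed compaction (exists only with >= 2 compactions).
  val atOrBeforePenultimate = mdtCompactionTimes.dropRight(1).lastOption.exists(targetInstant <= _)
  beforeTimelineStart || atOrBeforeOldest || atOrBeforePenultimate
}
```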
---
### Summary and Changelog
**User-facing summary:** Adds the `restore_to_instant` Spark SQL stored
procedure for point-in-time table restore on any active-timeline instant, with
an optional audit mode for post-restore file verification.
**Detailed changelog:**
1. **New procedure: `restore_to_instant`**
(`hudi-spark-datasource/hudi-spark/.../RestoreToInstantProcedure.scala`)
- Parameters: `table` / `path`, `instant_time`, `enable_metadata`,
`rollback_parallelism`, `enable_consistency_guard`, `audit_post_restore`,
`audit_only`, `restore_instant_time`.
- Output columns: `restore_result` (Boolean), `start_restore_time`
(String — the restore operation's own timeline timestamp),
`time_taken_in_millis` (Long), `instants_rolled_back` (Long), `audit_result`
(String — see below).
- Three modes selected via parameter cross-validation:
- **Restore only** (default): performs the restore, returns metadata.
- **Restore + audit** (`audit_post_restore=true`): performs the
restore, then verifies all rolled-back files are absent from storage.
- **Audit only** (`audit_only=true`): skips the restore and audits a
previously completed restore identified by `restore_instant_time`. Useful for
re-running an audit days after the original restore — no need to remember the
target commit.
- The audit returns a tri-state `audit_result`: `PASSED` (all
expected-deleted files confirmed absent), `FAILED` (at least one file is still
present), or `INCONCLUSIVE` (no `Present` outcome but at least one file
existence check threw an `IOException` — re-run with `audit_only=true` to
retry). This keeps a transient cloud-storage timeout distinguishable from a
real audit failure.
2. **Centralized MDT pre-check helper**
(`hudi-client/hudi-client-common/.../BaseHoodieWriteClient.java`)
- New `protected boolean shouldDeleteMdtBeforeRestore(String
targetInstant)` method. Returns `true` when restoring to `targetInstant` would
leave the MDT inconsistent — specifically when the target is at or before the
MDT's penultimate completed compaction (≥2 compactions present), at or before
the oldest completed compaction, or before the MDT timeline start.
- `restoreToSavepoint` refactored to call the helper instead of inlining
the check. **Behavior change:** the previous inline check only handled the
oldest-compaction and timeline-start cases, and silently swallowed every
`Exception` (including permission-denied / IO failures) with the comment
"Metadata directory does not exist." The new helper now also catches the
penultimate-compaction case (a correctness fix), and IO/permission failures
surface as `HoodieException` instead of being swallowed. A missing MDT still
returns `false` silently, as before.
- `restoreToInstant` extended to invoke the helper before scheduling the
restore plan, so callers reaching restore via `restoreToInstant` directly
(including the new procedure) get the same MDT-integrity guarantee. The check
is gated on `initialMetadataTableIfNecessary` so callers that have explicitly
opted out of MDT integration are not affected.
3. **Procedure registration**
(`hudi-spark-datasource/hudi-spark/.../HoodieProcedures.scala`): one-line
addition.
4. **Tests added:**
- `TestRestoreProcedure.scala` (10 tests): basic CoW, basic MoR, `path`
parameter, `audit_post_restore`, `audit_only` mode, `audit_only` with
non-existent restore instant, plus 4 cross-validation guards (`audit_only=true`
requires `restore_instant_time`; `audit_only=false` rejects
`restore_instant_time`; `audit_only=true` rejects `instant_time`;
`audit_only=false` requires `instant_time`).
- `TestSavepointRestoreCopyOnWrite.java` extended (3 tests):
`testRestoreToInstantSkipsMdtCheckWhenMetadataDisabled` (guard verification),
`testRestoreToInstantDeletesMdtWhenTargetIsBeforePenultimateCompaction`
(penultimate-trigger coverage), `testRestoreToSavepointStillWorksAfterRefactor`
(regression for the existing `restoreToSavepoint` path).
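For illustration, one of the cross-validation guard tests could look like the
sketch below. The base-class helpers (`withTempDir`, `generateTableName`) and
the asserted behavior are assumptions modeled on Hudi's existing procedure
test suites; the real tests live in `TestRestoreProcedure.scala`.

```scala
// Hedged sketch of one parameter cross-validation guard, not the actual test code.
test("audit_only requires restore_instant_time") {
  withTempDir { tmp =>
    val tableName = generateTableName
    spark.sql(
      s"""create table $tableName (id int, name string, price double, ts long)
         |using hudi
         |location '${tmp.getCanonicalPath}/$tableName'
         |tblproperties (primaryKey = 'id', preCombineField = 'ts')
         |""".stripMargin)
    // audit_only without restore_instant_time should be rejected by parameter
    // cross-validation rather than silently falling back to a restore.
    val e = intercept[Exception] {
      spark.sql(s"call restore_to_instant(table => '$tableName', audit_only => true)").collect()
    }
    assert(e.getMessage != null && e.getMessage.nonEmpty)
  }
}
```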
---
### Impact
**Public API additions:**
- New stored procedure `restore_to_instant` (callable from Spark SQL).
- New protected method
`BaseHoodieWriteClient.shouldDeleteMdtBeforeRestore(String)` — visible to
subclasses of `BaseHoodieWriteClient` only.
**Behavior changes (user-visible):**
- `restoreToSavepoint` is now slightly stricter: it deletes the MDT
pre-emptively when the savepoint target is at or before the **penultimate**
completed MDT compaction, not just the oldest. This is a correctness fix — the
previous behavior left the MDT inconsistent during `finishRestore` in that
range.
- `restoreToSavepoint` now propagates `HoodieException` if the MDT pre-check
fails with an `IOException` (e.g. permission denied, network failure).
Previously, every `Exception` from the MDT inspection was silently swallowed
under the assumption that the MDT directory was missing. Callers that
previously observed the silent swallow on broken-MDT scenarios will now see a
clear failure.
**Performance:** No measurable impact on the restore path. The helper makes
one storage `exists` check and at most one `HoodieTableMetaClient` build for
the MDT path, mirroring what the inline check already did.
---
### Risk Level
**Medium.**
The risk surface is the change to `BaseHoodieWriteClient.restoreToSavepoint`
semantics: a long-standing code path used across all engines now (a) triggers
an extra MDT delete in the penultimate-compaction edge case and (b) propagates
IO failures it previously swallowed. Any caller that depended on the broader
silent-swallow will see new exceptions; any caller that restored to a savepoint
between the oldest and penultimate MDT compactions will now incur an MDT
rebuild. Both changes are intentional correctness fixes, but they are visible
to existing users of `restoreToSavepoint`.
**Verification performed:**
- `mvn compile` clean for both `hudi-client/hudi-client-common` and
`hudi-spark-datasource/hudi-spark`, including all scalastyle import-order
checks.
- `TestRestoreProcedure`: 10 / 10 tests passed (1 m 3 s).
- Procedure regression suite (`TestRestoreProcedure` +
`TestSavepointsProcedure` + `TestRunRollbackInflightTableServiceProcedure`): 19
/ 19 tests passed across 6 suites (1 m 55 s) — confirms the
`restoreToSavepoint` refactor preserves all existing savepoint-procedure
behavior.
- New write-client integration tests in
`TestSavepointRestoreCopyOnWrite.java` exercise the helper guard, the
penultimate-compaction trigger, and a regression case for `restoreToSavepoint`.
CI will exercise these directly.
---
### Documentation Update
The Hudi website's stored-procedures page needs a new entry for
`restore_to_instant` documenting its parameters, output columns, and the three
modes (restore-only / restore-with-audit / audit-only), including the
`audit_result` tri-state semantics. Will follow up with a website PR per the
[instruction](https://hudi.apache.org/contribute/developer-setup#website) once
this PR's API surface is approved.
No new configs added; no existing config defaults changed.
---
### Contributor's checklist
- [x] Read through [contributor's
guide](https://hudi.apache.org/contribute/how-to-contribute)
- [x] Enough context is provided in the sections above
- [x] Adequate tests were added if applicable