XuQianJin-Stars opened a new pull request, #3286:
URL: https://github.com/apache/fluss/pull/3286
TieringSourceEnumerator now acquires a KV snapshot lease for all
TieringSnapshotSplits before they are assigned to readers, and releases the
lease when the table finishes or fails tiering, or when a reader failover
returns the splits. A best-effort `dropLease` is also performed on enumerator
close. This prevents the Fluss server from cleaning up snapshots that the
tiering job still depends on.
One lease id per tiering job (UUID-based) is reused across tables and
persisted into `TieringSourceEnumeratorState` so that it survives enumerator
restore instead of leaking orphan leases. The lease uses a fixed 1-day duration
that is implicitly renewed by every `acquireSnapshots` call, and
`UnsupportedVersionException` from older Fluss servers is downgraded to a
warning to keep backward compatibility.
### Purpose
Linked issue: close #2898
Before this change, the tiering job read Fluss KV snapshots without
holding any lease on them. A long-running tiering job could therefore race with
server-side snapshot GC: the server might clean up a snapshot that the tiering
`SourceReader` is still consuming or is about to consume, causing tiering
failures or data loss on the lake side.
This PR makes `TieringSource` hold a KV snapshot lease for the full
lifecycle of each snapshot split it hands out, so that the Fluss server will
not reclaim those snapshots while tiering is in progress.
### Brief change log
- `TieringSourceEnumerator`
  - Generate one `kvSnapshotLeaseId` per tiering job (UUID-based) and reuse
    it across all tables.
  - Before assigning any `TieringSnapshotSplit` to a reader, call
    `acquireSnapshots(leaseId, snapshots, 1 day)` on the admin / gateway client to
    acquire a lease covering all snapshot splits of the table.
  - Track in-flight leased snapshots per table; release the lease
    (`releaseSnapshots`) when the table finishes tiering, fails tiering, or when a
    reader failover returns the splits back to the enumerator.
  - On enumerator `close()`, best-effort `dropLease(leaseId)` to release
    everything still held by this job.
  - Downgrade `UnsupportedVersionException` (old server) to a warning log,
    so the tiering job keeps working against older Fluss servers without the lease
    API.
- `TieringSourceEnumeratorState` + `TieringSourceEnumeratorStateSerializer`
  - Persist `kvSnapshotLeaseId` into the enumerator checkpoint state (new
    serializer version, backward compatible with the previous version).
- `TieringSource`
  - On `restoreEnumerator`, reuse the persisted `kvSnapshotLeaseId` from the
    checkpoint so the recovered enumerator does not generate a new UUID and leak
    the previous lease.
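
The lease lifecycle described in the change log can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `LeaseClient`, `InMemoryLeaseClient`, and `LeaseTracker` are hypothetical names, snapshots are reduced to plain `Long` ids, and `UnsupportedOperationException` stands in for Fluss's `UnsupportedVersionException`. Only the three operation names (`acquireSnapshots`, `releaseSnapshots`, `dropLease`) and the 1-day duration come from the description above.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class LeaseLifecycleSketch {

    /** Hypothetical stand-in for the lease calls named in this PR; not the real Fluss client. */
    interface LeaseClient {
        void acquireSnapshots(String leaseId, Set<Long> snapshotIds, Duration duration);
        void releaseSnapshots(String leaseId, Set<Long> snapshotIds);
        void dropLease(String leaseId);
    }

    /** In-memory fake that records which snapshots are currently leased. */
    static final class InMemoryLeaseClient implements LeaseClient {
        final Set<Long> held = new HashSet<>();
        boolean supported = true; // set to false to simulate an older server

        @Override
        public void acquireSnapshots(String leaseId, Set<Long> snapshotIds, Duration duration) {
            if (!supported) {
                // Stand-in for UnsupportedVersionException from an older server.
                throw new UnsupportedOperationException("lease API not available");
            }
            held.addAll(snapshotIds);
        }

        @Override
        public void releaseSnapshots(String leaseId, Set<Long> snapshotIds) {
            held.removeAll(snapshotIds);
        }

        @Override
        public void dropLease(String leaseId) {
            held.clear();
        }
    }

    /** Tracks in-flight leased snapshots per table under a single job-wide lease id. */
    static final class LeaseTracker implements AutoCloseable {
        static final Duration LEASE_DURATION = Duration.ofDays(1);

        final List<String> warnings = new ArrayList<>();
        private final String leaseId;
        private final LeaseClient client;
        private final Map<Long, Set<Long>> leasedByTable = new HashMap<>();

        LeaseTracker(String leaseId, LeaseClient client) {
            this.leaseId = leaseId;
            this.client = client;
        }

        /** Call before handing the snapshot splits of a table to a reader. */
        boolean acquireBeforeAssign(long tableId, Set<Long> snapshotIds) {
            try {
                client.acquireSnapshots(leaseId, snapshotIds, LEASE_DURATION);
            } catch (UnsupportedOperationException e) {
                // Downgrade to a warning so tiering continues without a lease.
                warnings.add("Lease API unsupported, continuing without a lease: " + e.getMessage());
                return false;
            }
            leasedByTable.computeIfAbsent(tableId, id -> new HashSet<>()).addAll(snapshotIds);
            return true;
        }

        /** Call when a table finishes or fails tiering, or when its splits are returned. */
        void releaseTable(long tableId) {
            Set<Long> leased = leasedByTable.remove(tableId);
            if (leased != null && !leased.isEmpty()) {
                client.releaseSnapshots(leaseId, leased);
            }
        }

        /** Best effort: a failure to drop the lease must not fail shutdown. */
        @Override
        public void close() {
            try {
                client.dropLease(leaseId);
            } catch (RuntimeException e) {
                warnings.add("Failed to drop lease " + leaseId + ": " + e.getMessage());
            }
        }
    }
}
```

In this sketch a single `releaseTable` path serves finish, failure, and reader failover, which keeps the per-table bookkeeping in one place. The real enumerator may distinguish these cases.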
### Tests
- `TieringSourceEnumeratorTest`
  - New cases covering: lease is acquired before snapshot splits are
    assigned; lease is released on table finish / fail / reader failover;
    `dropLease` is invoked on enumerator close; enumerator works gracefully when
    the server returns `UnsupportedVersionException`.
- `TieringSourceEnumeratorStateSerializerTest`
  - Round-trip tests for the new `kvSnapshotLeaseId` field, plus a
    backward-compatibility case that deserializes a state written by the previous
    serializer version.
- `mvn clean verify` passes locally for the affected modules.
### API and Format
- No public user-facing API change.
- `TieringSourceEnumeratorState` checkpoint format is extended with a new
`kvSnapshotLeaseId` field. The serializer version is bumped and older
checkpoints remain readable (the field defaults to a freshly generated UUID on
restore from old state).
- No storage / wire format change on the Fluss server side; this PR only
consumes the existing `acquireSnapshots` / `releaseSnapshots` / `dropLease`
admin APIs.
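
The backward-compatible checkpoint format can be illustrated with a version-dispatching serializer. The sketch below is an assumption about the general shape, not the actual `TieringSourceEnumeratorStateSerializer`: the class name, the version constants (1 and 2), and the choice to encode the lease id with `writeUTF` are all hypothetical, and the state is reduced to the lease id alone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.UUID;

final class VersionedLeaseStateSketch {
    // Hypothetical version numbers; the real serializer's values may differ.
    static final int VERSION_WITHOUT_LEASE = 1;
    static final int VERSION_WITH_LEASE = 2;
    static final int CURRENT_VERSION = VERSION_WITH_LEASE;

    /** Writes the lease id in the current (new) format. */
    static byte[] serialize(String kvSnapshotLeaseId) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(kvSnapshotLeaseId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads state written by either version and returns the lease id to use. */
    static String deserialize(int version, byte[] serialized) {
        if (version == VERSION_WITHOUT_LEASE) {
            // Old checkpoints carry no lease id, so fall back to a fresh UUID.
            return UUID.randomUUID().toString();
        }
        if (version != VERSION_WITH_LEASE) {
            throw new IllegalArgumentException("Unknown state version: " + version);
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(serialized))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A state written with `serialize` round-trips to the same lease id, while a payload tagged with the older version yields a newly generated id, matching the default described above.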
### Documentation
- No new user-facing feature or configuration option is introduced, so no
documentation update is required.