laskoviymishka opened a new issue, #1178:
URL: https://github.com/apache/iceberg-go/issues/1178
Bring iceberg-go to parity with Java and Python on REST server-side scan
planning. The endpoints were added to the Iceberg REST OpenAPI spec by
apache/iceberg#9695 (merged 2024-09-10) and shipped in Java through 2025
(apache/iceberg#13004, #13400, #14480, #14660, #14822, #15184, #15863, #16024,
#16197).
Today iceberg-go has zero support: no types, no client methods, no scanner
wiring, no capability discovery. `(*Scan).PlanFiles` reads manifests locally
regardless of what the catalog can offer.
## What this unlocks
When this lands, Go consumers of an Iceberg REST catalog will be able to:
- Scan tables of any size without downloading manifests locally
- Run stateless from Lambda / Cloud Run / short-lived k8s pods (no manifest
cache)
- Enforce catalog-side governance (RLS, masking, time-travel restrictions)
on Go clients
- Receive per-scan vended `StorageCredentials` with plan-scoped TTLs
- Do cheap incremental scans for CDC consumers via `start-snapshot-id` /
`end-snapshot-id`
- Drive the iceberg-go CLI against production-scale tables (`iceberg files`,
`partition-stats`, `clean-orphan-files`, `expire-snapshots`, `compact`)
## Scope of this epic
This is a client-only effort. Non-goals:
- Implementing the server side of the spec (the REST catalog server lives in
Java).
- A distributed worker runtime. We expose `FetchScanTasks` so external
engines can fan out, but we don't build the engine.
- Reworking the local planning path. Default mode stays `local`; remote
planning is strictly opt-in, no breaking change.
## Correctness contract
The acceptance test for the whole effort is the **local-vs-remote parity
test**: the same scan (same filter, same table, same snapshot) under
`scan-planning-mode=local` and `scan-planning-mode=remote` must produce the
same `FileScanTask` set (paths, ranges, deletes, residuals). If the sets differ,
the most likely culprit is the expression JSON codec, and the failure mode is
silent result corruption rather than a loud error. That codec is the
highest-risk piece of the project.
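
One way the parity check could be expressed, as a rough sketch only: `scanTask` below is a hypothetical, simplified stand-in for `FileScanTask` (the real type carries richer data files, delete files, and residual expressions), and `diffTasks` is an illustrative helper name, not an existing iceberg-go function. The point is that the comparison should ignore task ordering but be strict about every field the contract lists.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// scanTask is a hypothetical, simplified stand-in for FileScanTask.
type scanTask struct {
	Path        string
	Start       int64
	Length      int64
	DeleteFiles []string
	Residual    string // residual filter rendered as a string (assumption)
}

// key builds a canonical string so tasks compare independently of ordering.
func (t scanTask) key() string {
	deletes := append([]string(nil), t.DeleteFiles...)
	sort.Strings(deletes)
	return fmt.Sprintf("%s|%d|%d|%s|%s",
		t.Path, t.Start, t.Length, strings.Join(deletes, ","), t.Residual)
}

// diffTasks returns the task keys found only in the local plan and only in
// the remote plan. Both slices are empty when the two plans are at parity.
func diffTasks(local, remote []scanTask) (onlyLocal, onlyRemote []string) {
	counts := make(map[string]int)
	for _, t := range local {
		counts[t.key()]++
	}
	for _, t := range remote {
		counts[t.key()]--
	}
	for k, n := range counts {
		for ; n > 0; n-- {
			onlyLocal = append(onlyLocal, k)
		}
		for ; n < 0; n++ {
			onlyRemote = append(onlyRemote, k)
		}
	}
	sort.Strings(onlyLocal)
	sort.Strings(onlyRemote)
	return onlyLocal, onlyRemote
}
```

A parity test would plan the same scan under both modes, call `diffTasks`, and fail with both slices printed if either is non-empty, so a codec mismatch shows up as a concrete list of diverging tasks.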
## Proposed decomposition
Decomposable across PRs. The proposed breakdown below is **eight phases plus
a parked follow-up**, sequenced so each lands behind a stable interface and
ships independently. This is a starting point for the implementer to refine —
please file sub-issues per phase as you pick up the work, adjusting scope and
dependencies as the design firms up. The nominal order is 0 → 1 → 2 → 3 → 4 → 5
→ 6 → 7, but Phase 2 (expression codec) has no dependency on Phase 1, and Phase 6
(fake server) can be developed in parallel with Phases 4/5.
- [ ] **Phase 0 — Capability discovery.** Extend `configResponse` in
`catalog/rest` to decode `endpoints []string` from `GET /v1/config`; define
endpoint constants matching the spec strings (`POST
/v1/{prefix}/namespaces/{ns}/tables/{t}/plan` etc.); expose a typed capability
check on `*rest.Catalog`. Pure additive, no behavior change. Good first issue.
This is the discovery surface Phase 5's `auto` mode reads.
- [ ] **Phase 1 — Wire types.** Go types for the four planning payloads:
`PlanTableScanRequest` (snapshot-id XOR start/end-snapshot-id, filter, select,
case-sensitivity, use-snapshot-schema, metrics opts), `PlanTableScanResponse`
(discriminated on `status`: completed/submitted), `FetchPlanningResultResponse`
(completed/working/submitted/cancelled/failed),
`FetchScanTasksRequest`/`Response`, `PlanStatus` enum, `PlanTask` opaque token,
`CompletedPlanningResult` with `FileScanTask`s + optional `StorageCredentials`.
Round-trip + golden JSON tests against fixtures from
`rest-catalog-open-api.yaml`. XOR validation is the only logic.
- [ ] **Phase 2 — Expression JSON codec.** JSON codec for
`BooleanExpression` matching the REST spec: `and`/`or`/`not`, all
unary/binary/set predicates, column-name + nested-field-ID references, literal
encoding for every primitive type (decimal w/ scale, timestamp vs timestamptz,
UUID, fixed/var binary). Respect `case-sensitivity`. **Highest-risk phase** —
cross-check against Java's `ExpressionParser` byte-for-byte. Get this wrong and
filters silently differ between local and remote scans. No dependency on Phase
1, can run in parallel.
- [ ] **Phase 3 — REST client methods + poller.** `PlanTableScan`,
`FetchPlanningResult`, `CancelPlanning`, `FetchScanTasks` on `*rest.Catalog`,
wired via existing `doPost`/`doGet`. A `WaitForPlan` helper polling with
jittered exponential backoff + configurable timeout (mirror Java
apache/iceberg#15863). Error mapping: 404-during-poll → sentinel
`ErrPlanExpired` so callers can re-plan (apache/iceberg#16024, #16197); 400/403
→ typed errors; 503 → retry; ctx cancel → call `CancelPlanning` to free server
resources, then return `context.Canceled`. Cancelling the server-side plan when
the caller's context is cancelled is the behavior that must match Java exactly,
because plans leaked on the server are visible to catalog vendors.
- [ ] **Phase 4 — `ScanPlanner` interface + Table plumbing.** A
`ScanPlanner` seam in `table/` so non-catalog code and tests can supply a
planner without importing the REST package. `PlanResult` wraps inline
`FileScanTask`s + opaque `PlanTask` tokens + plan-id (for cancel) + optional
plan-scoped `StorageCredentials`. Thread the planner through `Table` (parallel
to `CatalogIO`), wired when `rest.Catalog` loads the table; non-REST catalogs
leave it nil. Public-API design moment — worth a design discussion in the PR.
Open question: planner on `Table` vs handed into `Scan` at construction
(recommend on `Table` so existing `t.Scan(...)` call sites don't change).
- [ ] **Phase 5 — Scanner delegation.** Add scan option `scan-planning-mode`
with `local` (default), `remote` (require planner+capability, fail loud if
absent), `auto` (remote if available, else local). Branch in
`(*Scan).PlanFiles`; on remote, call `planner.PlanTableScan`, follow the
sync/async branch via `WaitForPlan`, materialize `PlanTask` tokens via
`FetchScanTasks` in batches, use plan-scoped `StorageCredentials` for
subsequent FileIO. **The value-delivery moment** and where the correctness
parity test lives. Small in code, huge in surface area.
- [ ] **Phase 6 — Fake server + integration tests.** In-process fake at
`catalog/rest/internal/planfake/` implementing the four endpoints over an
in-memory table, running the same local planning code server-side (so parity
tests are meaningful); supports sync/async/fanout/cancel via test switches.
Plus an integration suite gated by `RUN_INTEGRATION_TESTS=1` against the Java
`iceberg-rest-fixture` Docker image for interop confidence. Can be developed in
parallel with Phases 4/5.
- [ ] **Phase 7 — Hardening + docs.** Retries with jittered backoff,
OpenTelemetry spans + metrics (plan requests, expirations, fallback-to-local),
README "Plan Scan" row flipped to "Local + Remote", new "REST Scan Planning"
section with capability matrix + config snippet, migration note (opt-in, no
breaking change). Update v2 (#829) and v3 (#589) tracking issues. Closes the
epic. Spin out any polish item that deserves its own follow-up rather than
bundling.
- [ ] **Phase 8 (parked) — Distributed worker codec.** Public wire codec
(`MarshalJSON`/`UnmarshalJSON` or protobuf) for `PlanTask` + `FileScanTask` so
engines embedding iceberg-go can fan plan tokens out to workers. Only valuable
to query engines that distribute plan-tasks; standalone Go consumers call
`FetchScanTasks` inline (covered by Phase 5). Coordinate with
apache/iceberg-go#1075 (`FileScanTask` codec for distributed compaction) to
avoid duplicate work. Don't ship until an in-tree consumer needs it.
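
To make the Phase 3 polling and cancellation behavior concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `planClient` interface, the `waitForPlan` signature, the `errNotFound`/`errUnavailable` sentinels standing in for HTTP 404/503, and the `fakeClient` test double. It is not the Java implementation and not a proposed final API. It shows full-jitter exponential backoff, 404 mapped to `ErrPlanExpired`, retry on 503, and a best-effort `CancelPlanning` call once the context is done.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// planClient is a hypothetical seam over two of the Phase 3 REST calls.
type planClient interface {
	FetchPlanningResult(ctx context.Context, planID string) (status string, err error)
	CancelPlanning(ctx context.Context, planID string) error
}

// Illustrative sentinels; a real client would map HTTP 404 and 503 onto these.
var (
	ErrPlanExpired = errors.New("plan expired, re-plan required")
	errNotFound    = errors.New("404 not found")
	errUnavailable = errors.New("503 service unavailable")
)

// abandon frees the server-side plan once ctx is done, then reports why.
func abandon(ctx context.Context, c planClient, planID string) error {
	// ctx is already done, so the cancel call needs its own short-lived context.
	cctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	_ = c.CancelPlanning(cctx, planID) // best effort
	return ctx.Err()
}

// waitForPlan polls until the plan completes, fails, expires, or ctx ends.
func waitForPlan(ctx context.Context, c planClient, planID string, base, maxDelay time.Duration) error {
	delay := base
	for {
		status, err := c.FetchPlanningResult(ctx, planID)
		switch {
		case err == nil && status == "completed":
			return nil
		case ctx.Err() != nil:
			return abandon(ctx, c, planID)
		case errors.Is(err, errNotFound):
			return ErrPlanExpired
		case errors.Is(err, errUnavailable):
			// Transient: leave the switch and back off before retrying.
		case err != nil:
			return err
		case status == "failed" || status == "cancelled":
			return fmt.Errorf("planning ended with status %q", status)
		}
		// Full jitter: sleep a random duration in [0, delay].
		timer := time.NewTimer(time.Duration(rand.Int63n(int64(delay) + 1)))
		select {
		case <-ctx.Done():
			timer.Stop()
			return abandon(ctx, c, planID)
		case <-timer.C:
		}
		delay *= 2
		if delay > maxDelay {
			delay = maxDelay
		}
	}
}

// poll is one scripted outcome for fakeClient.
type poll struct {
	status string
	err    error
}

// fakeClient replays a script, then reports "submitted" forever.
type fakeClient struct {
	script    []poll
	cancelled bool
}

func (f *fakeClient) FetchPlanningResult(context.Context, string) (string, error) {
	if len(f.script) == 0 {
		return "submitted", nil
	}
	p := f.script[0]
	f.script = f.script[1:]
	return p.status, p.err
}

func (f *fakeClient) CancelPlanning(context.Context, string) error {
	f.cancelled = true
	return nil
}

// pollWithTimeout shows the configurable timeout expressed as a context deadline.
func pollWithTimeout(c planClient, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return waitForPlan(ctx, c, "plan-1", time.Millisecond, 8*time.Millisecond)
}
```

In this sketch the timeout is just a context deadline, so an expired deadline takes the same cancel-then-return path as an explicit cancel and surfaces as `ctx.Err()`. Whether a timeout should also cancel the server-side plan is a design choice the Phase 3 PR would need to confirm against Java.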
## Open questions to settle before Phase 1
1. **Scan option name.** Java settled on `RequiresRemoteScanPlanning` →
`SupportsDistributedScanPlanning`. Pick a Go-idiomatic toggle name; match Spark
wiring where sensible.
2. **Where `ScanPlanner` lives.** `table/` keeps it cohesive with `Scan` and
lets tests inject a fake; `catalog/` puts it next to the catalog interfaces.
Leaning `table/`.
3. **Default mode.** Start at `local` for back-compat; revisit `auto` once
interop is proven against the Java fixture.
4. **Snapshot/schema semantics.** The spec lets clients pin
`use-snapshot-schema`; make sure local-vs-remote parity covers schema-evolution
edge cases.
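
For question 3, and for how Phase 0's discovery feeds Phase 5's mode switch, a small sketch may help the discussion. All names here (`planningMode`, `resolveMode`, `supports`, `errRemoteUnavailable`) are hypothetical, the `configResponse` struct shows only the new `endpoints` field, and the endpoint constant copies the abbreviated string quoted in Phase 0, so the exact spec string may differ.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Endpoint string as abbreviated in Phase 0 of this issue (assumption: the
// spec's exact placeholder names may differ).
const endpointPlanTableScan = "POST /v1/{prefix}/namespaces/{ns}/tables/{t}/plan"

// configResponse mirrors only the field Phase 0 adds; other fields are omitted.
type configResponse struct {
	Endpoints []string `json:"endpoints"`
}

// supports reports whether the catalog advertised the given endpoint.
func (c configResponse) supports(endpoint string) bool {
	for _, e := range c.Endpoints {
		if e == endpoint {
			return true
		}
	}
	return false
}

// planningMode models the proposed scan-planning-mode option.
type planningMode string

const (
	modeLocal  planningMode = "local"
	modeRemote planningMode = "remote"
	modeAuto   planningMode = "auto"
)

var errRemoteUnavailable = errors.New("remote scan planning requested but not available")

// resolveMode returns true when the scan should be planned remotely.
// An unset mode is treated as local, matching the proposed default.
func resolveMode(mode planningMode, hasPlanner bool, cfg configResponse) (bool, error) {
	available := hasPlanner && cfg.supports(endpointPlanTableScan)
	switch mode {
	case "", modeLocal:
		return false, nil
	case modeAuto:
		return available, nil
	case modeRemote:
		if !available {
			return false, errRemoteUnavailable // fail loud, never fall back silently
		}
		return true, nil
	default:
		return false, fmt.Errorf("unknown scan-planning-mode %q", mode)
	}
}

// parseConfig decodes the body of GET /v1/config.
func parseConfig(body []byte) (configResponse, error) {
	var cfg configResponse
	err := json.Unmarshal(body, &cfg)
	return cfg, err
}
```

Keeping the decision in one pure function like this would make it easy to table-test every mode against every capability combination before touching `(*Scan).PlanFiles`.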
## References
- Spec: apache/iceberg#9695 (merged 2024-09-10)
- Java umbrella + follow-ups: apache/iceberg#11180, #13004, #13400, #14480,
#14660, #14822, #15184, #15863, #16024, #16197
- Related iceberg-go work: #1075 (FileScanTask wire codec), #829 (v2
tracking), #589 (v3 tracking)
- Overview:
https://medium.com/data-engineering-with-dremio/iceberg-rest-catalog-overview-7-scan-planning-4b4bf2a46e4d