laskoviymishka commented on code in PR #1213:
URL: https://github.com/apache/iceberg-go/pull/1213#discussion_r3430320344

##########
expression_json.go:
##########
@@ -0,0 +1,42 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// This file is a PROPOSED public API surface for REST scan-planning
+// expression JSON (apache/iceberg-go#1178). The bodies are intentionally
+// unimplemented; the file exists so the API shape can be reviewed.
+//
+// The codec lives in the root iceberg package because expression internals
+// are defined here and are not exported. Compatibility with Java's
+// ExpressionParser is correctness-critical and every encoding must be
+// confirmed against checked-in Java golden fixtures (e.g. AlwaysTrue/False
+// serialize as the bare JSON booleans `true`/`false`, not objects).
+
+package iceberg
+
+// MarshalExpressionJSON serializes a boolean expression to the JSON format
+// produced by Java's ExpressionParser, for use as a REST scan-planning filter.
+func MarshalExpressionJSON(expr BooleanExpression) ([]byte, error) {

Review Comment:
   The "AlwaysTrue/False serialize as bare `true`/`false`" note matches Java's `ExpressionParser` output, but it diverges from the REST OpenAPI `Expression` schema, where `TrueExpression`/`FalseExpression` are objects (`{"type":"true"}`).
   A server validating strictly against the spec would reject a bare boolean filter. I'd make this an explicit design decision in the doc comment rather than an incidental example: we emit Java's wire format (bare booleans), servers must accept it, and `UnmarshalExpressionJSON` accepts both the bare form and the `{"type":...}` form. Worth settling here since the golden fixtures get pinned to whichever we pick.

##########
catalog/rest/scan_planning.go:
##########
@@ -0,0 +1,205 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// This file is a PROPOSED public API surface for REST server-side scan
+// planning (apache/iceberg-go#1178). The bodies are intentionally
+// unimplemented; the file exists so the REST surface can be reviewed as Go.
+// Endpoint capability discovery (Endpoint, SupportsEndpoint) lands separately
+// in the Phase 0 PR and is intentionally not redeclared here.
+
+package rest
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"time"
+
+	"github.com/apache/iceberg-go/table"
+)
+
+// Compile-time proof that the REST catalog satisfies the table planner seam.
+var _ table.ScanPlanner = (*Catalog)(nil)
+
+// ErrPlanExpired is returned when polling a plan-id the server no longer knows
+// about (HTTP 404 while polling), distinct from a table-not-found 404.
+var ErrPlanExpired = fmt.Errorf("%w: scan plan expired", ErrRESTError)
+
+// --- Capability gating (Open Question 2) ------------------------------------
+//
+// A single capability check is too coarse: requiring all four endpoints falls
+// back to local against sync-only servers, while requiring only the plan
+// endpoint false-positives, because planTableScan can return `submitted` or
+// `plan-tasks` that need the poll/fetch endpoints. The split below lets `auto`
+// use a sync-only server while reserving the async/fanout path for servers
+// that advertise everything.
+
+// SupportsPlanTableScan reports whether the server advertised the synchronous
+// plan endpoint.
+func (c *Catalog) SupportsPlanTableScan() bool {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// SupportsFullRemoteScanPlanning reports whether the server advertised all four
+// scan-planning endpoints (plan, fetch-result, cancel, fetch-tasks).
+func (c *Catalog) SupportsFullRemoteScanPlanning() bool {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// --- table.ScanPlanner implementation ---------------------------------------
+
+// SupportsRemoteScanPlanning reports whether this catalog can complete a remote
+// plan end-to-end; backed by the split capability checks above.
+func (c *Catalog) SupportsRemoteScanPlanning() bool {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// PlanFiles plans a scan server-side and returns tasks (and, optionally, a
+// plan-scoped FileIO) for the table to read.
+func (c *Catalog) PlanFiles(ctx context.Context, req table.ScanPlanningRequest) (table.ScanPlanningResult, error) {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// --- Low-level client methods -----------------------------------------------
+
+// PlanTableScan submits a scan plan. The result is either completed inline,
+// submitted (returns a plan-id to poll), or failed.
+func (c *Catalog) PlanTableScan(ctx context.Context, ident table.Identifier, req PlanTableScanRequest) (PlanTableScanResponse, error) {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// FetchPlanningResult polls a previously submitted plan.
+func (c *Catalog) FetchPlanningResult(ctx context.Context, ident table.Identifier, planID string) (FetchPlanningResultResponse, error) {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// CancelPlanning cancels a server-side plan. Callers should cancel on context
+// cancellation using a detached context with a short timeout.
+func (c *Catalog) CancelPlanning(ctx context.Context, ident table.Identifier, planID string) error {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// FetchScanTasks fetches the scan tasks for a plan-task handle returned by a
+// completed plan.
+func (c *Catalog) FetchScanTasks(ctx context.Context, ident table.Identifier, req FetchScanTasksRequest) (FetchScanTasksResponse, error) {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// WaitForPlan polls a submitted plan to completion using jittered backoff,
+// cancelling the server-side plan if the context is cancelled. It returns an
+// error if the plan is still submitted after the wait, cancelled, failed, or
+// expired.
+func (c *Catalog) WaitForPlan(ctx context.Context, ident table.Identifier, planID string, opts WaitForPlanOptions) (CompletedPlanningResult, error) {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// --- Wire types (sketch) ----------------------------------------------------
+//
+// Content-file, delete-file, and residual decoding lands with the scan-task
+// decoder PR; these sketch the request/response envelopes so the client
+// surface compiles and reads.
+
+// PlanStatus is the status of a server-side plan.
+type PlanStatus string
+
+const (
+	PlanStatusCompleted PlanStatus = "completed"
+	PlanStatusSubmitted PlanStatus = "submitted"
+	PlanStatusCancelled PlanStatus = "cancelled"
+	PlanStatusFailed    PlanStatus = "failed"
+)
+
+// PlanningError is the ErrorModel payload carried by a failed planning result.
+type PlanningError struct {
+	Message string   `json:"message"`
+	Type    string   `json:"type"`
+	Code    int      `json:"code"`
+	Stack   []string `json:"stack,omitempty"`
+}
+
+// ScanTasks carries the task payload shared by completed planning responses and
+// fetchScanTasks responses. Task/delete payload decoding lands with the
+// scan-task decoder PR.
+type ScanTasks struct {
+	PlanTasks     []string        `json:"plan-tasks,omitempty"`
+	FileScanTasks json.RawMessage `json:"file-scan-tasks,omitempty"`
+	DeleteFiles   json.RawMessage `json:"delete-files,omitempty"`
+}
+
+// CompletedPlanningResult is the completed arm of the planning-result union.
+// PlanID is populated only by the initial planTableScan response's
+// CompletedPlanningWithIDResult arm.
+type CompletedPlanningResult struct {
+	Status PlanStatus `json:"status"`
+	PlanID *string    `json:"plan-id,omitempty"`

Review Comment:
   The spec's `CompletedPlanningWithIDResult` makes `plan-id` required for a completed response from `planTableScan`, and without it `CancelPlanning` can never be called, so we'd leak server-side plan state. `*string` with `omitempty` silently accepts a missing id here.
   Either split into two response types (completed-from-plan with a required non-pointer `PlanID`, completed-from-fetch without), or validate that a `completed` status from `planTableScan` always carries a non-nil plan-id.

##########
catalog/rest/scan_planning.go:
##########
@@ -0,0 +1,205 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// This file is a PROPOSED public API surface for REST server-side scan
+// planning (apache/iceberg-go#1178). The bodies are intentionally
+// unimplemented; the file exists so the REST surface can be reviewed as Go.
+// Endpoint capability discovery (Endpoint, SupportsEndpoint) lands separately
+// in the Phase 0 PR and is intentionally not redeclared here.
+
+package rest
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"time"
+
+	"github.com/apache/iceberg-go/table"
+)
+
+// Compile-time proof that the REST catalog satisfies the table planner seam.
+var _ table.ScanPlanner = (*Catalog)(nil)
+
+// ErrPlanExpired is returned when polling a plan-id the server no longer knows
+// about (HTTP 404 while polling), distinct from a table-not-found 404.
+var ErrPlanExpired = fmt.Errorf("%w: scan plan expired", ErrRESTError)
+
+// --- Capability gating (Open Question 2) ------------------------------------
+//
+// A single capability check is too coarse: requiring all four endpoints falls
+// back to local against sync-only servers, while requiring only the plan
+// endpoint false-positives, because planTableScan can return `submitted` or
+// `plan-tasks` that need the poll/fetch endpoints. The split below lets `auto`
+// use a sync-only server while reserving the async/fanout path for servers
+// that advertise everything.
+
+// SupportsPlanTableScan reports whether the server advertised the synchronous
+// plan endpoint.
+func (c *Catalog) SupportsPlanTableScan() bool {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// SupportsFullRemoteScanPlanning reports whether the server advertised all four
+// scan-planning endpoints (plan, fetch-result, cancel, fetch-tasks).
+func (c *Catalog) SupportsFullRemoteScanPlanning() bool {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// --- table.ScanPlanner implementation ---------------------------------------
+
+// SupportsRemoteScanPlanning reports whether this catalog can complete a remote
+// plan end-to-end; backed by the split capability checks above.
+func (c *Catalog) SupportsRemoteScanPlanning() bool {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// PlanFiles plans a scan server-side and returns tasks (and, optionally, a
+// plan-scoped FileIO) for the table to read.
+func (c *Catalog) PlanFiles(ctx context.Context, req table.ScanPlanningRequest) (table.ScanPlanningResult, error) {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// --- Low-level client methods -----------------------------------------------
+
+// PlanTableScan submits a scan plan. The result is either completed inline,
+// submitted (returns a plan-id to poll), or failed.
+func (c *Catalog) PlanTableScan(ctx context.Context, ident table.Identifier, req PlanTableScanRequest) (PlanTableScanResponse, error) {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// FetchPlanningResult polls a previously submitted plan.
+func (c *Catalog) FetchPlanningResult(ctx context.Context, ident table.Identifier, planID string) (FetchPlanningResultResponse, error) {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// CancelPlanning cancels a server-side plan. Callers should cancel on context
+// cancellation using a detached context with a short timeout.
+func (c *Catalog) CancelPlanning(ctx context.Context, ident table.Identifier, planID string) error {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// FetchScanTasks fetches the scan tasks for a plan-task handle returned by a
+// completed plan.
+func (c *Catalog) FetchScanTasks(ctx context.Context, ident table.Identifier, req FetchScanTasksRequest) (FetchScanTasksResponse, error) {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// WaitForPlan polls a submitted plan to completion using jittered backoff,
+// cancelling the server-side plan if the context is cancelled. It returns an
+// error if the plan is still submitted after the wait, cancelled, failed, or
+// expired.
+func (c *Catalog) WaitForPlan(ctx context.Context, ident table.Identifier, planID string, opts WaitForPlanOptions) (CompletedPlanningResult, error) {
+	panic("unimplemented: proposed API for #1178")
+}
+
+// --- Wire types (sketch) ----------------------------------------------------
+//
+// Content-file, delete-file, and residual decoding lands with the scan-task
+// decoder PR; these sketch the request/response envelopes so the client
+// surface compiles and reads.
+
+// PlanStatus is the status of a server-side plan.
+type PlanStatus string
+
+const (
+	PlanStatusCompleted PlanStatus = "completed"
+	PlanStatusSubmitted PlanStatus = "submitted"
+	PlanStatusCancelled PlanStatus = "cancelled"

Review Comment:
   The spec calls `cancelled` invalid as a `planTableScan` response status. Since `PlanStatus` is shared across all the endpoints, I'd add a doc comment on `PlanStatusCancelled` noting it's spec-invalid for `planTableScan`, and have `PlanTableScan`/`WaitForPlan` treat a cancelled status in the initial plan response as an error.

##########
catalog/rest/scan_planning.go:
##########
@@ -0,0 +1,205 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// This file is a PROPOSED public API surface for REST server-side scan
+// planning (apache/iceberg-go#1178). The bodies are intentionally
+// unimplemented; the file exists so the REST surface can be reviewed as Go.
+// Endpoint capability discovery (Endpoint, SupportsEndpoint) lands separately
+// in the Phase 0 PR and is intentionally not redeclared here.
+
+package rest
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"time"
+
+	"github.com/apache/iceberg-go/table"
+)
+
+// Compile-time proof that the REST catalog satisfies the table planner seam.
+var _ table.ScanPlanner = (*Catalog)(nil)
+
+// ErrPlanExpired is returned when polling a plan-id the server no longer knows
+// about (HTTP 404 while polling), distinct from a table-not-found 404.
+var ErrPlanExpired = fmt.Errorf("%w: scan plan expired", ErrRESTError)
+
+// --- Capability gating (Open Question 2) ------------------------------------
+//
+// A single capability check is too coarse: requiring all four endpoints falls
+// back to local against sync-only servers, while requiring only the plan
+// endpoint false-positives, because planTableScan can return `submitted` or
+// `plan-tasks` that need the poll/fetch endpoints. The split below lets `auto`
+// use a sync-only server while reserving the async/fanout path for servers
+// that advertise everything.
+
+// SupportsPlanTableScan reports whether the server advertised the synchronous
+// plan endpoint.
+func (c *Catalog) SupportsPlanTableScan() bool {

Review Comment:
   Every existing `*Catalog` method in rest.go uses the receiver `r`; this file uses `c` throughout. I'd switch all of these to `r` to match; linters such as revive (`receiver-naming`) flag receiver-name inconsistency, so this may also trip CI lint.
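
Editorial sketch for the first comment (bare booleans vs. `{"type":...}` objects): a minimal, hedged illustration of a decoder that accepts both forms for the always-true/always-false case. `decodeConstant` is a hypothetical helper invented for this example, not part of iceberg-go or the proposed `UnmarshalExpressionJSON`; it only distinguishes constants from "some other expression" and assumes the object form carries the literal strings "true"/"false" in its `type` field, as the comment describes.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// decodeConstant is a hypothetical helper (assumption, not iceberg-go API).
// It reports whether raw encodes an always-true/always-false filter in either
// the bare JSON boolean form or the object form {"type":"true"|"false"}.
// ok is false when raw is a well-formed object of some other expression type.
func decodeConstant(raw []byte) (value bool, ok bool, err error) {
	trimmed := bytes.TrimSpace(raw)

	// Bare form, as emitted by Java's ExpressionParser per the review comment.
	switch string(trimmed) {
	case "true":
		return true, true, nil
	case "false":
		return false, true, nil
	}

	if len(trimmed) == 0 || trimmed[0] != '{' {
		return false, false, fmt.Errorf("expression must be a JSON boolean or object")
	}

	// Object form, as described by the REST OpenAPI schema per the review comment.
	var obj struct {
		Type string `json:"type"`
	}
	if err := json.Unmarshal(trimmed, &obj); err != nil {
		return false, false, err
	}
	switch obj.Type {
	case "true":
		return true, true, nil
	case "false":
		return false, true, nil
	}
	// Any other type (eq, and, not, ...) is not a constant; a real decoder
	// would dispatch on obj.Type here.
	return false, false, nil
}

func main() {
	inputs := []string{`true`, `{"type":"false"}`, `{"type":"eq","term":"x","value":1}`}
	for _, in := range inputs {
		v, ok, err := decodeConstant([]byte(in))
		fmt.Println(in, v, ok, err)
	}
}
```

Accepting both forms on decode while emitting only one on encode keeps the golden fixtures pinned to a single wire format, which is the decision the comment asks to make explicit.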
