40u5 opened a new pull request, #183: URL: https://github.com/apache/spark-connect-go/pull/183
### What changes were proposed in this pull request? Closes #49. Adds the per-session tag management API that complements the `Interrupt` / `InterruptTag` / `InterruptOperation` surface introduced in #182, so that callers can actually tag operations and then cancel by tag: - `SparkSession.AddTag(tag string) error` — attaches a tag to every subsequent operation issued by the session. - `SparkSession.RemoveTag(tag string) error` — removes a previously-added tag. Removing a tag that was never added is a no-op. - `SparkSession.GetTags() []string` — returns the tags currently attached to the session, sorted lexicographically for deterministic output. - `SparkSession.ClearTags()` — drops every tag attached to the session. Plumbing: - The set is stored on the underlying `sparkConnectClientImpl` behind a `sync.RWMutex` so it is safe to mutate from multiple goroutines. - `newExecutePlanRequest` now sets `request.Tags = s.GetTags()`, so the tags ride along on every `ExecutePlan` / `ExecuteCommand` issued after `AddTag`. - The new methods are exposed on the `SparkConnectClient` interface (`spark/client/base/base.go`) and `spark/mocks/mock_executor.go` is updated to satisfy the extended interface. - Validation lives in a new exported `base.ValidateTag` so both the real client and the in-tree mock enforce the Spark Connect contract: tags must be non-empty and must not contain `,`. Invalid input returns `sparkerrors.InvalidArgumentError`. ### Why are the changes needed? Issue #49 asks for `Interrupt()` support on the Go client. #182 surfaces `InterruptAll` / `InterruptTag` / `InterruptOperation`, but `InterruptTag` is only useful if callers can actually attach tags to operations — which today they cannot, because there is no equivalent of PySpark's `SparkSession.addTag` / `removeTag` / `getTags` / `clearTags`. This PR fills that gap. ### Does this PR introduce _any_ user-facing change? Yes — additive only: - `SparkSession` gains `AddTag`, `RemoveTag`, `GetTags`, `ClearTags`. - The `SparkConnectClient` interface gains the same four methods; no known external implementers, and the in-tree `mocks.TestExecutor` is updated in the same commit. - `base.ValidateTag` is exported so callers can reuse the same validation if they want. No backward-incompatible breakages. ### How was this patch tested? Added unit tests in `spark/client/client_test.go`: - `TestAddTagRejectsInvalidInput` — empty tag and tag containing `,` both return `sparkerrors.InvalidArgumentError` and are not stored. - `TestTagsRoundTripAddRemoveClear` — `AddTag` dedupes, `GetTags` returns sorted output, `RemoveTag` works including removing tags that were never added (no-op), `ClearTags` resets the set. - `TestExecutePlanRequestCarriesSessionTags` — wraps the existing mock in a recorder that captures the `ExecutePlanRequest` and verifies that tags configured via `AddTag` end up on `request.Tags`, and that `ClearTags` scrubs them from subsequent requests. And in `spark/sql/sparksession_test.go`: - `TestSparkSessionTagsRoundTripThroughClient` — exercises the same flow through the public `SparkSession` API, including invalid-tag rejection. Locally: - `go build ./...` — clean. - `go vet ./...` — clean. - `gofmt -l ./spark` — no diff. - `go test $(go list ./... | grep -v internal/tests/integration)` — all PASS. The integration suite is excluded because it requires `SPARK_HOME` and is unrelated to this change. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
