This is an automated email from the ASF dual-hosted git repository.
jason810496 pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/main by this push:
new c767af5a47b Go-SDK: ADRs for bundle packing and coordinator-protocol
runtime (#67153)
c767af5a47b is described below
commit c767af5a47b521bb7689913925e5cc07ceb926da
Author: Jason(Zhe-You) Liu <[email protected]>
AuthorDate: Wed Jun 3 14:07:32 2026 +0800
Go-SDK: ADRs for bundle packing and coordinator-protocol runtime (#67153)
* Add ADRs for bundle packing options and Go tool directive implementation
* Add ADR for dual-mode bundle binary supporting msgpack-over-IPC
coordinator protocol
* Rename ExecutableRuntimeCoordinator to ExecutableCoordinator across
documentation and codebase
* Refactor bundle spec to use self-contained executable with embedded
metadata and source
* CI: Fix the link to other docs
* Reflect self-contained binary spec in ADR 0003
* Resolve cross-compile in packer with two-build introspection
The packer's manifest-population step execs the freshly built binary
with --dump-bundle-spec. On cross-builds (e.g., darwin/arm64 host
packing for linux/amd64) the target binary is not runnable on the
host, so a single go-build pipeline cannot introspect it. The same
issue applies to --executable, where the user may hand the packer a
pre-built cross-target binary that the host cannot exec.
Specify in ADR 0002 that the packer treats introspection and
target-build as separate concerns. The packer attempts to exec the
candidate introspection binary and, on "exec format error", builds a
host-arch sidecar from the positional package, execs that for
--dump-bundle-spec, and appends the resulting footer to the target
artifact (whether that artifact came from the internal go build or
from --executable). The Go build cache amortises the sidecar to a
link step when host arch is already involved, so there is no overhead
when no cross-compile is needed and no host-side runner (Rosetta,
qemu-user-static) is required for the cross case.
If --executable is given without a positional package and the
supplied binary is not host-runnable, the packer errors with a
message asking for the source package so the sidecar can be built.
Add a one-line note to ADR 0001 Option D's cons pointing to ADR
0002 so the trade-off is visible alongside the other options.
* Scope Windows out of v1 for self-contained bundle format
The footer-after-EOF layout assumes signing tools hash the entire
file. That holds on Linux (dm-verity) and macOS (codesign) but not
on Windows: PE Authenticode stores its signature in the certificate
table referenced from the Optional Header, and Microsoft's
EnableCertPaddingCheck hardening (MS13-098) rejects extra bytes
after the signature. Appending the bundle footer in either order
relative to signtool produces a binary that strict-mode verification
rejects, so the current ADR cannot honestly claim Windows support.
Remove the .exe defaults and the Authenticode reference from the
build-pipeline and out-of-scope sections, and add Windows as an
explicit non-goal for v1 with a note that supporting it would
require a different layout (e.g. a PE resource section) tracked
separately. The packer should refuse GOOS=windows builds in v1
with a clear error rather than silently producing an unsignable
artefact.
* Drop "statically-linked" qualifier from native-SDK claim
The footer-after-EOF technique works because the OS loader stops at
the format-defined end of image (ELF section/segment extents, Mach-O
LC_SEGMENT extents), not because the binary is statically linked.
Rust and C++ default to dynamic glibc on Linux and remain single-file
artefacts that take the footer cleanly. Replace "statically-linked
native executable" with "single-file native executable" so the
reasoning matches the property the technique actually depends on,
and fix the parallel claim in the context paragraph above.
* Add binary_sha256 to footer; demote OS code-signing to deployment
hardening
The previous draft of ADR 0004 leaned on OS code-signing (codesign /
Authenticode / dm-verity) for bundle tamper detection. That framing
was honest only for macOS, and even there it conflated authenticity
(signed by a trusted identity) with integrity (file is byte-identical
to what the packer produced). Airflow's threat model treats
bundles_folder as Deployment-Manager-controlled, so the bundle format
only needs to provide integrity against truncation, partial writes,
and naive tampering; authenticity is a deployment-time concern that
Deployment Managers can layer on top with whatever signing flow
matches their platform.
Expand the trailer from 32 to 64 bytes to carry binary_sha256 (32-byte
SHA-256 over bytes [0, source_start), the binary region only — the
hash field is itself in the trailer and cannot cover itself). The
scanner verifies the hash at discovery and caches the result by
(path, inode, mtime, size) so the runtime does not re-hash on every
exec. Mismatch is treated identically to a failed magic check: the
scanner skips the file with a clear error.
Reframe the code-sign rule: the bundle format neither requires nor
prevents OS code-signing. Deployment Managers who want
codesign/rcodesign on macOS or fs-verity/IMA on Linux apply it after
the footer append so the signature covers the trailer; those who do
not (the common Linux deployment) get tamper detection from the
embedded hash alone.
Narrow the Windows non-goal accordingly. The footer-after-EOF layout
runs fine on PE; only Authenticode code-signing is incompatible
because MS13-098 rejects extra bytes past the signature. Windows
becomes a supported bundle target again (.exe default output
restored); Authenticode-signed Windows bundles remain out of scope
for v1 and would require a different layout (PE resource section)
tracked separately.
Note the small verification-to-execve TOCTOU window under "What this
costs"; acceptable for v1 given the threat model.
* Replace "fully general" hedge with the actual go-build-flag rule
The Option A bullet ended with "kept inside the standalone-packer
shape so the SDK does not own a fully general go build wrapper",
which describes what the packer is not without saying what it is.
The real rule is positive: the packer never interprets go build
flags; arbitrary flags pass through verbatim after the -- separator,
which keeps the packer's flag surface small and stable as go build
evolves. State the rule directly.
* Resolve dispatch-matrix overlap between go-plugin and error rows
The matrix marked the AIRFLOW_BUNDLE_MAGIC_COOKIE row as "(default)"
while keeping an "Otherwise -> error" fallthrough row, which made it
unclear which row fires when no cookie is set. The actual rule is
ordered match: rows 1-4 each require a specific positive trigger
(metadata flag, spec-dump flag, comm/logs pair, magic cookie env
var), and row 5 fires only when none of those conditions hold.
Drop the misleading "(default)" annotation, number the rows so the
ordering is explicit, and reword the fallthrough row to make the
precedence visible at a glance.
* Note that panic recovery is pkg/worker behaviour, not coordinator-only
The task-execution diagram shows "(panic recovered -> 'failed')",
which a reviewer could plausibly read as a coordinator-mode
invention. Recovery is actually pkg/worker.Worker.runTask's
existing defer recover() that calls reportStateFailed on panic
(runner.go:295-311); both go-plugin mode and coordinator mode
reuse the same Worker, so the behaviour is identical between the
two modes. Add a paragraph next to the diagram so the property is
read as shared rather than divergent.
* Make ADRs up to date
* Address phani's comment and static check errors
* Rename --bundle-metadata flag as --airflow-metadata
---
go-sdk/adr/0001-bundle-packing-options.md | 312 ++++++++++++++++
...0002-use-go-tool-directive-for-bundle-packer.md | 262 +++++++++++++
.../adr/0003-coordinator-protocol-msgpack-ipc.md | 404 +++++++++++++++++++++
.../adr/0004-self-contained-executable-bundle.md | 377 +++++++++++++++++++
4 files changed, 1355 insertions(+)
diff --git a/go-sdk/adr/0001-bundle-packing-options.md b/go-sdk/adr/0001-bundle-packing-options.md
new file mode 100644
index 00000000000..36203268430
--- /dev/null
+++ b/go-sdk/adr/0001-bundle-packing-options.md
@@ -0,0 +1,312 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# 1. Post-build bundle-packing options for the Go SDK
+
+Date: 2026-04-30
+
+## Status
+
+Accepted as the option register. The packer-mechanism decision is
+recorded in [ADR 0002](0002-use-go-tool-directive-for-bundle-packer.md):
+Option H (Go 1.24 `tool` directive) for delivery, paired with Option A
+(standalone `airflow-go-pack` binary) and Option D (standardised
+`--airflow-metadata` introspection contract — the single metadata flag
+that prints the bundle's `airflow-metadata.yaml` spec as JSON, which
+`airflow-go-pack` reads to populate the manifest). The shipped runtime
+([`bundlev1server.Serve`](../bundle/bundlev1/bundlev1server/server.go))
+routes through a `decideMode` switch with three modes —
+`--airflow-metadata`, `--comm`/`--logs` (coordinator mode), and the
+default go-plugin path.
+
+The container-format assumption running through this ADR — that the
+output is a ZIP archive — is superseded by
+[ADR 0004](0004-self-contained-executable-bundle.md), which embeds the
+source and manifest in a footer appended to the executable. The
+options below still describe valid *packer mechanisms*; only the
+artefact each one writes has changed from a ZIP to a footer-augmented
+executable.
+
+## Context
+
+The executable provider's bundle spec
+([`task-sdk/docs/executable-bundle-spec.rst`](../../task-sdk/docs/executable-bundle-spec.rst))
+defined the deployment artifact, *at the time this ADR was written*, as a
+ZIP archive containing:
+
+1. `airflow-metadata.yaml` declaring `airflow_bundle_metadata_version`, `sdk`
+ (`language`, `version`, `supervisor_schema_version`), `source`
+ (archive-relative path to the DAG source file), `executable`
+ (archive-relative path to the compiled binary), and `dags` (a mapping of
+ `dag_id` to `{tasks: [task_id, ...]}`). The shipped spec replaced the ZIP
+ with a footer-augmented executable (see
+ [ADR 0004](0004-self-contained-executable-bundle.md)): it dropped the
+ `executable` field (the binary *is* the file) and redefined `source` as a
+ display filename rather than an archive-relative path. The manifest keys
+ above are otherwise unchanged.
+2. The primary DAG source file, included verbatim.
+3. The compiled native executable, which speaks the coordinator protocol
+ (`--comm=<addr>` / `--logs=<addr>`).
+
+Bundle authors today produce the executable with a plain `go build`
+(see [`go-sdk/example/bundle/Justfile`](../example/bundle/Justfile)). There is
+no SDK-provided way to produce the conforming ZIP, so each author would need
+to hand-roll one.
+
+The bundle binary already exposes a `--airflow-metadata` flag (defined in
+[`bundle/bundlev1/bundlev1server/server.go`](../bundle/bundlev1/bundlev1server/server.go))
+that prints the `BundleInfo{Name, Version}` returned by the author's
+`BundleProvider.GetBundleVersion()`. It does **not** currently invoke
+`RegisterDags`, so it does not yet enumerate `dag_id` / `task_id` for the
+manifest. This is relevant context: the binary itself is the authoritative
+source of dag/task identity at runtime, and the SDK can extend the
+introspection path cheaply.
+
+The user's initial framing was `go build -toolexec`. `-toolexec` wraps each
+toolchain invocation (compile, asm, link) and does not have visibility into
+the final `-o` output path or a single "build finished" hook, so it is a poor
+fit for producing the final ZIP. The options below cover the mechanisms that
+do fit, plus the `-toolexec` variant for completeness.
+
+A packing mechanism has two sub-decisions:
+
+- **Where the packing logic runs.** In the bundle binary itself
+ (self-pack), in a separate SDK CLI, or in build tooling outside the SDK
+ (Makefile/Justfile snippet).
+- **How dag/task IDs reach the manifest.** Runtime introspection of the
+ built binary (call into `RegisterDags` against an in-memory
+ registry recorder), static AST scan of the source file, or
+ hand-written manifest.
+
+The options below combine those two sub-decisions in different ways.
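+
+As a rough illustration of the runtime-introspection route, the sketch
+below records dag/task IDs by handing a registration function an
+in-memory recorder. The `Registry` and `Dag` interfaces, the
+string-based `AddTask`, and the `record` helper are hypothetical
+stand-ins invented for this example; they are not the SDK's actual
+types.
+
+```go
+package main
+
+import "fmt"
+
+// Registry and Dag are hypothetical stand-ins for whatever the SDK
+// passes to the author's RegisterDags; the real signatures may differ.
+type Registry interface {
+	AddDag(dagID string) Dag
+}
+
+type Dag interface {
+	AddTask(taskID string)
+}
+
+// recorder keeps dag/task IDs in memory instead of running anything.
+type recorder struct {
+	tasks map[string][]string
+}
+
+// recordedDag appends task IDs to its parent recorder.
+type recordedDag struct {
+	rec *recorder
+	id  string
+}
+
+func (r *recorder) AddDag(dagID string) Dag {
+	if _, seen := r.tasks[dagID]; !seen {
+		r.tasks[dagID] = []string{}
+	}
+	return recordedDag{rec: r, id: dagID}
+}
+
+func (d recordedDag) AddTask(taskID string) {
+	d.rec.tasks[d.id] = append(d.rec.tasks[d.id], taskID)
+}
+
+// record runs the registration function against a fresh recorder and
+// returns the dag_id -> task_id list it observed.
+func record(register func(Registry)) map[string][]string {
+	rec := &recorder{tasks: map[string][]string{}}
+	register(rec)
+	return rec.tasks
+}
+
+func main() {
+	dags := record(func(reg Registry) {
+		dag := reg.AddDag("example_dag")
+		dag.AddTask("extract")
+		dag.AddTask("load")
+	})
+	fmt.Println(dags)
+}
+```
+
+Because the recorder sees exactly the calls the author's code makes,
+loops and helper functions are captured without any special handling.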
+
+## Options
+
+### Option A: Standalone SDK packer CLI (`airflow-go-pack`)
+
+A new binary under `go-sdk/cmd/airflow-go-pack` that takes
+already-built inputs and writes the ZIP:
+
+```
+airflow-go-pack \
+ --source ./example/bundle/main.go \
+ --executable ./bin/example-dag-bundle \
+ --output ./bin/example.zip
+```
+
+Manifest population: the packer execs the supplied executable with
+`--airflow-metadata` and reads the JSON from stdout to fill `sdk`
+(`language`, `version`, `supervisor_schema_version`) and to enumerate
+`dags`. Source language is hard-coded to `go`; SDK version is read from the
+build info embedded in the binary or from a build-time `-ldflags` value.
+
+- **Pros:** simple, single-purpose binary; works against any binary the user
+ built however they like (CGO, cross-compile, custom `-ldflags`); no
+ coupling to `go build` invocation; trivially callable from `just`,
+ `make`, CI, or `go generate`.
+- **Cons:** two-step UX (`go build` then `airflow-go-pack`); user has to
+ install or `go run` the tool; nothing prevents pack/build mismatch
+ (e.g. packing yesterday's binary).
+
+### Option B: All-in-one SDK CLI with a `build` subcommand
+
+A single SDK CLI (`airflow-go`) with subcommands that wrap `go build` and
+then pack:
+
+```
+airflow-go build ./example/bundle --output ./bin/example.zip
+```
+
+Internally: spawn `go build -o <tmp>/bundle <pkg>`, then run the same
+introspection step as Option A, then write the ZIP.
+
+- **Pros:** single command; no chance of pack/build skew; easy to add
+ related subcommands later (`airflow-go new`, `airflow-go run`,
+ `airflow-go validate`); good defaults for `-ldflags` (e.g.
+ `-X main.bundleVersion=...`) without the author having to know them.
+- **Cons:** the SDK now owns a `go build` wrapper and inherits
+ responsibility for forwarding the long tail of `go build` flags
+ (`-tags`, `-trimpath`, `GOOS`/`GOARCH` env, `-ldflags` passthrough,
+ `-buildvcs`, etc.); harder to integrate with non-trivial existing build
+ systems that already drive `go build` themselves.
+
+### Option C: Self-packing binary (`--pack-bundle <out.zip>`)
+
+Extend `bundlev1server.Serve` so that when the binary is invoked with
+`--pack-bundle <out.zip>`, it builds the ZIP itself: it knows its own
+executable path (`os.Executable()`), its embedded source (via `//go:embed`
+of the DAG source file at build time), and its dag/task list (by
+calling `RegisterDags` against an in-memory recorder). After writing
+the archive, it exits.
+
+- **Pros:** zero extra tools; the binary is fully self-describing; pack
+ output is provably consistent with the binary's runtime behaviour.
+- **Cons:** requires the author's `main` package to embed its own source
+ (`//go:embed main.go` or similar), which is awkward when the DAG is
+ spread across multiple files or the source path is non-obvious;
+ bloats every bundle binary with packing code and an embedded copy of
+ the source; mixes build-time concerns into a runtime entrypoint.
+
+### Option D: Two-phase external introspection (introspection binary + packer)
+
+Same shape as Option A or B, but standardise the introspection contract:
+the SDK guarantees that every bundle binary supports a single
+`--airflow-metadata` flag which prints a JSON blob containing
+`sdk.language`, `sdk.version`, `sdk.supervisor_schema_version`, and the
+full `dags` mapping. The packer's only job is to combine that JSON, the
+source file path the user passes in, and the binary itself into a bundle.
+
+This is really a refinement of A/B that fixes the introspection contract
+in the SDK protocol, rather than an independent option, but is worth
+calling out because the shape of the introspection flag is itself a
+decision (single flag vs. several; JSON vs. YAML; pretty vs. compact).
+The SDK settles that sub-decision in favour of a *single*
+`--airflow-metadata` flag.
+
+- **Pros:** decouples "how do we enumerate dags" from "how do we ZIP";
+ any future packer (third-party CI plugin, IDE, etc.) can rely on the
+ same contract; trivial to unit-test.
+- **Cons:** locks in a wire format the SDK has to keep stable; slightly
+ more code in the bundle binary than today; exec-based introspection
+ requires a host-runnable binary, so cross-compile targets (and
+ `--executable` paths that hand the packer a pre-built cross-target
+ binary) force the packer to produce a host-arch sidecar purely to
+ run `--airflow-metadata`. See [ADR 0002](0002-use-go-tool-directive-for-bundle-packer.md)
+ for the pipeline.
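+
+A minimal sketch of the packer side of this contract, assuming the
+JSON keys named above and the `{tasks: [...]}` shape from the Context
+section. The `Metadata` type, `parseMetadata`, and every value in
+`sampleOutput` are made up for illustration.
+
+```go
+package main
+
+import (
+	"encoding/json"
+	"errors"
+	"fmt"
+)
+
+// Metadata mirrors the keys named in Option D. The Go type itself is
+// an assumption made for this sketch.
+type Metadata struct {
+	SDK struct {
+		Language                string `json:"language"`
+		Version                 string `json:"version"`
+		SupervisorSchemaVersion string `json:"supervisor_schema_version"`
+	} `json:"sdk"`
+	Dags map[string]struct {
+		Tasks []string `json:"tasks"`
+	} `json:"dags"`
+}
+
+// parseMetadata decodes the bytes a bundle binary printed on stdout
+// and rejects documents that lack the fields the manifest needs.
+func parseMetadata(stdout []byte) (*Metadata, error) {
+	var m Metadata
+	if err := json.Unmarshal(stdout, &m); err != nil {
+		return nil, fmt.Errorf("decode --airflow-metadata output: %w", err)
+	}
+	if m.SDK.Language == "" || m.Dags == nil {
+		return nil, errors.New("metadata is missing sdk.language or dags")
+	}
+	return &m, nil
+}
+
+// sampleOutput uses made-up values purely for illustration.
+const sampleOutput = `{
+  "sdk": {
+    "language": "go",
+    "version": "v0.0.0-example",
+    "supervisor_schema_version": "2025-01-01"
+  },
+  "dags": {"example_dag": {"tasks": ["extract", "load"]}}
+}`
+
+func main() {
+	m, err := parseMetadata([]byte(sampleOutput))
+	if err != nil {
+		fmt.Println("error:", err)
+		return
+	}
+	fmt.Println(m.SDK.Language, m.Dags["example_dag"].Tasks)
+}
+```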
+
+### Option E: Static AST scan, no introspection
+
+Parser-only packer: walk the DAG source AST, find `dagbag.AddDag("X")`
+calls and the `.AddTask(fn)` calls chained off them, and synthesise the
+manifest without running the binary.
+
+- **Pros:** no runtime dependency on the binary (works even if it
+ doesn't build for the host platform, e.g. cross-compiled for Linux on
+ a macOS dev box); fast.
+- **Cons:** brittle to anything dynamic (`for _, name := range names {
+ dagbag.AddDag(name) }`, helper functions, generated code); the SDK
+ becomes the second source of truth for dag/task identity, which can
+ drift from `RegisterDags`; users will hit "I added a DAG and the
+ packer didn't see it" failures.
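+
+To make the brittleness concrete, here is a small sketch of such a
+scan using the standard `go/parser` and `go/ast` packages. It only
+matches `AddDag` calls whose first argument is a string literal;
+`scanDagIDs` and the `sample` source are invented for this example.
+
+```go
+package main
+
+import (
+	"fmt"
+	"go/ast"
+	"go/parser"
+	"go/token"
+	"strconv"
+)
+
+// sample is a made-up DAG source with one literal and one dynamic call.
+const sample = `package main
+
+func register(dagbag Bag) {
+	dagbag.AddDag("static_dag")
+	for _, name := range []string{"a", "b"} {
+		dagbag.AddDag(name)
+	}
+}
+`
+
+// scanDagIDs returns the string literals passed as the first argument
+// to any method call named AddDag. It parses only; nothing is
+// type-checked or executed.
+func scanDagIDs(src string) ([]string, error) {
+	fset := token.NewFileSet()
+	file, err := parser.ParseFile(fset, "main.go", src, 0)
+	if err != nil {
+		return nil, err
+	}
+	var ids []string
+	ast.Inspect(file, func(n ast.Node) bool {
+		call, ok := n.(*ast.CallExpr)
+		if !ok || len(call.Args) == 0 {
+			return true
+		}
+		sel, ok := call.Fun.(*ast.SelectorExpr)
+		if !ok || sel.Sel.Name != "AddDag" {
+			return true
+		}
+		lit, ok := call.Args[0].(*ast.BasicLit)
+		if !ok || lit.Kind != token.STRING {
+			return true // dynamic argument: invisible to the scan
+		}
+		if id, err := strconv.Unquote(lit.Value); err == nil {
+			ids = append(ids, id)
+		}
+		return true
+	})
+	return ids, nil
+}
+
+func main() {
+	ids, err := scanDagIDs(sample)
+	if err != nil {
+		fmt.Println("error:", err)
+		return
+	}
+	fmt.Println(ids)
+}
+```
+
+The scan reports only `static_dag`; the two DAGs registered inside the
+loop never reach the manifest, which is the drift described above.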
+
+### Option F: `go generate` directive
+
+Document a recommended `//go:generate` line in the author's `main.go`:
+
+```go
+//go:generate go run github.com/apache/airflow/go-sdk/cmd/airflow-go-pack ...
+```
+
+`go generate` is then the build-time entrypoint.
+
+- **Pros:** stdlib-blessed mechanism; no new tool installation needed
+ (`go run` fetches the packer from the module cache); discoverable from
+ the source file itself.
+- **Cons:** `go generate` has to be run explicitly; users frequently
+ forget; doesn't actually pack the *binary*, only triggers a tool that
+ must still build and pack it; fits awkwardly because `go generate`
+ conventionally runs before `go build`, whereas packing needs
+ `build` -> `pack`.
+
+### Option G: `go build -toolexec` wrapper
+
+Provide a binary that the user passes as
+`go build -toolexec=airflow-go-toolexec ...`. The wrapper proxies every
+toolchain call and, on detecting the final `link` invocation, copies the
+linker's output path, then runs the packing step against it.
+
+- **Pros:** single `go build` invocation; nominally fits into existing
+ `go build` workflows.
+- **Cons:** `-toolexec` was not designed for this. It receives the
+ toolchain executable path and an argv that varies across Go versions;
+ the wrapper has to parse the linker's `-o` to discover the binary
+ location, which is undocumented/internal behaviour and changes
+ between releases. It also runs once per package compile, so the
+ wrapper must distinguish "real" link invocations from intermediate
+ ones. Operationally fragile; recommended against.
+
+### Option H: `go.mod` `tool` directive (Go 1.24+)
+
+Register the packer in the consuming project's `go.mod` via the
+`tool` directive and invoke it as `go tool airflow-go-pack`. This is
+orthogonal to A/B/D (it's a *delivery* mechanism, not a different
+implementation) but is worth listing because it changes the install
+story significantly.
+
+- **Pros:** version-pinned per project; no `uv tool`-style global install;
+ works the same in every checkout; aligns with `breeze`'s direction
+ ([ADR 0017](../../dev/breeze/doc/adr/0017-use-uvx-to-run-breeze-from-local-sources.md)
+ for the Python side).
+- **Cons:** requires Go 1.24 in the consumer's toolchain; one more line
+ the author has to add to `go.mod`.
+
+### Option I: Build-system recipe only (no SDK code)
+
+Ship a documented `Justfile` / `Makefile` / `Taskfile` snippet that
+sequences `go build` -> `zip` -> manifest write, and let users copy it
+into their projects. The SDK provides only the spec and an example.
+
+- **Pros:** zero new code in the SDK; users see exactly what is
+ happening.
+- **Cons:** every project re-implements (and slowly diverges from) the
+ spec; manifest generation in shell is painful; no introspection of
+ dag/task IDs without re-running the binary anyway, so the recipe
+ ends up calling some helper, which is just Option A by another name.
+
+## Cross-cutting sub-decisions
+
+These apply to whichever top-level option is chosen:
+
+1. **Manifest source of truth.** Runtime introspection (D) is the only
+ approach that cannot drift from `RegisterDags`. Everything else
+ trades that guarantee for some other property (no binary needed,
+ no extra flag, etc.).
+2. **Source-file discovery.** The spec requires the source file path
+ to appear in the manifest's `source` field and the file itself to
+ be present in the ZIP. The packer needs either (a) an explicit
+ `--source` argument, (b) a convention (e.g. the `main.go` of the
+ `main` package being built), or (c) a `//go:embed`-d copy inside
+ the binary (Option C).
+3. **SDK version reporting.** The bundle binary should expose the SDK
+ version it linked against. This can come from `runtime/debug.ReadBuildInfo`
+ walking the deps for `github.com/apache/airflow/go-sdk`, or from
+ a build-time `-ldflags -X` value the SDK documents.
+4. **Reproducibility.** The ZIP should be deterministic for a given
+ set of inputs (sorted entries, fixed mtimes, no host-specific
+ metadata) so two builds of the same bundle hash identically. This
+ is independent of which option is picked, but it is easier to enforce
+ inside SDK-owned code (A/B/C/D) than in a shell recipe (I).
+5. **Executable bit.** The spec says the archive entry SHOULD preserve
+ the executable bit via the ZIP external-attributes field. The
+ packer must set `0755` (or similar) on the executable entry; this
+ is trivial in Go's `archive/zip` but easy to get wrong in shell.
+
+## Decision
+
+Recorded in [ADR 0002](0002-use-go-tool-directive-for-bundle-packer.md).
+Summary: deliver the packer via the Go 1.24 `tool` directive (Option H);
+implement it as a standalone binary at `cmd/airflow-go-pack` (Option A);
+populate the manifest by execing the bundle binary with the standardised
+`--airflow-metadata` introspection flag (Option D).
+
+## Consequences
+
+Listing the options here, rather than only landing the chosen one,
+keeps the rejected alternatives discoverable for future SDKs (Rust,
+C++, Zig) which will face the same question, and documents why
+`-toolexec` and AST-only scanning were considered and dropped.
diff --git a/go-sdk/adr/0002-use-go-tool-directive-for-bundle-packer.md b/go-sdk/adr/0002-use-go-tool-directive-for-bundle-packer.md
new file mode 100644
index 00000000000..08bd89a8b5f
--- /dev/null
+++ b/go-sdk/adr/0002-use-go-tool-directive-for-bundle-packer.md
@@ -0,0 +1,262 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# 2. Use the Go 1.24 `tool` directive to deliver the bundle packer
+
+Date: 2026-04-30
+
+## Status
+
+Accepted. Selects from the option register in
+[ADR 0001](0001-bundle-packing-options.md).
+
+The output-format portion of this ADR (the packer writes a ZIP archive
+following the bundle spec) is superseded by
+[ADR 0004](0004-self-contained-executable-bundle.md): the packer now
+writes a self-contained executable with an appended footer carrying
+the source bytes and the manifest. The packer's *mechanism* (Option
+A standalone binary + Option D introspection contract + Option H
+`tool` directive) is unchanged. The decision sketches below mention
+ZIP output; read them with the ADR 0004 substitution in mind, and
+treat ADR 0004 as authoritative wherever the two disagree.
+
+## Context
+
+[ADR 0001](0001-bundle-packing-options.md) enumerated nine candidate
+mechanisms for producing a conforming bundle ZIP
+([`task-sdk/docs/executable-bundle-spec.rst`](../../task-sdk/docs/executable-bundle-spec.rst))
+from a Go SDK build. Two reasons drive the choice:
+
+1. **The repository already requires Go 1.24.** `go-sdk/go.mod` declares
+ `go 1.24.0` with `toolchain go1.24.6`, so language features added in
+ 1.24 are available to every consumer of the SDK by construction.
+2. **Contributors expect Go-native workflows.** The Go 1.24 `tool`
+ directive is the toolchain's native answer to "depend on a
+ build-time CLI without polluting the global PATH." It pins the tool
+ version per-project in `go.mod`, resolves it through the standard
+ module cache, and exposes it as `go tool <name>`, with no extra
+ installer and no per-worktree drift. The same problem on the Python
+ side led `breeze` to switch to `uvx` in
+ [ADR 0017](../../dev/breeze/doc/adr/0017-use-uvx-to-run-breeze-from-local-sources.md);
+ `tool` is the analogous answer here.
+
+The `tool` directive is a delivery mechanism. It still needs an
+underlying implementation. We pair it with two of the implementation
+options from ADR 0001, with a UX twist:
+
+- **Option A (standalone packer):** a single-purpose binary at
+ `go-sdk/cmd/airflow-go-pack`. It still operates as one process with
+ a clear input/output contract, but it drives `go build` internally
+ by default so that the common case is one command:
+ `go tool airflow-go-pack ./pkg`. Authors who already produce their
+ own binary can opt out via `--executable <path>` and skip the build
+ phase. This is closer to Option B's ergonomics than the original
+ ADR 0001 sketch, but the packer never interprets `go build` flags
+ itself — arbitrary flags pass through verbatim after the `--`
+ separator, so the SDK's flag surface stays small and stable as
+ `go build` evolves.
+- **Option D (standardised introspection contract):** every bundle
+ binary supports a `--airflow-metadata` flag that prints JSON
+ containing `sdk.language`, `sdk.version`, `sdk.supervisor_schema_version`,
+ and the full `dags` mapping. The packer execs the freshly built binary
+ with this flag to populate the manifest, which keeps `RegisterDags` as
+ the single authoritative source of dag/task identity.
+
+Options C, E, F, G, and I from ADR 0001 are rejected for the reasons
+recorded there. Option B (a full `airflow-go build` wrapping `go
+build`) is rejected as a separate top-level command, but its core
+ergonomic win (one command for the common case) is folded into the
+Option A packer through default behaviour, with a `--` passthrough
+convention so authors can forward arbitrary flags to the underlying
+`go build` without the SDK having to enumerate them.
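+
+The passthrough convention amounts to splitting the argument list at
+the first `--` and never inspecting what follows. A minimal sketch,
+with `splitArgs` as a hypothetical helper name:
+
+```go
+package main
+
+import "fmt"
+
+// splitArgs separates the packer's own arguments from the flags that
+// are forwarded verbatim to `go build`. Everything after the first
+// "--" is treated as opaque.
+func splitArgs(args []string) (packerArgs, buildFlags []string) {
+	for i, arg := range args {
+		if arg == "--" {
+			return args[:i], args[i+1:]
+		}
+	}
+	return args, nil
+}
+
+func main() {
+	packerArgs, buildFlags := splitArgs(
+		[]string{"./cmd/my-bundle", "--", "-trimpath", "-tags=prod"},
+	)
+	fmt.Println(packerArgs, buildFlags)
+}
+```
+
+Since the forwarded slice is never parsed, new `go build` flags need no
+change in the packer.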
+
+## Decision
+
+1. **Add `cmd/airflow-go-pack` to the go-sdk module.** Its default
+ invocation is one command:
+
+ ```sh
+ go tool airflow-go-pack [./path/to/pkg] [-- <go build flags>...]
+ ```
+
+ The single positional argument is the path to the bundle's `main`
+ package (defaults to `.`, the current directory).
+ Anything after `--` is forwarded verbatim to the internal
+ `go build` invocation, so authors keep full control over `-tags`,
+ `-trimpath`, `-ldflags`, `GOOS`/`GOARCH` (via env), `-buildvcs`,
+ etc. without the packer having to enumerate them.
+
+ With no further flags, the packer:
+
+ 1. Resolves the target package and locates the file in that
+ package that defines `func main()`. That file becomes the
+ manifest's `source` and is copied verbatim into the ZIP. If
+ `main` is split across multiple files, the packer errors and
+ asks the author to specify `--source <file>`.
+ 2. Runs `go build [forwarded flags] -o <tmpdir>/<binname> <pkg>`
+ to produce the *target artifact*.
+ 3. Executes a *host-runnable introspection binary* with
+ `--airflow-metadata` to obtain `sdk.{language,version,supervisor_schema_version}`
+ and the `dags` mapping. The packer first tries to exec the target
+ artifact directly; if that fails with "exec format error" (the
+ target is built for a different OS/arch than the host), the
+ packer builds a host-arch sidecar from the same package
+ (`go build` with `GOOS`/`GOARCH` unset, written to
+ `<tmpdir>/<binname>.introspect`) and execs that instead. Both
+ binaries are produced from the same source package against the
+ same module graph, so `RegisterDags` records identical dag/task
+ identity regardless of which one is execed.
+ 4. Writes a deterministic ZIP in the working directory at
+ `<bundleName>.zip`, where `<bundleName>` comes from the
+ binary's `BundleInfo.Name` (part of the same `--airflow-metadata`
+ introspection output).
+
+ Optional overrides, for advanced or pre-built workflows:
+
+ - `--source <path>`: override the auto-detected source file.
+ - `--executable <path>`: skip the internal `go build` and pack a
+ pre-built binary. Mutually exclusive with `--` build-flag
+ passthrough. If the supplied binary is not host-runnable (e.g.
+ the user cross-built a `linux/amd64` binary from a `darwin/arm64`
+ host), the packer still needs to introspect it: it builds a
+ host-arch sidecar from the positional package and execs that for
+ `--airflow-metadata`, then appends the resulting footer to the
+ user-supplied binary. If no positional package was passed and
+ the supplied binary is not host-runnable, the packer errors with
+ a message asking for the source package so the sidecar can be
+ built.
+ - `--output <path>`: override the default output ZIP path.
+
+ Examples:
+
+ ```sh
+ # Common case: build and pack in one command from the package dir.
+ go tool airflow-go-pack
+
+ # Pack a different package, with extra go build flags.
+ go tool airflow-go-pack ./cmd/my-bundle -- -trimpath -tags=prod
+
+ # Pack an already-built binary (skips go build).
+ go tool airflow-go-pack --executable ./build/example --source main.go
+ ```
+
+2. **Extend the existing `--airflow-metadata` flag in
+ `bundlev1server.Serve` to print the full spec.** Rather than adding a
+ second introspection flag, `--airflow-metadata` is the single flag the
+ packer relies on; it prints a JSON document of the form:
+
+ ```json
+ {
+ "airflow_bundle_metadata_version": "1.0",
+ "sdk": {
+ "language": "go",
+ "version": "<sdk version>",
+ "supervisor_schema_version": "<YYYY-MM-DD>"
+ },
+ "dags": {
+ "<dag_id>": {"tasks": ["<task_id>", "..."]}
+ }
+ }
+ ```
+
+ `sdk.version` is read from `runtime/debug.ReadBuildInfo()` by
+ walking deps for `github.com/apache/airflow/go-sdk`;
+ `sdk.supervisor_schema_version` is the dated AIP-72 wire-schema
+ version the bundle was compiled against. The `dags` mapping is
+ populated by calling the bundle's `RegisterDags` against an in-memory
+ recording registry, then enumerating the recorded dags and their
+ tasks. `--airflow-metadata` today prints only `BundleInfo`
+ (`server.go`); it is extended to emit this full document, so the
+ shipped `decideMode` switch needs only one metadata mode. The bundle's
+ `BundleInfo.Name` (used by the packer for the default output
+ filename) is carried in the same output.
+
+3. **Bundle authors register the packer in their own `go.mod`:**
+
+ ```
+ tool github.com/apache/airflow/go-sdk/cmd/airflow-go-pack
+ ```
+
+ and pack in one step:
+
+ ```sh
+ go tool airflow-go-pack
+ ```
+
+4. **Update `go-sdk/example/bundle/Justfile`** to demonstrate the
+ recipe end-to-end, including the `tool` directive in the example's
+ own `go.mod`.
+
+## Consequences
+
+- Per-project, per-worktree version pinning of the packer through
+ `go.mod`. No global install. Two checkouts on different SDK versions
+ pack with the right packer for each.
+- Authors get a Go-native, single-command workflow for the common case
+ (`go tool airflow-go-pack`), with a `--` passthrough escape hatch
+ for arbitrary `go build` flags and `--executable` for pre-built
+ workflows. CI and other build systems can use whichever shape fits.
+- The `--airflow-metadata` JSON becomes a stable wire format that the
+ SDK has to keep backward-compatible. It is versioned implicitly
+ through the bundle spec's `airflow_bundle_metadata_version` field, so
+ additive changes are safe.
+- Third-party tooling (IDE plugins, alternative packers, CI plugins)
+ can rely on the same introspection contract without taking a Go
+ dependency on the SDK.
+- The packer takes on responsibility for locating the `main` source
+ file and choosing a sensible default output path. Both are
+ heuristics; both are overridable. Drift between the heuristic and
+ the spec is the main maintenance cost introduced by this option.
+- Requires Go 1.24 in any consumer project. Already a project-wide
+ assumption.
+
+## Implementation notes
+
+- The ZIP must be deterministic: sorted entries, fixed mtimes, no
+ host-specific metadata, so two builds of identical inputs hash the
+ same.
+- The executable entry's external-attributes field must encode mode
+ `0755` (or similar), per the bundle spec's executable-bit
+ requirement.
+- The packer should validate that the manifest's `dags` is non-empty
+ and warn (not fail) on empty `tasks` lists, matching the bundle
+ spec's "permitted but discouraged" wording.
+- `--airflow-metadata` runs `RegisterDags` but must not start the
+ gRPC server or contact any external services; the in-memory
+ registry recorder is the only side effect.
+- Source-file detection uses `go/packages` (or `go list -json`) to
+ load the target package, then picks the file whose AST contains a
+ top-level `func main()`. If the package has zero or more than one
+ such file, the packer errors with a clear message and asks for
+ `--source`.
+- The internal `go build` runs in a temp directory so the host's
+ working tree is not polluted with build artefacts; the temp dir is
+ cleaned up after the ZIP is written.
+- `go build` flag passthrough uses the standard `--` separator
+ convention so the packer's own flag set stays small and stable.
+- Host-runnable detection is by attempted exec, not by parsing the
+ binary's exec format. The packer runs the candidate introspection
+ binary with `--airflow-metadata` and treats the OS's "exec format
+ error" (and the Windows equivalent surfaced by `os/exec`) as the
+ signal to fall back to building a host-arch sidecar. Other exec
+ failures (non-zero exit, malformed JSON, missing flag) are real
+ errors and are surfaced to the user as-is. The Go build cache
+ amortises the sidecar to a link step when host arch is already
+ involved, so there is no measurable overhead when no cross-compile
+ is in play.
diff --git a/go-sdk/adr/0003-coordinator-protocol-msgpack-ipc.md b/go-sdk/adr/0003-coordinator-protocol-msgpack-ipc.md
new file mode 100644
index 00000000000..82798bdeb10
--- /dev/null
+++ b/go-sdk/adr/0003-coordinator-protocol-msgpack-ipc.md
@@ -0,0 +1,404 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# 3. Dual-mode bundle binary: msgpack-over-IPC coordinator protocol alongside the existing go-plugin/Edge-Worker path
+
+Date: 2026-04-30
+
+## Status
+
+Accepted.
+
+The references in this ADR to a "ZIP bundle" — the bundle-spec phrasing
+quoted in Context, and the `airflow-go-pack` output described in
+Consequences — are superseded by
+[ADR 0004](0004-self-contained-executable-bundle.md), which replaces
+the ZIP container with a self-contained executable carrying the source
+and manifest in an appended footer. The coordinator-mode protocol
+decision in this ADR is unaffected: the binary still honours
+`--comm=<addr>` / `--logs=<addr>` exactly as described, regardless of
+the container format it ships inside. Read the ZIP mentions below with
+the ADR 0004 substitution in mind.
+
+## Context
+
+A Go SDK bundle binary today (the artefact built from
+[`go-sdk/example/bundle/main.go`](../example/bundle/main.go) via
+`bundlev1server.Serve`) speaks exactly one protocol: HashiCorp
+[`go-plugin`](https://github.com/hashicorp/go-plugin) gRPC over a
+stdio-negotiated socket, gated by the magic-cookie handshake declared in
+[`pkg/bundles/shared/handshake.go`](../pkg/bundles/shared/handshake.go).
+The Airflow Go *Edge Worker*
+([`cmd/airflow-go-edge-worker`](../cmd/airflow-go-edge-worker/main.go),
+[`edge/`](../edge)) is the consumer of that protocol — it execs the
+bundle binary as a child process, completes the go-plugin handshake,
+opens the `DagBundle` gRPC client, and drives `GetMetadata`/`Execute`
+([`bundle/bundlev1/bundlev1server/impl/plugin.go`](../bundle/bundlev1/bundlev1server/impl/plugin.go)).
+The bundle binary never listens on a public socket; the protocol is
+local-process only.
+
+Meanwhile, the Python side of Airflow has standardised on a different
+wire protocol for non-Python language runtimes — the *coordinator
+protocol* — pioneered by the Java SDK and described in
+[java-sdk ADR 0004](../../java-sdk/adr/0004-dag-parsing.md)
+and
+[java-sdk ADR 0002](../../java-sdk/adr/0002-workload-execution.md).
+Its shape is:
+
+- The runtime is launched with `--comm=<host:port>` and
+ `--logs=<host:port>` CLI arguments.
+- It connects out (TCP, loopback) to both addresses.
+- Frames on the comm channel are length-prefixed msgpack: a 4-byte
+ big-endian length followed by the msgpack payload. Requests are
+ `[id, body]`; responses are `[id, body, error]`.
+- Two workloads share one channel, distinguished by the first inbound
+ frame: `DagFileParseRequest` (one-shot, returns
+ `DagFileParsingResult` and exits) or `StartupDetails` (multi-round
+ task execution: the runtime sends `GetConnection` / `GetVariable` /
+ `GetXCom` / `SetXCom` and terminates with `SucceedTask` or
+ `TaskState`).
+- The logs channel carries structured JSON log records emitted by the
+ runtime.
+
+The Python-side launcher is
+[`ExecutableCoordinator`](../../task-sdk/src/airflow/sdk/coordinators/executable/coordinator.py),
+which already builds command lines of the form
+`<binary> --comm=<addr> --logs=<addr>` for both `dag_parsing_runtime_cmd`
+and `task_execution_runtime_cmd`. The bundle-spec contract
+([`task-sdk/docs/executable-bundle-spec.rst`](../../task-sdk/docs/executable-bundle-spec.rst))
+ratifies that any compiled SDK shipping a bundle "MUST honour the
+SDK coordinator protocol (`--comm=<addr>` / `--logs=<addr>`
+socket-based IPC)". The Java SDK satisfies this contract; the Go SDK
+currently does not.
+
+The two protocols target different deployment shapes:
+
+- **go-plugin / Edge Worker.** The Go-native worker is itself a long-running
+  process that launches bundles as child processes and dispatches tasks to
+  them over gRPC. It is the only consumer that speaks go-plugin to a Go
+ bundle today, and it owns the full task-runtime stack on the worker
+ host (no Python in the data path). This is the path
+ [`go-sdk/example/bundle/main.go`](../example/bundle/main.go) was
+ written for and the path that
+ [`pkg/worker`](../pkg/worker) drives.
+- **Coordinator / `ExecutableCoordinator`.** The Python task
+ runner forks a child that runs `<binary> --comm=… --logs=…`,
+ bridges its socket to the Airflow supervisor's fd 0, and proxies
+ Airflow service calls (`GetConnection`, `GetVariable`, ...) through
+ to the Execution API. This is how Airflow runs non-Python tasks
+ *without* a per-language worker — the same way Java runs today, and
+ the same way Rust/C++/Zig will run in the future. It is also the
+ only path the executable provider's bundle spec recognises.
+
+Today these two paths require two different binaries, even though the
+DAG/task definitions, the registry, the worker plumbing, and the
+serialisation surfaces overlap almost entirely. That is the gap this
+ADR closes.
+
+The user-written `main()` is one line —
+`bundlev1server.Serve(&myBundle{})` — and we want to keep it one line.
+Whichever protocol the binary should speak must be decided inside
+`Serve` based on how it was invoked, not by branching in user code.
+
+## Decision
+
+Make the SDK bundle binary **dual-mode**. A single
+`bundlev1server.Serve(bundle, opts...)` call dispatches to one of two
+protocol servers based on its CLI arguments and process environment.
+User code does not change.
+
+### Invocation matrix
+
+`Serve` evaluates the triggers below in order via a `decideMode`
+switch (`server.go`); the first match wins, and go-plugin is the
+default when no flag selects another mode.
+
+| # | Trigger | Mode | Behaviour |
+|---|---------|------|-----------|
+| 1 | `--airflow-metadata` | metadata-dump | The single introspection flag (ADR 0001 / ADR 0002, `server.go`). Prints the bundle's `airflow-metadata.yaml` spec as JSON and exits; this is the flag `airflow-go-pack` execs to populate the manifest. |
+| 2 | `--comm=<host:port> --logs=<host:port>` | **coordinator** | New. Speaks the msgpack-over-IPC coordinator protocol. Both flags are required. |
+| 3 | exactly one of `--comm` / `--logs` | error | Partial coordinator selection is a hard error (`ErrCoordinatorFlagsIncomplete`), returned to `main` so the caller exits non-zero with usage rather than silently falling back to go-plugin. |
+| 4 | none of the above (default) | go-plugin | Existing behaviour. Falls through to `plugin.Serve`, which performs the `AIRFLOW_BUNDLE_MAGIC_COOKIE` handshake and serves `DagBundle` gRPC to the Edge Worker. Running the binary by hand outside an Edge Worker fails the handshake with a diagnostic. |
+
+The two server modes share the same `bundlev1.BundleProvider`
+implementation and the same lazy `RegisterDags` recorder cache that
+`impl.server` already maintains (`impl/plugin.go:99-121`). Only the
+front door changes.
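
The ordering in the matrix can be illustrated with a small sketch. The mode names match the `Serve` listing later in this ADR, but this signature is an assumption: the shipped `decideMode` takes no arguments and reads the parsed flags itself.

```go
package main

type mode int

const (
	modePlugin mode = iota
	modeMetadataDump
	modeCoordinator
	modeCoordinatorUsageError
)

// decideMode applies the invocation matrix in order; the first match wins.
// The parameters stand in for the parsed --airflow-metadata, --comm, and
// --logs flags (hypothetical signature, for illustration only).
func decideMode(metadata bool, comm, logs string) mode {
	switch {
	case metadata:
		return modeMetadataDump
	case comm != "" && logs != "":
		return modeCoordinator
	case comm != "" || logs != "":
		// Exactly one of the two flags was given.
		return modeCoordinatorUsageError
	default:
		return modePlugin
	}
}
```

Because the metadata check comes first, `--airflow-metadata` combined with a lone `--comm` still dumps metadata instead of reporting a usage error.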
+
+### Coordinator mode: protocol details
+
+When `Serve` enters coordinator mode it:
+
+1. **Parses and validates the addresses.** Both `--comm` and `--logs`
+ are `host:port` strings. `127.0.0.1` is the only host the coordinator
+ protocol is designed for, but we do not pin it — the value is whatever
+ `_runtime_subprocess_entrypoint` chose on the Python side.
+
+2. **Connects out** to the comm address, then to the logs address. Both
+ are TCP. We dial; we do not listen. The launcher already has both
+ listeners up before exec'ing the binary
+   ([java-sdk ADR 0004, "What the Base Class Handles Automatically"](../../java-sdk/adr/0004-dag-parsing.md#what-the-base-class-handles-automatically)).
+
+3. **Routes structured logs to the logs socket.** A new
+ `slog.Handler` writes JSON-line records (one record per line, UTF-8,
+ newline-terminated) to the logs connection, replacing the
+ `hclog`/stderr handler used in go-plugin mode. `slog.SetDefault` is
+ called before any user code runs so `log` arguments injected into
+ tasks land on the right channel. On disconnect the handler falls
+ back to stderr so the binary never deadlocks on a closed sink.
+
+4. **Reads the first comm frame and dispatches by message type.** The
+ first frame's body has a `type` field per the Java SDK's encoding
+   ([java-sdk ADR 0002, "Task SDK Protocol Messages"](../../java-sdk/adr/0002-workload-execution.md#task-sdk-protocol-messages)).
+ Two values are valid here:
+
+ - `DagFileParseRequest` → DAG-parsing one-shot.
+ - `StartupDetails` → task execution.
+
+   Any other type is answered with an error frame to the supervisor,
+   followed by `os.Exit(1)`.
+
+#### DAG-parsing path (`DagFileParseRequest` → `DagFileParsingResult`)
+
+```text
+Supervisor                              Bundle binary (Go)
+    │                                       │
+    ├── [4B len][msgpack: id, ─────────────►│
+    │    {type: "DagFileParseRequest",      │
+    │     file: "<bundle path>"}]           │
+    │                                       │
+    │                                       ├── BundleProvider.RegisterDags(reg)
+    │                                       │   (cached, same as gRPC path)
+    │                                       │
+    │                                       ├── serialise(reg) →
+    │                                       │   DagFileParsingResult
+    │                                       │   in DagSerialization v3 JSON
+    │                                       │   (see java-sdk ADR-0004)
+    │                                       │
+    │◄────────────────[4B len][msgpack: ────┤
+    │    id, {type: "DagFileParsingResult", │
+    │         fileloc: "...",               │
+    │         serialized_dags: [...] }]     │
+    │                                       │
+    │                                       └── close + exit(0)
+```
+
+The serialised DAG payload must match Python's `SerializedDAG.serialize_dag`
+output **exactly**, including the `__type` / `__var` wrapping rules,
+unwrapping of "non-decorated" fields (`start_date`, `end_date`, `tags`),
+and the timetable encoding listed in
+[java-sdk ADR 0004, "DagFileParsingResult Format"](../../java-sdk/adr/0004-dag-parsing.md#dagfileparsingresult-format).
+The Go SDK gains a `serde` package that performs this encoding from
+`bundlev1.Bundle` / `bundlev1.Task`, validated against
+`validation/serialization/test_dags.yaml` (the same fixture set the Java
+SDK uses), so the Go and Java outputs are byte-identical for shared
+inputs.
+
+#### Task-execution path (`StartupDetails` → multi-round → `SucceedTask` / `TaskState`)
+
+```text
+Supervisor                              Bundle binary (Go)
+    │                                       │
+    ├── StartupDetails ────────────────────►│
+    │    (ti, dag_rel_path, bundle_info,    │
+    │     start_date, ti_context)           │
+    │                                       │
+    │                                       ├── lookup task:
+    │                                       │   bundle.dags[ti.dag_id]
+    │                                       │     .tasks[ti.task_id]
+    │                                       │   (returns TaskState{state:"removed"}
+    │                                       │   if not found, mirroring Java)
+    │                                       │
+    │                                       ├── construct sdk.Client whose
+    │                                       │   GetConnection / GetVariable /
+    │                                       │   GetXCom / SetXCom calls block on
+    │                                       │   request/response over the
+    │                                       │   comm socket
+    │                                       │
+    │◄── GetConnection(conn_id) ────────────┤
+    ├── ConnectionResult ──────────────────►│
+    │◄── GetVariable(key) ──────────────────┤
+    ├── VariableResult ────────────────────►│
+    │◄── GetXCom(...) ──────────────────────┤
+    ├── XComResult ────────────────────────►│
+    │◄── SetXCom(...) ──────────────────────┤
+    ├── (empty response) ──────────────────►│
+    │                                       │
+    │                                       ├── task fn returns:
+    │                                       │     err == nil → SucceedTask
+    │                                       │     err != nil → TaskState{"failed"}
+    │                                       │     (panic recovered → "failed")
+    │                                       │
+    │◄── SucceedTask / TaskState ───────────┤
+    │                                       │
+    │                                       └── close + exit(0)
+```
+
+Concretely, this reuses
+[`pkg/worker.Worker`](../pkg/worker/runner.go) for task lookup and
+parameter injection — `extract(ctx, sdk.Client, *slog.Logger)`,
+`transform(ctx, sdk.VariableClient, *slog.Logger)`, and `load() error`
+in the example bundle work unchanged. The injected `sdk.Client`
+implementation is swapped: in go-plugin mode it talks to the Execution
+API directly via the URL from viper (`impl/plugin.go:182`), in
+coordinator mode it talks to the supervisor over the comm socket.
+Both implement the same `sdk.Client` / `sdk.VariableClient` interfaces,
+so user task code is identical between the two modes.
+
+The `(panic recovered → "failed")` step in the diagram is
+`pkg/worker.Worker.ExecuteTaskWorkload`'s existing `defer recover()` block
+(`runner.go:295-311`), which logs the panic and calls
+`reportStateFailed`. Because both modes reuse the same `Worker`,
+this behaviour is identical in go-plugin mode and coordinator mode;
+it is not a coordinator-only invention.
+
+Frame correlation, error envelopes, and request `id` numbering follow
+java-sdk ADR 0002 verbatim. Re-implementing rather than reusing those
+is a deliberate cost of having a separate Go runtime; the validation
+fixtures keep the encoders honest.
+
+### go-plugin mode: unchanged
+
+When neither `--airflow-metadata` nor `--comm`/`--logs` is set, `Serve`
+falls through to the existing call site:
+
+```go
+plugin.Serve(&plugin.ServeConfig{
+	HandshakeConfig: shared.Handshake,
+	Plugins:         plugin.PluginSet{"dag-bundle": &impl.BundleGRPCPlugin{...}},
+	GRPCServer:      plugin.DefaultGRPCServer,
+})
+```
+
+The handshake env var (`AIRFLOW_BUNDLE_MAGIC_COOKIE`) gates the path
+the same way it does today, so an Edge Worker that execs the binary
+gets exactly the same protocol it gets today. The `DagBundle` gRPC
+service, the registry cache, and the worker injection in
+[`impl/plugin.go:178`](../bundle/bundlev1/bundlev1server/impl/plugin.go)
+are untouched. (`--airflow-metadata` itself is extended to emit the full
+bundle spec per ADR 0002, but the go-plugin path does not depend on its
+output.)
+
+### Code organisation
+
+A new internal package
+`go-sdk/bundle/bundlev1/bundlev1server/impl/coord` owns the
+coordinator-mode server: frame codec, log-sink handler, dag-parse
+handler, task-execution handler, and the `sdk.Client` adapter that
+proxies to the comm socket. It depends on a new
+`go-sdk/bundle/bundlev1/serde` package for DagSerialization v3
+encoding. The frame codec (length prefix plus request/response
+envelope) is small enough to keep first-party. For the msgpack
+encoding itself we use
+[`github.com/vmihailenco/msgpack/v5`](https://github.com/vmihailenco/msgpack)
+internally, without exposing it at the API surface.
+
+`bundlev1server.Serve` becomes:
+
+```go
+func Serve(bundle bundlev1.BundleProvider, opts ...ServeOpt) error {
+	config.SetupViper("")
+	flag.Parse()
+
+	switch decideMode() {
+	case modeMetadataDump:
+		// --airflow-metadata: full spec JSON (ADR 0002)
+		return dumpBundleMetadata(bundle)
+	case modeCoordinator:
+		return coord.Serve(bundle, *commAddr, *logsAddr) // NEW
+	case modePlugin:
+		return servePlugin(bundle) // existing go-plugin default
+	case modeCoordinatorUsageError:
+		return ErrCoordinatorFlagsIncomplete // partial --comm/--logs
+	}
+	return nil
+}
+```
+
+User code (`main.go`) is the same one line:
+
+```go
+func main() { bundlev1server.Serve(&myBundle{}) }
+```
+
+## Consequences
+
+### Capability gains
+
+- A single binary built from one `bundlev1server.Serve` entry point now
+ runs under both the Go-native Edge Worker (go-plugin) and the
+ Python-native task runner via `ExecutableCoordinator`
+ (msgpack-over-IPC). Authors do not pick a deployment shape at build
+ time.
+- The bundle artefact produced by `airflow-go-pack` (ADR 0002, as
+ revised by [ADR 0004](0004-self-contained-executable-bundle.md))
+  becomes spec-conformant
+  ([`task-sdk/docs/executable-bundle-spec.rst`](../../task-sdk/docs/executable-bundle-spec.rst))
+ without further changes, because the binary now honours
+ `--comm=<addr>`/`--logs=<addr>` as the spec demands.
+- Mixed-language pipelines (Python `@task.stub` DAGs delegating to a Go
+ task) work without a Go worker on the executor host — the same
+ coordinator the Java SDK rides on now carries Go.
+
+### Compatibility
+
+- The go-plugin path is unchanged at the wire and at the source level.
+ Existing Edge Worker deployments do not need to be rebuilt or
+ reconfigured. The protocol selector keys off CLI flags and the
+ go-plugin magic-cookie env var, both of which the Edge Worker
+ already sets.
+- `--airflow-metadata` remains the only introspection flag; extending it
+ to emit the full bundle spec (ADR 0002) is additive and does not affect
+ the go-plugin path. Adding a binary with this ADR's changes into an
+ older Edge Worker deployment is safe; adding an older binary into an
+ `ExecutableCoordinator` deployment fails fast with a clear "unknown
+ flag: --comm" stderr message rather than hanging.
+
+### New ongoing costs
+
+- The Go SDK now owns a second wire protocol. Encoder drift between
+ Python's `SerializedDAG.serialize_dag` and the Go `serde` package is
+ the largest maintenance hazard. We mitigate it by sharing
+ `validation/serialization/test_dags.yaml` with the Java SDK and
+ running the same `compare.py` step in CI for Go output.
+- The Task SDK message catalogue (`GetConnection`, `GetVariable`,
+ `GetXCom`, `SetXCom`, `SucceedTask`, `TaskState`,
+ `ConnectionResult`, `VariableResult`, `XComResult`, `ErrorResponse`,
+ `StartupDetails`, `DagFileParseRequest`, `DagFileParsingResult`) is
+ duplicated from the Java SDK's Kotlin definitions. Schema changes on
+ the Python side need both SDKs updated together; a single
+ `task-sdk/protocol/` JSON-schema source of truth is a reasonable
+ follow-up but is out of scope here.
+- A new transitive dependency on `vmihailenco/msgpack/v5`. It is
+ pure-Go and stable; the cost is acceptable.
+- The `sdk.Client` interface gains a second backend (comm-socket).
+ Tests that previously injected a fake `sdk.VariableClient` (see
+ [`example/bundle/main_test.go`](../example/bundle/main_test.go)) keep
+ working unchanged — the swap is below the SDK surface.
+
+### Out of scope
+
+- The logs channel format. We emit JSON-line records to match the Java
+ SDK; a richer protocol (severity-aware framing, attachment of trace
+ ids) is deferred until the Python supervisor side standardises one.
+- OTel context propagation. The `context_carrier` field on
+ `TaskInstance` is still TODO in
+ [`impl/plugin.go:151`](../bundle/bundlev1/bundlev1server/impl/plugin.go#L151)
+ and remains TODO in coordinator mode for now.
+- A Go-side equivalent of the Java SDK's `Supervisor.kt` (the
+ no-Python-in-the-loop execution path). The Edge Worker already fills
+ that role for Go via go-plugin; we do not need a second one.
diff --git a/go-sdk/adr/0004-self-contained-executable-bundle.md b/go-sdk/adr/0004-self-contained-executable-bundle.md
new file mode 100644
index 00000000000..9ee9dd70624
--- /dev/null
+++ b/go-sdk/adr/0004-self-contained-executable-bundle.md
@@ -0,0 +1,377 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# 4. Self-contained executable bundle (footer-embedded source and metadata)
+
+Date: 2026-05-04
+
+## Status
+
+Accepted. Supersedes the ZIP-archive container portion of
+[ADR 0001](0001-bundle-packing-options.md) and the ZIP output sketched
+in [ADR 0002](0002-use-go-tool-directive-for-bundle-packer.md). The
+packer mechanism (Option A standalone packer + Option D introspection
+contract + Option H `tool` directive) is unchanged; only the artefact
+the packer writes is changed.
+
+## Context
+
+ADR 0001 / ADR 0002 picked a ZIP archive as the bundle container,
+following the executable provider's
+[`task-sdk/docs/executable-bundle-spec.rst`](../../task-sdk/docs/executable-bundle-spec.rst).
+A conforming bundle in that earlier design was `bundle.zip` with three
+required entries: `airflow-metadata.yaml`, the primary DAG source file,
+and the compiled executable. (This ADR is what changed the container to
+the footer format the spec now documents.)
+
+That layout has three properties we want to preserve:
+
+1. **Discovery without execution.** The scanner must be able to read
+ `dag_id` / `task_id` and the SDK language/version from a bundle on
+ disk without running the binary. ADR 0002 already enforces this —
+ `airflow-go-pack` runs the binary once at build time, captures its
+ `--airflow-metadata` output into the manifest, and the scanner reads
+ the manifest at deploy time.
+2. **Source available for the UI.** The Airflow UI's source-view
+ panel needs to render the DAG file. The current spec ships it as a
+ verbatim ZIP entry referenced by the manifest's `source` field.
+3. **Single deployment unit.** Drop one file in the coordinator's
+ `executables_root` and the scanner picks it up.
+
+What the ZIP container costs us:
+
+- **Two artefacts in flight.** `go build` produces a binary; the
+ packer wraps it into a ZIP. Anything that touches the binary after
+ it is wrapped (re-strip, re-sign, swap-in a debug build) drifts from
+ the manifest unless the wrapping is redone. The wrapping step is
+ cheap but the drift mode is real.
+- **A second container format on the consumer side.** The scanner
+  must open archives, find members by name, and materialise the
+  executable into a transient cache before the runtime can exec it.
+  That is `zipfile` on the Python side plus a per-bundle cache
+  directory.
+- **Inspection requires a different tool than running.** `unzip` to
+ inspect, then run; or run, then `unzip` to debug. Two muscle memories.
+
+Native-executable SDKs (Go, Rust, C++, Zig) all produce a single-file
+binary as their primary build output. The binary itself is already
+the only thing that has to land on the worker host to run a task. The
+manifest and the source file are small data the scanner needs but the
+runtime doesn't. Both can ride along in a footer appended to the
+binary, with the binary remaining a runnable executable.
+
+This is the same pattern self-extracting installers, `goreleaser`-style
+self-update images, and embedded-asset binaries already use: append
+data after the OS-recognised binary structure, leave a fixed-size
+trailer at the very end so a reader can locate the data, and validate
+with a magic value.
+
+The user-facing claim becomes "the executable *is* the bundle." A
+bundle directory looks like:
+
+```
+/opt/airflow/executable-bundles/
+├── example
+├── pipeline
+└── analytics
+```
+
+(Filenames follow OS conventions: no extension on Linux/macOS, `.exe`
+on Windows. The scanner identifies bundles by the trailer's magic,
+not by the filename.)
+
+## Decision
+
+Replace the bundle's ZIP container with a footer appended to the
+compiled executable. The executable's normal byte content is unchanged
+and it remains directly runnable; the footer is data that follows the
+last byte the OS loader cares about.
+
+### Footer layout
+
+A bundle file is laid out as:
+
+```text
++----------------------------------------------------------------------+
+| <native executable: ELF/Mach-O/PE,                                   |
+|  including any code-signing structures>                              |
++----------------------------------------------------------------------+ <- end of "binary" region
+| source bytes (variable length)     raw root source file, UTF-8,      |
+|                                    length = source_len; MAY be 0     |
++----------------------------------------------------------------------+
+| metadata bytes (variable length)   airflow-metadata.yaml content,    |
+|                                    UTF-8, length = metadata_len      |
++----------------------------------------------------------------------+
+| trailer (64 bytes, little-endian fixed layout):                      |
+|   bytes 0..3    source_len     u32                                   |
+|   bytes 4..7    metadata_len   u32                                   |
+|   bytes 8..11   footer_ver     u32 (= 1)                             |
+|   bytes 12..43  binary_sha256  32 bytes (SHA-256 of binary region)  |
+|   bytes 44..55  reserved       12 bytes, zero                        |
+|   bytes 56..63  magic          8 bytes ASCII "AFBNDL01"              |
++----------------------------------------------------------------------+ <- EOF
+```
+
+`AFBNDL01` is `0x41 0x46 0x42 0x4E 0x44 0x4C 0x30 0x31`. The two
+trailing ASCII digits are the footer-format version, repeated for human
+inspection (`tail -c 8 ./mybundle | xxd`); the binary `footer_ver`
+field is the source of truth for parsing.
+
+`binary_sha256` is computed over the binary region only — bytes
+`[0, source_start)` — because the hash field is itself in the trailer
+and cannot cover the bytes it occupies. It provides integrity (the
+binary region has not been truncated, corrupted, or naively edited
+between packing and exec), not authenticity (see "Out of scope" for
+how authenticity layers on top).
+
+Reader algorithm:
+
+1. Open the file. Seek to `EOF - 64`. Read 64 bytes.
+2. Compare bytes 56..63 against `AFBNDL01`. If different, the file is
+ not a bundle; the scanner ignores it.
+3. Parse `footer_ver`. If unknown, fail with a versioning error.
+4. Compute `metadata_start = filesize - 64 - metadata_len` and
+   `source_start = metadata_start - source_len`.
+5. Validate that `source_start >= 0` and that the implied "binary
+   region" (bytes `[0, source_start)`) is non-empty, before reading
+   anything at the computed offsets.
+6. Read `metadata_len` bytes from `metadata_start` for the manifest.
+7. Read `source_len` bytes from `source_start` for the source view.
+   If `source_len == 0`, no source is embedded; the UI falls back to
+   "(source not available)".
+8. Compute SHA-256 over `[0, source_start)` and compare to
+ `binary_sha256`. Mismatch is a hard failure: the scanner logs and
+ skips the file, the same way it would on a magic-check failure.
+ The result is cached by `(path, inode, mtime, size)` so the
+ runtime does not re-hash on every exec; a cache miss (file
+ replaced, mtime bumped) triggers re-verification.
+
+Ordering note: source comes *before* metadata so a future
+`footer_ver` can introduce extra trailing blobs (e.g. signed
+checksums, compressed deps) by extending the trailer rather than
+inserting between existing blobs.
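
A minimal in-memory sketch of the footer layout and the reader algorithm, assuming the whole file fits in a byte slice. `appendFooter`, `parseFooter`, and the error values are hypothetical names; a real scanner would seek from the end of the file and cache the hash result as described above.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"errors"
)

const (
	trailerLen = 64
	footerVer  = 1
	magic      = "AFBNDL01"
)

var errNotBundle = errors.New("not a bundle: trailer magic missing")

// appendFooter returns binary region + source + metadata + 64-byte trailer.
func appendFooter(binaryRegion, source, metadata []byte) []byte {
	sum := sha256.Sum256(binaryRegion)
	var t [trailerLen]byte
	binary.LittleEndian.PutUint32(t[0:4], uint32(len(source)))
	binary.LittleEndian.PutUint32(t[4:8], uint32(len(metadata)))
	binary.LittleEndian.PutUint32(t[8:12], footerVer)
	copy(t[12:44], sum[:])
	// Bytes 44..55 are reserved and stay zero.
	copy(t[56:64], magic)

	out := make([]byte, 0, len(binaryRegion)+len(source)+len(metadata)+trailerLen)
	out = append(out, binaryRegion...)
	out = append(out, source...)
	out = append(out, metadata...)
	return append(out, t[:]...)
}

// parseFooter checks the magic, version, offsets, and binary_sha256, then
// returns the embedded source and metadata bytes.
func parseFooter(file []byte) (source, metadata []byte, err error) {
	if len(file) < trailerLen {
		return nil, nil, errNotBundle
	}
	t := file[len(file)-trailerLen:]
	if string(t[56:64]) != magic {
		return nil, nil, errNotBundle
	}
	if v := binary.LittleEndian.Uint32(t[8:12]); v != footerVer {
		return nil, nil, errors.New("unsupported footer_ver")
	}
	sourceLen := int64(binary.LittleEndian.Uint32(t[0:4]))
	metadataLen := int64(binary.LittleEndian.Uint32(t[4:8]))
	metadataStart := int64(len(file)) - trailerLen - metadataLen
	sourceStart := metadataStart - sourceLen
	if sourceStart <= 0 {
		return nil, nil, errors.New("footer lengths leave no binary region")
	}
	sum := sha256.Sum256(file[:sourceStart])
	if !bytes.Equal(sum[:], t[12:44]) {
		return nil, nil, errors.New("binary_sha256 mismatch")
	}
	return file[sourceStart:metadataStart], file[metadataStart : int64(len(file))-trailerLen], nil
}
```

The offsets are widened to `int64` before subtraction so that oversized length fields produce a negative `sourceStart` and are rejected instead of wrapping around.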
+
+### Manifest schema changes
+
+The manifest content is the same YAML as today, with two field-level
+changes that follow from the footer container:
+
+- **Drop `executable`.** The binary *is* the file; there is no
+ archive-relative path to record.
+- **Redefine `source` as a display filename, not a path.** The source
+ bytes live in the footer; the manifest's `source` field carries the
+ original filename (e.g. `example.go`) so the UI can show it as a
+ filename in the source-view panel and pick a syntax-highlighting
+ mode from the extension.
+
+Everything else (`airflow_bundle_metadata_version`, `sdk.language`,
+`sdk.version`, `sdk.supervisor_schema_version`, `dags`, the
+open-additivity rule for unknown keys) is unchanged.
+
+### Build pipeline
+
+The packer's behaviour from ADR 0002 changes only at the final write
+step:
+
+1. Resolve target package, locate the file with `func main()`. (No
+ change.)
+2. Run `go build [forwarded flags] -o <out> <pkg>`. (No change.)
+3. Exec the freshly built binary with `--airflow-metadata` to obtain
+ the manifest. (No change.)
+4. **New:** read the source file's bytes; serialise the manifest to
+ YAML; compute `binary_sha256 = SHA-256(<out>)` over the entire
+ on-disk file as it stands after step 2 (which *is* the binary
+ region — nothing has been appended yet); assemble the trailer
+ with the resulting digest; append `<source><metadata><trailer>`
+ to `<out>`.
+5. Default output path becomes `<bundleName>` (or `<bundleName>.exe`
+ on Windows), not `<bundleName>.zip`.
+
+Ordering against post-build steps:
+
+- **Strip:** must run *before* append. Stripping a file that already
+ has a footer either leaves the footer intact (most strip
+ implementations stop at the OS-defined end of the binary) or
+ truncates it; do not rely on either.
+- **Code-sign (optional):** the bundle format does not require OS
+ code-signing. The embedded `binary_sha256` provides integrity, and
+ Airflow's threat model treats `executables_root` as
+ Deployment-Manager-controlled — authenticity is a deployment-time
+ concern, not a bundle-format one. Deployment Managers who want
+ OS-enforced load gating (macOS `codesign` / `rcodesign`, Linux
+ fs-verity, IMA) layer it on top of the bundle: sign *after* the
+ footer append so the signature covers the trailer along with the
+ binary region. Windows Authenticode is incompatible with this
+ layout (see "Out of scope") but does not block Windows as a
+ bundle target.
+- **Compressors (UPX, etc.):** unsupported. UPX rewrites the file end
+ to end, destroying the trailer. Bundle binaries should not be
+ compressor-wrapped; this matches typical production deployment
+ practice.
+
+Determinism: the footer is byte-identical for byte-identical inputs
+(source bytes, manifest YAML, layout), so a deterministic `go build`
+plus a deterministic manifest serialisation produces a byte-identical
+bundle file. We canonicalise the manifest as sorted-key YAML at write
+time to avoid map-order non-determinism on the Go side.
+
+### Cross-language scope
+
+The bundle spec is language-agnostic by design. Every native-SDK
+language we currently target (Go, Rust, C++, Zig) emits a single-file
+native executable; appending a fixed-format footer is a few lines of
+code in each. The technique works because the OS loader stops at the
+format-defined end of image (ELF section/segment extents, Mach-O
+`LC_SEGMENT` extents) — it does not depend on the binary being
+statically linked, so dynamically-linked Rust or C++ artefacts on
+Linux take the footer cleanly. The footer layout above is the
+contract every SDK packer implements; the consumer-side scanner reads
+it identically regardless of source language.
+
+Interpreted languages without a single binary artefact are out of
+scope for the executable provider and therefore for this ADR.
+
+### Consumer-side changes
+
+The scanner currently iterates `*.zip` in `executables_root` and opens
+each as an archive. It now iterates *all* regular files, reads the
+last 64 bytes of each, and treats files whose magic matches as
+bundles. Files without the magic are silently ignored (so a stray
+README in the directory does not fail the scan). Files with matching
+magic are SHA-256-verified per the reader algorithm; a mismatch
+demotes the file back to "ignored, with an error log."
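+
+The magic check described above can be sketched as follows. The
+scanner itself is Python-side; the sketch is in Go only to stay in
+the same language as the packer. `looksLikeBundle` is a hypothetical
+name, and the sketch covers only the cheap first pass (64-byte read
+plus magic comparison), not the `footer_ver`, bounds or SHA-256
+checks that follow it.
+
+```go
+package main
+
+import (
+	"bytes"
+	"fmt"
+	"os"
+)
+
+const (
+	trailerSize = 64         // bytes read from the end of every candidate file
+	magic       = "AFBNDL01" // expected final 8 bytes of a bundle
+)
+
+// looksLikeBundle reports whether path is a regular file whose last
+// 8 bytes are the bundle magic. Short or non-regular files are "not a
+// bundle", not an error, so stray files do not fail the scan.
+func looksLikeBundle(path string) (bool, error) {
+	f, err := os.Open(path)
+	if err != nil {
+		return false, err
+	}
+	defer f.Close()
+
+	info, err := f.Stat()
+	if err != nil {
+		return false, err
+	}
+	if !info.Mode().IsRegular() || info.Size() < trailerSize {
+		return false, nil
+	}
+
+	trailer := make([]byte, trailerSize)
+	n, err := f.ReadAt(trailer, info.Size()-trailerSize)
+	if n != trailerSize {
+		return false, err // short read: file changed underneath us
+	}
+	return bytes.HasSuffix(trailer, []byte(magic)), nil
+}
+
+func main() {
+	for _, p := range os.Args[1:] {
+		ok, err := looksLikeBundle(p)
+		fmt.Println(p, ok, err)
+	}
+}
+```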
+
+The runtime no longer has to materialise an executable from an
+archive. It execs the bundle file directly, which removes the
+transient cache directory and the chmod-after-extract step from
+the spec. The integrity check is run by the scanner at discovery
+time and cached by `(path, inode, mtime, size)`, so the exec hot
+path does not re-hash.
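+
+The verification cache keyed by `(path, inode, mtime, size)` might
+look like the sketch below. It is written in Go for consistency with
+the packer, although the real scanner is Python-side. `cacheKey`,
+`verifyCache` and the `verify` callback are hypothetical names, the
+inode lookup via `syscall.Stat_t` assumes a Unix-like host, and
+stale entries are never evicted here.
+
+```go
+package main
+
+import (
+	"fmt"
+	"os"
+	"syscall"
+)
+
+// cacheKey mirrors the (path, inode, mtime, size) tuple from the text.
+type cacheKey struct {
+	path  string
+	inode uint64
+	mtime int64 // modification time in nanoseconds
+	size  int64
+}
+
+// keyFor stats the file and builds its cache key (Unix-like hosts only).
+func keyFor(path string) (cacheKey, error) {
+	info, err := os.Stat(path)
+	if err != nil {
+		return cacheKey{}, err
+	}
+	st, ok := info.Sys().(*syscall.Stat_t)
+	if !ok {
+		return cacheKey{}, fmt.Errorf("no inode information for %s", path)
+	}
+	return cacheKey{path, uint64(st.Ino), info.ModTime().UnixNano(), info.Size()}, nil
+}
+
+// verifyCache remembers verification results per cache key.
+type verifyCache struct {
+	results map[cacheKey]bool
+}
+
+func newVerifyCache() *verifyCache {
+	return &verifyCache{results: make(map[cacheKey]bool)}
+}
+
+// check runs verify only when the file's key has not been seen before,
+// so a replaced or modified file (new inode, mtime or size) is re-verified.
+func (c *verifyCache) check(path string, verify func(string) bool) (bool, error) {
+	key, err := keyFor(path)
+	if err != nil {
+		return false, err
+	}
+	if ok, hit := c.results[key]; hit {
+		return ok, nil
+	}
+	ok := verify(path)
+	c.results[key] = ok
+	return ok, nil
+}
+
+func main() {
+	cache := newVerifyCache()
+	verify := func(p string) bool {
+		fmt.Println("hashing", p) // stands in for the SHA-256 check
+		return true
+	}
+	for _, p := range os.Args[1:] {
+		cache.check(p, verify) // first call hashes
+		cache.check(p, verify) // second call is served from the cache
+	}
+}
+```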
+
+## Consequences
+
+### What this buys
+
+- **One artefact.** No `.zip` wrapper around a binary; the binary is
+ the deployment unit. `cp ./mybundle /opt/airflow/executable-bundles/`
+ is the deploy workflow.
+- **No drift between binary and manifest.** They are produced and
+ committed in the same step and physically attached.
+- **Atomic deploy.** A partially written file fails the magic or
+ `binary_sha256` check; the scanner skips it cleanly instead of
+ seeing half a manifest.
+- **Integrity built in.** `binary_sha256` catches truncation,
+ in-flight corruption, and naive tampering without any external
+ signing infrastructure. Authenticity (signed-by-trusted-identity)
+ is a separate concern that Deployment Managers can layer on top.
+- **Smaller consumer surface.** No `archive/zip` dependency, no
+ per-bundle cache directory, no chmod-after-extract path, no
+ external-attributes handling for the executable bit.
+- **Simpler runtime.** Exec the file directly.
+
+### What this costs
+
+- **Inspection needs a tool.** With ZIP, `unzip -p bundle.zip
+ airflow-metadata.yaml` worked from any shell. With the footer
+ format, operators need a small CLI (`go tool airflow-go-pack inspect
+ ./mybundle` or equivalent) to dump the manifest and source. Cheap
+ to implement; the obligation is "ship it alongside the packer."
+- **Build pipeline ordering matters.** Strip-then-append-then-sign is
+ the only correct order. Documented in the packer and in this ADR;
+ failure modes (stripped trailer, signature over the wrong bytes)
+ are loud (magic check fails, signature verification fails) rather
+ than silent.
+- **Compressor incompatibility.** UPX and similar are not supported
+ for bundle binaries. Acceptable; production deployments do not
+ typically compress executables this way.
+- **Magic-collision handling.** A non-bundle file in
+ `executables_root` whose last 8 bytes happen to be `AFBNDL01` would
+ briefly look like a bundle. For arbitrary trailing bytes the chance
+ of matching a fixed 8-byte string is about 2^-64, and the
+ subsequent `footer_ver` / bounds-check / SHA-256 verification
+ rejects any such file.
+- **TOCTOU between verify and `execve`.** The scanner verifies
+ `binary_sha256` at discovery time; the runtime later execs the
+ same path. In between, a write to the file would not be caught
+ until the cache key (`path`, `inode`, `mtime`, `size`) changes and
+ the scanner re-verifies. Acceptable for v1 because the threat model
+ treats `executables_root` as Deployment-Manager-controlled;
+ Deployment Managers who need stronger guarantees apply
+ OS-enforced load gating (fs-verity, IMA, codesign) on top.
+- **Footer format is now a wire format the SDK has to keep stable.**
+ `footer_ver = 1` is the only currently defined value; future
+ versions append fields after the version field but before the
+ reserved region, or use the reserved region. Older readers reject
+ unknown `footer_ver` rather than guessing.
+
+### Out of scope
+
+- **Authenticated bundle signatures.** The footer provides integrity
+ (`binary_sha256`), not authenticity. Any deployment-time signature
+ flow that wants to attest "this bundle was produced by entity X"
+ (macOS `codesign` / `rcodesign`, Linux IMA, fs-verity policy) is
+ layered on top of the bundle file and is out of scope of this ADR.
+ A signed footer field could be added in a future `footer_ver` if
+ the SDK ever needs to ship its own trust anchor, but doing so now
+ would force key-management decisions Airflow does not currently
+ need to make.
+- **Authenticode-signed Windows bundles.** The footer-after-EOF
+ layout runs fine on Windows for *execution* — the OS loader stops
+ at the PE size-of-image, identical to ELF/Mach-O behaviour, and
+ `binary_sha256` covers the binary region the same way. What does
+ not work is layering Authenticode on top: PE Authenticode stores
+ its signature in the certificate table referenced from the
+ Optional Header, and Microsoft's `EnableCertPaddingCheck`
+ hardening (MS13-098) rejects extra bytes past the signature, so
+ `signtool` against an appended bundle (in either order) produces
+ a binary that strict-mode verification rejects. Deployment
+ Managers who need Authenticode-signed bundles on Windows must use
+ a different on-disk layout (storing source and manifest in a
+ dedicated PE resource section so they are inside the signed
+ image); that layout is tracked separately. The bundle format
+ itself supports Windows as a target.
+- **Multiple source files.** Only the root file (the file containing
+ `func main()`) is embedded. DAGs split across multiple source
+ files keep the rest of their sources outside the bundle; the UI
+ source-view shows only the entry file. Revisit if user feedback
+ requests broader source visibility.
+- **Compression of the source/metadata blobs.** Both are tiny
+ (kilobytes) next to the binary; deflating them adds reader
+ complexity for no measurable space win.
+
+## Implementation notes
+
+- The append step is `os.OpenFile(out, os.O_RDWR|os.O_APPEND, 0)` plus
+ three writes (source, metadata, trailer) followed by `Close`. No mmap
+ needed. The packer streams the binary region through `sha256.New()`
+ once before the append to compute `binary_sha256`; the digest is
+ the only state needed across the read and write phases.
+- The executable bit on the output file is set by `go build` itself.
+ The append step preserves it (we append to the existing file rather
+ than truncating and recreating it).
+- The packer's existing reproducibility guarantees (sorted entries,
+ fixed mtimes) reduce to "write a deterministic YAML manifest"; the
+ ZIP-specific concerns (entry ordering, entry mtimes, external
+ attributes) go away. `binary_sha256` is deterministic by
+ construction.
+- The Python-side scanner's bundle-detection helper lives next to
+ `BundleScanner`; it reads 64 bytes per file, parses the trailer,
+ then streams the binary region through `hashlib.sha256` to verify
+ `binary_sha256`. The verification result is cached by
+ `(path, inode, mtime, size)` so the runtime exec path does not
+ re-hash. Keep the helper tolerant of trailing whitespace or short
+ files (anything `< 64` bytes is not a bundle).