uranusjr commented on code in PR #68223: URL: https://github.com/apache/airflow/pull/68223#discussion_r3377867815
########## airflow-core/docs/authoring-and-scheduling/language-sdks/go.rst: ########## @@ -0,0 +1,433 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +.. _go-sdk: + +Go SDK +====== + +|experimental| + +The Go SDK lets you implement Airflow task logic in Go, with native access to the Airflow "model" +(Variables, Connections, and XCom). The Dag and its scheduling remain in Python; individual tasks delegate +to a compiled Go *bundle* that is launched by +:class:`~airflow.sdk.coordinators.executable.ExecutableCoordinator` for each task instance. + +Because Go is a compiled language, every task must be compiled ahead of time and registered inside a single, +self-contained native executable called a **bundle**. The bundle also embeds its Dag source and a metadata +manifest (the ``dag_id`` and ``task_id`` map) in a footer appended to the executable, so the executable *is* +the bundle: one runnable file to ship, with no separate manifest or archive. The +:ref:`airflow-go-pack <go-sdk/build>` tool builds and packs that bundle. + +.. contents:: Contents + :local: + :depth: 2 + +Prerequisites +------------- + +* Go 1.24 or later to build and pack bundles. This is a build-time requirement only; the worker that runs a + packed bundle needs no Go toolchain, because the bundle is a self-contained native executable. +* The packed bundle must be accessible from the Airflow worker, under a directory the coordinator scans. +* The ``apache-airflow-task-sdk`` package (installed with Airflow) provides the coordinator; no additional + Python packages are needed. + +Deployment modes +---------------- + +A packed bundle can run in two ways. The same binary works in both, and you pick one per deployment: + +* **Coordinator (recommended).** A Python task runner launches the Go bundle directly, with no separate Go + worker process on the host. This is the same coordinator mechanism the Java SDK uses. Because the mature + Python supervisor handles the Airflow-facing concerns, this path inherits remote task logs (S3/GCS), the + full range of task states, and alternate XCom backends, rather than implementing them again in Go. Those are + exactly the features the Edge Worker path is still missing. +* **Edge Worker.** A long-running Go process (``airflow-go-edge-worker``) polls Airflow for work and runs + your bundle, with no Python in the data path. It runs end-to-end today but is missing the features listed + under :ref:`go-sdk/limitations`. + +The rest of this guide covers the recommended coordinator path; see :ref:`go-sdk/edge-worker` for a summary +of the Edge Worker. + +Quick start +----------- + +The following example shows the minimal moving parts: a Python Dag with two stub tasks, and a Go +implementation of those tasks. + +Python Dag (the scheduling side) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: python + + from airflow.sdk import dag, task + + + @dag + def simple_dag(): + @task.stub(queue="golang") + def extract(): ... + + @task.stub(queue="golang") + def transform(): ... + + extract() >> transform() + + + simple_dag() + +``@task.stub`` declares the *shape* of the Go tasks (their names and dependencies) without any Python +implementation. The ``queue`` value routes the task to the Go coordinator. + +Go implementation +~~~~~~~~~~~~~~~~~~ + +A task is an ordinary Go function. The runtime inspects its signature and injects arguments by type, so each +task declares only the parameters it needs. + +.. code-block:: go + + import ( + "context" + "log/slog" + "runtime" + + "github.com/apache/airflow/go-sdk/sdk" + ) + + func extract(ctx context.Context, client sdk.Client, log *slog.Logger) (any, error) { + conn, err := client.GetConnection(ctx, "test_http") + if err != nil { + return nil, err + } + log.Info("fetched connection", "host", conn.Host) + // ... do work, honour ctx cancellation ... + return map[string]any{"go_version": runtime.Version()}, nil + } + + func transform(ctx context.Context, client sdk.VariableClient, log *slog.Logger) error { + val, err := client.GetVariable(ctx, "my_variable") + if err != nil { + return err + } + log.Info("obtained variable", "my_variable", val) + return nil + } + +.. note:: + + As with the other language SDKs, XCom *dependencies* are declared in the Python stub Dag (they define task + order). The value must still be read explicitly in Go via ``client.GetXCom``, and produced either by the Review Comment: Is it possible to use dynamic arguments like Java’s annotation syntax? Since Go already allows returning a value to automatically push, it would be best if it also supports accepting an argument to automatically pull. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
