This is an automated email from the ASF dual-hosted git repository.
weilee pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/main by this push:
new 129911b2bf add section "Manipulating queued dataset events through REST API" (#41022)
129911b2bf is described below
commit 129911b2bf9ce4fe1fd3d377e9e1443c27f5ceb7
Author: Wei Lee <[email protected]>
AuthorDate: Mon Jul 29 09:37:37 2024 +0800
add section "Manipulating queued dataset events through REST API" (#41022)
---
.../authoring-and-scheduling/datasets.rst | 28 ++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/docs/apache-airflow/authoring-and-scheduling/datasets.rst b/docs/apache-airflow/authoring-and-scheduling/datasets.rst
index 25d6fb90d4..53b4ad38cd 100644
--- a/docs/apache-airflow/authoring-and-scheduling/datasets.rst
+++ b/docs/apache-airflow/authoring-and-scheduling/datasets.rst
@@ -321,6 +321,34 @@ Example:
Note that this example is using `(.values() | first | first)
<https://jinja.palletsprojects.com/en/3.1.x/templates/#jinja-filters.first>`_
to fetch the first of one dataset given to the DAG, and the first of one
DatasetEvent for that dataset. An implementation can be quite complex if you
have multiple datasets, potentially with multiple DatasetEvents.
+Manipulating queued dataset events through REST API
+---------------------------------------------------
+
+.. versionadded:: 2.9
+
+In this example, the DAG ``waiting_for_dataset_1_and_2`` is triggered when tasks update both datasets "dataset-1" and "dataset-2". Once "dataset-1" is updated, Airflow creates a record so that it knows to trigger the DAG as soon as "dataset-2" is also updated. We call such records queued dataset events.
+
+.. code-block:: python
+
+ with DAG(
+ dag_id="waiting_for_dataset_1_and_2",
+ schedule=[Dataset("dataset-1"), Dataset("dataset-2")],
+ ...,
+ ):
+ ...
+
+
+``queuedEvent`` API endpoints are introduced to manipulate such records.
+
+* Get a queued Dataset event for a DAG: ``GET /dags/{dag_id}/datasets/queuedEvent/{uri}``
+* Get queued Dataset events for a DAG: ``GET /dags/{dag_id}/datasets/queuedEvent``
+* Delete a queued Dataset event for a DAG: ``DELETE /dags/{dag_id}/datasets/queuedEvent/{uri}``
+* Delete queued Dataset events for a DAG: ``DELETE /dags/{dag_id}/datasets/queuedEvent``
+* Get queued Dataset events for a Dataset: ``GET /datasets/queuedEvent/{uri}``
+* Delete queued Dataset events for a Dataset: ``DELETE /datasets/queuedEvent/{uri}``
+
+For details on how to use the REST API and the parameters required by these endpoints, refer to :doc:`Airflow API </stable-rest-api-ref>`.
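As a rough illustration (not part of this patch), the DAG-scoped endpoints above could be driven from Python roughly as follows. The base URL, the ``/api/v1`` prefix, and the basic-auth credentials are assumptions about a local deployment, not something this commit specifies:

```python
"""Sketch: composing queuedEvent requests for the Airflow stable REST API.

Assumed (not from this patch): Airflow at http://localhost:8080 with the
default /api/v1 base path and basic-auth credentials.
"""
import base64
import urllib.request
from urllib.parse import quote

BASE = "http://localhost:8080/api/v1"


def queued_event_path(dag_id: str, uri: str) -> str:
    # Path for a single queued Dataset event of one DAG; the dataset URI is
    # percent-encoded because it may contain characters such as "/".
    return f"{BASE}/dags/{dag_id}/datasets/queuedEvent/{quote(uri, safe='')}"


def delete_queued_event(dag_id, uri, auth="admin:admin"):
    # Build (but do not send) a DELETE request that would clear the queued
    # event record, so the DAG no longer counts "dataset-1" as updated.
    token = base64.b64encode(auth.encode()).decode()
    return urllib.request.Request(
        queued_event_path(dag_id, uri),
        method="DELETE",
        headers={"Authorization": f"Basic {token}"},
    )


req = delete_queued_event("waiting_for_dataset_1_and_2", "dataset-1")
print(req.get_method(), req.full_url)
# → DELETE http://localhost:8080/api/v1/dags/waiting_for_dataset_1_and_2/datasets/queuedEvent/dataset-1
```

Sending the request with ``urllib.request.urlopen(req)`` would require a running Airflow webserver; the sketch stops at request construction so the URL shape stays verifiable offline.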
+
Advanced dataset scheduling with conditional expressions
--------------------------------------------------------