eladkal commented on code in PR #41022: URL: https://github.com/apache/airflow/pull/41022#discussion_r1691814267
########## docs/apache-airflow/authoring-and-scheduling/datasets.rst: ##########
@@ -321,6 +321,34 @@ Example: Note that this example is using `(.values() | first | first) <https://jinja.palletsprojects.com/en/3.1.x/templates/#jinja-filters.first>`_ to fetch the first of one dataset given to the DAG, and the first of one DatasetEvent for that dataset. An implementation can be quite complex if you have multiple datasets, potentially with multiple DatasetEvents.
+Manipulating queued dataset events through REST API
+---------------------------------------------------
+
+.. versionadded:: 2.9
+
+In this example, the DAG ``waiting_for_dataset_1_and_2`` will be triggered when tasks update both datasets "dataset-1" and "dataset-2". Once "dataset-1" is updated, Airflow creates a record. This ensures that Airflow knows to trigger the DAG when "dataset-2" is updated. We call such records queued dataset events.
+
+.. code-block:: python
+
+    with DAG(
+        dag_id="waiting_for_dataset_1_and_2",
+        schedule=[Dataset("dataset-1"), Dataset("dataset-2")],
+        ...,
+    ):
+        ...

Review Comment:
   What is the added value of this paragraph? It feels like it repeats what was previously explained in the Multiple Datasets section. I think the section should be just about the API endpoints (BTW, is it just the API, or also the CLI?)

########## docs/apache-airflow/authoring-and-scheduling/datasets.rst: ##########
@@ -321,6 +321,34 @@ Example: Note that this example is using `(.values() | first | first) <https://jinja.palletsprojects.com/en/3.1.x/templates/#jinja-filters.first>`_ to fetch the first of one dataset given to the DAG, and the first of one DatasetEvent for that dataset. An implementation can be quite complex if you have multiple datasets, potentially with multiple DatasetEvents.
+Manipulating queued dataset events through REST API
+---------------------------------------------------
+
+.. versionadded:: 2.9
+
+In this example, the DAG ``waiting_for_dataset_1_and_2`` will be triggered when tasks update both datasets "dataset-1" and "dataset-2". Once "dataset-1" is updated, Airflow creates a record. This ensures that Airflow knows to trigger the DAG when "dataset-2" is updated. We call such records queued dataset events.
+
+.. code-block:: python
+
+    with DAG(
+        dag_id="waiting_for_dataset_1_and_2",
+        schedule=[Dataset("dataset-1"), Dataset("dataset-2")],
+        ...,
+    ):
+        ...
+
+
+``queuedEvent`` API endpoints are introduced to manipulate such records.
+
+* Get a queued Dataset event for a DAG: ``/datasets/queuedEvent/{uri}``
+* Get queued Dataset events for a DAG: ``/dags/{dag_id}/datasets/queuedEvent``
+* Delete a queued Dataset event for a DAG: ``/datasets/queuedEvent/{uri}``
+* Delete queued Dataset events for a DAG: ``/dags/{dag_id}/datasets/queuedEvent``
+* Get queued Dataset events for a Dataset: ``/dags/{dag_id}/datasets/queuedEvent/{uri}``
+* Delete queued Dataset events for a Dataset: ``DELETE /dags/{dag_id}/datasets/queuedEvent/{uri}``
+
+ For how to use REST API and the parameters needed for these endpoints, please refer to :doc:`Airflow API </stable-rest-api-ref>`

Review Comment:
   ```suggestion
   To read more about the REST API and the parameters needed for these endpoints, please refer to :doc:`Airflow API </stable-rest-api-ref>`
   ```
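As a quick illustration of how a client might call the ``queuedEvent`` endpoints discussed above, here is a minimal Python sketch. It only shows URL construction plus a DELETE request against the per-DAG endpoint; the base URL, function names, and the absence of authentication are assumptions for illustration, not part of the PR or the Airflow API reference.

```python
# Sketch: addressing queued Dataset events via the stable REST API.
# BASE_URL, the helper names, and the missing auth header are hypothetical
# placeholders; the endpoint paths mirror the list quoted in the diff above.
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8080/api/v1"  # assumed local Airflow webserver


def queued_event_url(dag_id: str, uri: Optional[str] = None) -> str:
    """Build the queuedEvent URL for a DAG, optionally scoped to one dataset URI."""
    path = f"/dags/{dag_id}/datasets/queuedEvent"
    if uri is not None:
        # Dataset URIs (e.g. "s3://bucket/key") contain characters that must
        # be percent-encoded before being embedded in the request path.
        path += "/" + urllib.parse.quote(uri, safe="")
    return BASE_URL + path


def delete_queued_events(dag_id: str, uri: Optional[str] = None) -> None:
    """Issue the DELETE request that clears queued events for the DAG."""
    request = urllib.request.Request(queued_event_url(dag_id, uri), method="DELETE")
    # A real deployment would need an Authorization header here; omitted in this sketch.
    urllib.request.urlopen(request)


# URL construction only; no request is sent here.
print(queued_event_url("waiting_for_dataset_1_and_2", "dataset-1"))
```

Percent-encoding the dataset URI is the one non-obvious step: a URI such as ``s3://bucket/key`` would otherwise be misread as extra path segments.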
