tstrilka commented on issue #59370:
URL: https://github.com/apache/airflow/issues/59370#issuecomment-3655142838

Thanks for the feedback @mobuchowski! You raise a valid point - `on_task_instance_skipped` only covers tasks that actually execute and then raise `AirflowSkipException`. Scheduler-level skips (`BranchPythonOperator`, trigger rules, etc.) happen before the task runner is invoked, so they won't trigger this listener.

I looked into the DAG-level events approach you mentioned. The existing `on_dag_run_success`/`on_dag_run_failed` listeners already include `AirflowStateRunFacet` with a complete `tasksState` map that includes ALL task states (including SKIPPED from any source). This does provide comprehensive coverage.

That said, I think there's still value in the task-level `on_task_instance_skipped` hook for use cases that need real-time, per-task events rather than waiting for DAG completion. For OpenLineage specifically, this allows emitting a task COMPLETE event immediately when a task self-skips, matching the pattern of `on_task_instance_success`.

Would it help if I:
1. Updated the PR description to clarify this only covers `AirflowSkipException` cases?
2. Added documentation noting that comprehensive skip tracking should use DAG-level events?

Or would you prefer a different approach entirely?
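
To make the DAG-level option concrete, here is a minimal sketch of how a consumer might pull skipped tasks out of a `tasksState` map once the DAG run completes. It assumes the map is a plain dict of task id to a lowercase state string such as `"skipped"`; that shape, the helper name `skipped_task_ids`, and the example payload are illustrative assumptions, not code taken from the provider.

```python
from __future__ import annotations


def skipped_task_ids(tasks_state: dict[str, str | None]) -> list[str]:
    """Return the ids of tasks whose final state is "skipped", sorted by id."""
    return sorted(
        task_id for task_id, state in tasks_state.items() if state == "skipped"
    )


# Hypothetical tasksState payload (assumed shape) from a DAG-run completion event.
example_tasks_state = {
    "extract": "success",
    "branch": "success",
    "load_full": "skipped",  # e.g. not selected by a branch operator
    "load_incremental": "success",
    "notify": "skipped",  # e.g. the task raised AirflowSkipException
}

print(skipped_task_ids(example_tasks_state))
```

This only reports skips after the whole DAG run has finished, and the map alone does not say why a task was skipped, which is the gap the task-level hook is meant to cover for self-skipping tasks.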
