gabryuri commented on PR #40822: URL: https://github.com/apache/airflow/pull/40822#issuecomment-2231287142
Hi! New contributor here. I created this PR to discuss my approach to this issue: https://github.com/apache/airflow/issues/29275

To explain briefly: this feature aims to make it possible to create the healthy DAGs from a file even when other DAGs in the same file contain errors.

To create DAGs from a file, the DagFileProcessor instantiates a DagBag, which then processes the file. However, the set of DAGs that will be passed on to be created is stored within the DagContext. My first sketch was to modify that set from within the user-created DAG file (using the new methods this PR adds to DagContext), like so:

```python
#!/usr/bin/env python3
from datetime import datetime
import traceback

from airflow.decorators import dag, task
from airflow.models.dag import DagContext

exceptions = []

for dag_id in 'ABC':
    try:
        @dag(
            dag_id=dag_id,
            schedule=None,
            start_date=datetime(2023, 1, 31),
        )
        def test_multi_exception():
            # Simulate a failure while building DAGs 'B' and 'C'.
            if dag_id in 'CB':
                print(f'deleting current dag {DagContext.get_current_dag()}')
                # New method added to DagContext in this PR.
                DagContext.add_current_dag_to_drop_list()
                raise ValueError
                #print(f'Failed to create dynamic DAG {dag_id}')

            @task
            def t():
                print('inside task')

            t()

        test_multi_exception()
    except Exception as e:
        # Positional arguments work across Python versions; the `etype`
        # keyword was removed in Python 3.10.
        exceptions.append(''.join(traceback.format_exception(type(e), e, e.__traceback__)))

# Surface every collected failure as a single import error for the file.
if exceptions:
    raise RuntimeError('\n'.join(exceptions))

if __name__ == "__main__":
    # Kept from the original sketch. Note that `dag` here is the imported
    # decorator, not a DAG instance, so these calls need a DAG object to work.
    dag.clear()
    dag.run()
```

I even tried dropping the DAGs directly instead of adding them to a drop list and triggering the drop separately, but the code runs in parallel in different processes, so I had to adapt to the approach shown above.

My understanding is that this could be done within the context, in its `__exit__` method, but I'm not acquainted with Airflow's implementation of the DAG decorator and would like to hear some thoughts on this.

-- This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
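To make the `__exit__` idea from the comment more concrete, here is a minimal, Airflow-free sketch of a context manager that registers a DAG id when its body finishes cleanly and records it in a drop list when the body raises. Everything in it (`ToyRegistry`, `_ToyDagScope`, `build_demo`) is a hypothetical name invented for illustration; it is not Airflow's `DagContext` or the API added in the PR, and it runs in a single process, so it does not address the multi-process behaviour mentioned in the comment.

```python
import traceback


class ToyRegistry:
    """Hypothetical stand-in for the place where parsed DAGs are collected."""

    def __init__(self):
        self.healthy = []   # ids of DAGs whose body completed without error
        self.dropped = {}   # dag_id -> formatted traceback of the failure

    def dag(self, dag_id):
        # Return a scope object to be used in a `with` statement.
        return _ToyDagScope(self, dag_id)


class _ToyDagScope:
    """Hypothetical context manager wrapping the definition of one DAG."""

    def __init__(self, registry, dag_id):
        self.registry = registry
        self.dag_id = dag_id

    def __enter__(self):
        return self.dag_id

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Body finished cleanly: keep the DAG.
            self.registry.healthy.append(self.dag_id)
            return False
        if not issubclass(exc_type, Exception):
            # Let KeyboardInterrupt, SystemExit, etc. propagate untouched.
            return False
        # Body raised: record the failure and drop only this DAG.
        self.registry.dropped[self.dag_id] = "".join(
            traceback.format_exception(exc_type, exc, tb)
        )
        return True  # suppress the exception so the other DAGs still get built


def build_demo():
    """Mirror the 'ABC' loop from the comment, failing for 'B' and 'C'."""
    registry = ToyRegistry()
    for dag_id in "ABC":
        with registry.dag(dag_id):
            if dag_id in "CB":
                raise ValueError(f"cannot build DAG {dag_id}")
    return registry
```

With this shape the user-written file would not need its own `try`/`except` or an explicit drop call: the scope decides on exit whether the DAG is kept, and the collected tracebacks could still be reported together afterwards. Whether Airflow's DAG decorator can hook into its context this way is exactly the open question raised above.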
