SameerMesiah97 commented on issue #62392: URL: https://github.com/apache/airflow/issues/62392#issuecomment-3948537073
> Hi,
>
> Is it possible to include more context about the KeyError, such as the error log and what you would expect the fix to be? From the current context, it looks like the expected behavior is just to raise the error in another way. Perhaps a proper fix would make those parameters required when `cancel_previous_runs` is set to `True`?
>
> Also, please follow the contribution guide on how to raise exceptions in provider code: https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#dont-raise-airflowexception-directly

Thanks. Here are the logs after running the DAG from the issue:

```
[2026-02-24 02:00:58] INFO - DAG bundles loaded: dags-folder
[2026-02-24 02:00:58] INFO - Filling up the DagBag from /files/dags/databricks_cluster_blocked.py
[2026-02-24 02:00:58] ERROR - Task failed with exception KeyError: 'job_id'
  File "/opt/airflow/task-sdk/src/airflow/sdk/execution_time/task_runner.py", line 1232 in run
  File "/opt/airflow/task-sdk/src/airflow/sdk/execution_time/task_runner.py", line 1647 in _execute_task
  File "/opt/airflow/task-sdk/src/airflow/sdk/bases/operator.py", line 443 in wrapper
  File "/opt/airflow/providers/databricks/src/airflow/providers/databricks/operators/databricks.py", line 924 in execute
[2026-02-24 02:00:58] WARNING - No XCom value found; defaulting to None. key=run_page_url dag_id=repro_cancel_previous_runs_keyerror task_id=trigger_keyerror run_id=manual__2026-02-24T02:00:57.771919+00:00 map_index
```

The intent here is simply to raise a more informative exception instead of letting a raw `KeyError` propagate, so the user can identify the root cause more easily. I did consider validating this during operator construction. However, if you look at `DatabricksRunNowOperator`, `job_id` can come from multiple sources: directly via the `job_id` parameter, resolved at runtime from `job_name`, or injected via the `json` parameter.
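To illustrate the failure mode shown in the logs above, here is a hypothetical minimal reproduction (names are illustrative, not taken from the provider source): direct dictionary access on a templated payload that never received a job identifier surfaces as a bare `KeyError`, with no hint about `cancel_previous_runs`.

```python
# Hypothetical reproduction of the reported failure mode: execute() accesses
# the job id with direct dictionary access, so a missing key raises a bare
# KeyError instead of an informative provider exception.
json_payload = {"notebook_params": {"x": "1"}}  # templated json, no "job_id"

try:
    job_id = json_payload["job_id"]  # mirrors self.json["job_id"]
    error = None
except KeyError as exc:
    # The only context the user gets is the key name itself.
    error = exc
```

The resulting traceback names only `'job_id'`, which is why the issue asks for a clearer error.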
Since `json` is templated and may be mutated during execution (for example, when resolving `job_name` into `json["job_id"]`), validating this reliably at construction time is more complicated than strictly necessary. The current implementation relies on direct dictionary access (`self.json["job_id"]`), which raises a `KeyError` when `cancel_previous_runs=True` and no job identifier is available. My intention is simply to make that failure explicit at runtime and raise a provider-level exception instead of a raw `KeyError`. The exception need not be `AirflowException`; as you said, that is against the agreed-upon conventions, so I will remove it and replace it with something more appropriate.
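As a sketch of the proposed runtime check (the exception class name and helper are hypothetical placeholders; the final choice would follow the provider's exception conventions rather than `AirflowException`):

```python
class DatabricksJobConfigurationError(ValueError):
    """Hypothetical provider-level exception; illustrative name only."""


def resolve_job_id(json_payload: dict):
    """Sketch of the proposed check: when cancel_previous_runs=True but no
    job identifier was resolved, fail with an actionable message instead of
    letting a raw KeyError escape from direct dictionary access."""
    job_id = json_payload.get("job_id")
    if job_id is None:
        raise DatabricksJobConfigurationError(
            "cancel_previous_runs=True requires a job identifier, but "
            "neither 'job_id' nor a resolvable 'job_name' was provided."
        )
    return job_id
```

The behavior is unchanged when a job identifier is present; only the missing-key path changes, trading `KeyError: 'job_id'` for a message that names the offending parameter combination.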
