SameerMesiah97 commented on issue #62392:
URL: https://github.com/apache/airflow/issues/62392#issuecomment-3948537073

   > Hi,
   > 
   > Is it possible to include more context about the Key error, such as the 
error log and what are the things you would expect to fix. From the current 
context, it looks like the expected behavior is just to raise the error in 
another way. Probably a proper fix would make those required parameters when 
the cancel_previous_runs is set to True?
   > 
   > Also, please follow the contribution guide about how to raise exception 
for provider case. Thanks 
https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#dont-raise-airflowexception-directly
   
   Here are the logs after running the DAG in the issue:
   
   ```
   [2026-02-24 02:00:58] INFO - DAG bundles loaded: dags-folder
   [2026-02-24 02:00:58] INFO - Filling up the DagBag from 
/files/dags/databricks_cluster_blocked.py
   [2026-02-24 02:00:58] ERROR - Task failed with exception
   KeyError: 'job_id'
   File "/opt/airflow/task-sdk/src/airflow/sdk/execution_time/task_runner.py", 
line 1232 in run
   
   File "/opt/airflow/task-sdk/src/airflow/sdk/execution_time/task_runner.py", 
line 1647 in _execute_task
   
   File "/opt/airflow/task-sdk/src/airflow/sdk/bases/operator.py", line 443 in 
wrapper
   
   File 
"/opt/airflow/providers/databricks/src/airflow/providers/databricks/operators/databricks.py",
 line 924 in execute
   
   [2026-02-24 02:00:58] WARNING - No XCom value found; defaulting to None. 
key=run_page_url dag_id=repro_cancel_previous_runs_keyerror 
task_id=trigger_keyerror run_id=manual__2026-02-24T02:00:57.771919+00:00 
map_index
   ```
   
   The intent here is simply to raise a more informative exception instead of 
allowing a raw `KeyError`, so the user can more easily identify the root cause.
   
   I did consider validating this during operator construction. However, `DatabricksRunNowOperator` can receive `job_id` from multiple sources: directly via the `job_id` parameter, resolved at runtime from `job_name`, or injected via the `json` parameter. Since `json` is templated and may be mutated during execution (for example, when resolving `job_name` into `json["job_id"]`), validating it reliably at construction time is more complicated than it might seem.
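   To illustrate the point (this is a simplified sketch, not the provider's actual code; the lookup table stands in for the runtime API call the real operator makes), the identifier can reach the `json` payload through any of these paths:

   ```python
   # Simplified sketch of the three ways a job identifier can end up in the
   # operator's `json` payload. Parameter names mirror DatabricksRunNowOperator,
   # but the logic is illustrative only.

   def resolve_job_id(job_id=None, job_name=None, json=None):
       payload = dict(json or {})
       if job_id is not None:
           # 1. Passed directly via the `job_id` parameter.
           payload["job_id"] = job_id
       elif job_name is not None:
           # 2. Resolved at runtime from `job_name` (faked here with a lookup
           #    table; the real operator queries the Databricks API in execute()).
           lookup = {"nightly-etl": 123}
           payload["job_id"] = lookup[job_name]
       # 3. Otherwise `job_id` may already be present in the templated `json`
       #    parameter -- or absent entirely, which is the failure mode here.
       return payload
   ```

   Because path 2 and the templating of `json` only happen at execution time, a constructor-time check cannot see the final payload.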
   
   The current implementation relies on direct dictionary access (`self.json["job_id"]`), which results in a `KeyError` when `cancel_previous_runs=True` and no job identifier is available. My intention is simply to make that failure explicit at runtime by raising a provider-level exception instead of a raw `KeyError`.
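   A minimal sketch of the guard I have in mind (the exception class name and message are placeholders, not a final choice, and the function stands in for the relevant branch of `execute()`):

   ```python
   # Hypothetical guard for the cancel_previous_runs branch. `json_payload`
   # plays the role of the operator's resolved `self.json`. The exception
   # type is a placeholder pending the conventions discussion above.

   class DatabricksJobConfigurationError(ValueError):
       """Raised when required job configuration is missing (illustrative name)."""

   def cancel_previous_runs(json_payload: dict) -> int:
       job_id = json_payload.get("job_id")
       if job_id is None:
           raise DatabricksJobConfigurationError(
               "cancel_previous_runs=True requires a job identifier, but neither "
               "`job_id`, `job_name`, nor `json['job_id']` resolved to one."
           )
       return job_id  # the real code would call the cancel-all-runs API here
   ```

   The point is only that the user sees which parameters were expected, rather than a bare `KeyError: 'job_id'` traceback.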
   
   The exception need not be `AirflowException`; as you pointed out, raising it directly is against the agreed-upon conventions, so I will remove it and replace it with something more appropriate.
   