freget opened a new issue, #27830:
URL: https://github.com/apache/airflow/issues/27830

   ### Apache Airflow Provider(s)
   
   databricks
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-databricks      3.3.0
   
   ### Apache Airflow version
   
   2.4.3
   
   ### Operating System
   
   Debian Bullseye
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   After switching from DatabricksSubmitRunOperator to 
DatabricksSubmitRunDeferrableOperator, fetching the AAD token appears to time 
out in the triggerer.
   
   Here is the relevant log extract:
       
       [2022-11-22 11:31:17,821] {triggerer_job.py:359} INFO - Trigger <airflow.providers.databricks.triggers.databricks.DatabricksExecutionTrigger run_id=17258479, databricks_conn_id=databricks, polling_period_seconds=30> (ID 47143) starting
       [2022-11-22 11:31:17,936] {base.py:71} INFO - Using connection ID 'databricks' for task execution.
       [2022-11-22 11:31:19,076] {databricks_base.py:458} INFO - Using AAD Token for SPN.
       [2022-11-22 11:31:19,077] {databricks_base.py:288} INFO - Existing AAD token is expired, or going to expire soon. Refreshing...
       [2022-11-22 11:31:19,083] {base_events.py:1900} WARNING - Executing <Task pending name='Task-9' coro=<TriggerRunner.run_trigger() running at /home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/triggerer_job.py:361> wait_for=<Future pending cb=[shield.<locals>._outer_done_callback() at /usr/local/lib/python3.9/asyncio/tasks.py:907, <TaskWakeupMethWrapper object at 0x7f2f661eeeb0>()] created at /usr/local/lib/python3.9/asyncio/base_events.py:429> created at /usr/local/lib/python3.9/asyncio/tasks.py:361> took 1.262 seconds
       [2022-11-22 11:31:19,086] {triggerer_job.py:344} ERROR - Triggerer's async thread was blocked for 1.28 seconds, likely by a badly-written trigger. Set PYTHONASYNCIODEBUG=1 to get more information on overrunning coroutines.
       [2022-11-22 11:31:30,111] {triggerer_job.py:306} ERROR - Trigger <airflow.providers.databricks.triggers.databricks.DatabricksExecutionTrigger run_id=17258479, databricks_conn_id=databricks, polling_period_seconds=30> (ID 47143) exited with error
       Traceback (most recent call last):
         File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/triggerer_job.py", line 297, in cleanup_finished_triggers
           result = details["task"].result()
         File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/triggerer_job.py", line 361, in run_trigger
           async for event in trigger.run():
         File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/databricks/triggers/databricks.py", line 66, in run
           run_page_url = await self.hook.a_get_run_page_url(self.run_id)
         File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/databricks/hooks/databricks.py", line 221, in a_get_run_page_url
           response = await self._a_do_api_call(GET_RUN_ENDPOINT, json)
         File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/databricks/hooks/databricks_base.py", line 555, in _a_do_api_call
           token = await self._a_get_token()
         File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/databricks/hooks/databricks_base.py", line 459, in _a_get_token
           return await self._a_get_aad_token(DEFAULT_DATABRICKS_SCOPE)
         File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/databricks/hooks/databricks_base.py", line 290, in _a_get_aad_token
           async for attempt in self._a_get_retry_object():
         File "/home/airflow/.local/lib/python3.9/site-packages/tenacity/_asyncio.py", line 69, in __anext__
           do = self.iter(retry_state=self._retry_state)
         File "/home/airflow/.local/lib/python3.9/site-packages/tenacity/__init__.py", line 351, in iter
           return fut.result()
         File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 439, in result
           return self.__get_result()
         File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
           raise self._exception
         File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/databricks/hooks/databricks_base.py", line 316, in _a_get_aad_token
           async with self._session.post(
         File "/home/airflow/.local/lib/python3.9/site-packages/aiohttp/client.py", line 1141, in __aenter__
           self._resp = await self._coro
         File "/home/airflow/.local/lib/python3.9/site-packages/aiohttp/client.py", line 637, in _request
           break
         File "/home/airflow/.local/lib/python3.9/site-packages/aiohttp/helpers.py", line 720, in __exit__
           raise asyncio.TimeoutError from None
       asyncio.exceptions.TimeoutError
       [2022-11-22 11:26:28,730] {triggerer_job.py:317} ERROR - Trigger <airflow.providers.databricks.triggers.databricks.DatabricksExecutionTrigger run_id=17258469, databricks_conn_id=databricks, polling_period_seconds=30> (ID 47142) exited without sending an event. Dependent tasks will be failed.
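
   For illustration only: the log shows roughly 11 seconds between the "Refreshing..." message (11:31:19) and the error (11:31:30), and the traceback ends in aiohttp raising asyncio.TimeoutError, which is consistent with a client-side request timeout expiring while the token endpoint was still responding. The exact timeout value is not visible in the log. The sketch below mimics that failure mode with the standard library only; it is not the provider's code, and `fetch_token`, `fetch_with_timeout`, `describe`, and the delay values are hypothetical stand-ins.

```python
import asyncio


async def fetch_token(delay_seconds: float) -> str:
    """Hypothetical stand-in for the AAD token request; it only simulates latency."""
    await asyncio.sleep(delay_seconds)
    return "fake-token"


async def fetch_with_timeout(delay_seconds: float, timeout_seconds: float) -> str:
    # asyncio.wait_for cancels the pending request and raises
    # asyncio.TimeoutError once timeout_seconds has elapsed.
    return await asyncio.wait_for(fetch_token(delay_seconds), timeout=timeout_seconds)


def describe(delay_seconds: float, timeout_seconds: float) -> str:
    """Run one attempt and report whether it finished or timed out."""
    try:
        asyncio.run(fetch_with_timeout(delay_seconds, timeout_seconds))
        return "ok"
    except asyncio.TimeoutError:
        return "timeout"


if __name__ == "__main__":
    print(describe(delay_seconds=0.01, timeout_seconds=1.0))  # fast endpoint
    print(describe(delay_seconds=0.5, timeout_seconds=0.05))  # slow endpoint
```

   If the real cause is similar, a longer or configurable timeout for the token request would be one possible direction, but that is a guess from the log rather than a confirmed diagnosis.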
   
   
   ### What you think should happen instead
   
   _No response_
   
   ### How to reproduce
   
   Create a Databricks connection against an Azure Databricks workspace that 
authenticates with a service principal.
   Then create a DAG that uses DatabricksSubmitRunDeferrableOperator; the 
error should appear (at least if your AD is as slow to respond as ours).
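
   A minimal DAG along these lines might look like the sketch below. It is an assumption-laden illustration, not the DAG from our deployment: the `dag_id`, cluster spec, and notebook path are hypothetical placeholders, while `databricks_conn_id="databricks"` and `polling_period_seconds=30` are taken from the log above. It needs an Airflow 2.4+ environment with the Databricks provider installed and a triggerer running.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import (
    DatabricksSubmitRunDeferrableOperator,
)

with DAG(
    dag_id="databricks_deferrable_repro",  # hypothetical name
    start_date=datetime(2022, 11, 1),
    schedule=None,  # no schedule; trigger the DAG manually
    catchup=False,
) as dag:
    DatabricksSubmitRunDeferrableOperator(
        task_id="submit_run",
        databricks_conn_id="databricks",  # connection ID seen in the log
        polling_period_seconds=30,  # value seen in the log
        # Placeholder cluster spec; adjust to whatever the workspace offers.
        new_cluster={
            "spark_version": "11.3.x-scala2.12",
            "node_type_id": "Standard_DS3_v2",
            "num_workers": 1,
        },
        # Placeholder notebook; any short-running notebook should do.
        notebook_task={"notebook_path": "/Shared/example"},
    )
```

   The connection itself has to be configured for service principal authentication so that the hook takes the AAD token path shown in the traceback.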
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
