ConstantinoSchillebeeckx opened a new issue #15766:
URL: https://github.com/apache/airflow/issues/15766


   
   **Apache Airflow version**: 2.0.2
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**: AWS
   
   **What happened**:
   
   I have a custom operator that reads a SQL file and simply executes it using 
the MySQL hook; the operator works just fine though we're having an issue (not 
related to Airflow) with one of the queries that will intermittently take very 
long. As a temporary work around, we're setting the `execution_timeout` to a 
small number (10s) and then using the `on_retry_callback` to do cleanup.
   
   When this process works as expected, our log looks like:
   ```
   [2021-05-08 19:48:20,771] {base_task_runner.py:62} DEBUG - Planning to run 
as the  user
   ...
   ...
   [2021-05-08 19:48:35,385] {timeout.py:36} ERROR - Process timed out, PID: 
10082
   [2021-05-08 19:48:35,385] {taskinstance.py:595} DEBUG - Refreshing 
TaskInstance <TaskInstance: [REDACTED] 2021-05-08T01:01:00+00:00 [running]> 
from DB
   [2021-05-08 19:48:35,391] {taskinstance.py:630} DEBUG - Refreshed 
TaskInstance <TaskInstance: [REDACTED] 2021-05-08T01:01:00+00:00 [running]>
   [2021-05-08 19:48:35,392] {taskinstance.py:1482} ERROR - Task failed with 
exception
   Traceback (most recent call last):
     File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1138, in _run_raw_task
       self._prepare_and_execute_task_with_callbacks(context, task)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1311, in _prepare_and_execute_task_with_callbacks
       result = self._execute_task(context, task_copy)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1336, in _execute_task
       result = task_copy.execute(context=context)
     File "/usr/local/airflow/pipelines/operators/sql_file_operator.py", line 
45, in execute
       self.writer._hook.run(self.transform)
     File "/usr/local/lib/python3.7/site-packages/airflow/hooks/dbapi.py", line 
184, in run
       cur.execute(sql_statement)
     File "/usr/local/lib/python3.7/site-packages/MySQLdb/cursors.py", line 
255, in execute
       self.errorhandler(self, exc, value)
     File "/usr/local/lib/python3.7/site-packages/MySQLdb/connections.py", line 
50, in defaulterrorhandler
       raise errorvalue
     File "/usr/local/lib/python3.7/site-packages/MySQLdb/cursors.py", line 
252, in execute
       res = self._query(query)
     File "/usr/local/lib/python3.7/site-packages/MySQLdb/cursors.py", line 
378, in _query
       db.query(q)
     File "/usr/local/lib/python3.7/site-packages/MySQLdb/connections.py", line 
280, in query
       _mysql.connection.query(self, query)
     File "/usr/local/lib/python3.7/site-packages/airflow/utils/timeout.py", 
line 37, in handle_timeout
       raise AirflowTaskTimeout(self.error_message)
   airflow.exceptions.AirflowTaskTimeout: Timeout, PID: 10082
   [2021-05-08 19:48:35,393] {taskinstance.py:1891} DEBUG - Task Duration set 
to 14.59135
   ```
   * notice the task duration (15s) is close to our timeout duration (10s)
   
   However sometimes we're noticing the timeout doesn't seem to be killing the 
task on time:
   ```
   [2021-05-10 01:49:33,536] {base_task_runner.py:62} DEBUG - Planning to run 
as the  user
   ...
   ...
   [2021-05-10 05:08:40,874] {timeout.py:36} ERROR - Process timed out, PID: 
15864
   [2021-05-10 05:08:40,874] {taskinstance.py:595} DEBUG - Refreshing 
TaskInstance <TaskInstance:  [REDACTED] 2021-05-09T01:01:00+00:00 [running]> 
from DB
   [2021-05-10 05:08:40,880] {taskinstance.py:630} DEBUG - Refreshed 
TaskInstance <TaskInstance: [REDACTED] 2021-05-09T01:01:00+00:00 [running]>
   [2021-05-10 05:08:40,881] {taskinstance.py:1482} ERROR - Task failed with 
exception
   Traceback (most recent call last):
     File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1138, in _run_raw_task
       self._prepare_and_execute_task_with_callbacks(context, task)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1311, in _prepare_and_execute_task_with_callbacks
       result = self._execute_task(context, task_copy)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1336, in _execute_task
       result = task_copy.execute(context=context)
     File "/usr/local/airflow/pipelines/operators/sql_file_operator.py", line 
45, in execute
       self.writer._hook.run(self.transform)
     File "/usr/local/lib/python3.7/site-packages/airflow/hooks/dbapi.py", line 
184, in run
       cur.execute(sql_statement)
     File "/usr/local/lib/python3.7/site-packages/MySQLdb/cursors.py", line 
255, in execute
       self.errorhandler(self, exc, value)
     File "/usr/local/lib/python3.7/site-packages/MySQLdb/connections.py", line 
50, in defaulterrorhandler
       raise errorvalue
     File "/usr/local/lib/python3.7/site-packages/MySQLdb/cursors.py", line 
252, in execute
       res = self._query(query)
     File "/usr/local/lib/python3.7/site-packages/MySQLdb/cursors.py", line 
378, in _query
       db.query(q)
     File "/usr/local/lib/python3.7/site-packages/MySQLdb/connections.py", line 
280, in query
       _mysql.connection.query(self, query)
     File "/usr/local/lib/python3.7/site-packages/airflow/utils/timeout.py", 
line 37, in handle_timeout
       raise AirflowTaskTimeout(self.error_message)
   airflow.exceptions.AirflowTaskTimeout: Timeout, PID: 15864
   [2021-05-10 05:08:40,882] {taskinstance.py:1891} DEBUG - Task Duration set 
to 33552.050711
   ```
   * notice the task duration (33552s) is not at all close to our timeout 
duration (10s)
   
   **What you expected to happen**:
   
   We expect the task to always get killed at an interval equal to the value 
for `timeout_execution`
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to