VladaZakharova commented on PR #44880:
URL: https://github.com/apache/airflow/pull/44880#issuecomment-2541615989

   > > Following Ash's comment - maybe it would be better to utilize a flag/configuration to enable it?
   > 
   > Yeah. Configuration to enable it would be better. I think the stacktrace of where the task was killed is not too useful - really - I am not even sure it will actually show the place where the process is. Signals are always delivered to the main thread (which is one limitation), and I am not even sure the stacktrace in this case will show where the thread was "in" before.
   > 
   > Do you have some examples of such stack-traces generated with it that look "useful" @VladaZakharova ?
   
   I think something like this can be useful:
   
   ```
    from datetime import datetime
    import time

    import airflow
    from providers.src.airflow.providers.standard.operators.python import PythonOperator


    with airflow.DAG(
        "trace_import_timeout",
        start_date=datetime(2022, 1, 1),
        schedule=None,
    ) as dag:

        def f():
            print("Sleeping")
            time.sleep(3660)

        for ind in range(2):
            PythonOperator(
                dag=dag,
                task_id=f"sleep_120_{ind}",
                python_callable=f,
            )
   ```
   
   And the output in the logs will look like this:
   ```
    [2024-12-12, 14:07:20 UTC] {taskinstance.py:2813} ERROR - Received SIGTERM. Terminating subprocesses.
    [2024-12-12, 14:07:20 UTC] {taskinstance.py:2814} ERROR - Stacktrace:
     File "/usr/local/bin/***", line 8, in <module>
       sys.exit(main())
     File "/opt/***/***/__main__.py", line 58, in main
       args.func(args)
     File "/opt/***/***/cli/cli_config.py", line 49, in command
       return func(*args, **kwargs)
     File "/opt/***/***/utils/cli.py", line 111, in wrapper
       return f(*args, **kwargs)
   ...
       return func(*args, **kwargs)
     File "/files/dags/core/example_logs_trace.py", line 14, in f
       time.sleep(3660)
     File "/opt/***/***/models/taskinstance.py", line 2814, in signal_handler
       self.log.error("Stacktrace: \n%s", "".join(traceback.format_stack()))
   ```
   Which shows what the task was executing when the SIGTERM happened.
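   For context, the handler that produces this kind of output can be sketched roughly like below. This is a minimal standalone sketch of the technique, not Airflow's actual implementation; the function name `install_sigterm_trace` and the `log_fn` parameter are invented here for illustration:

```python
import signal
import traceback


def install_sigterm_trace(log_fn=print):
    """Install a SIGTERM handler that logs the current stack before exiting.

    Minimal sketch of the technique; the real handler in taskinstance.py
    also terminates subprocesses and uses the task's logger.
    """

    def signal_handler(signum, frame):
        # Python delivers signals to the main thread, so format_stack()
        # shows where the main thread was when SIGTERM arrived - e.g.
        # the time.sleep() line of a running task.
        log_fn("Received SIGTERM. Stacktrace:\n" + "".join(traceback.format_stack(frame)))
        raise SystemExit(1)

    signal.signal(signal.SIGTERM, signal_handler)
```

   One caveat this sketch makes visible: because the handler runs on the main thread, work happening on other threads will not show up in the logged stack.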
   
   Also we already have this output if the task fails due to a timeout:
   ```
    [2024-12-12, 13:55:41 UTC] {timeout.py:68} ERROR - Process timed out, PID: 1255
    [2024-12-12, 13:55:41 UTC] {taskinstance.py:3041} ERROR - Task failed with exception
    Traceback (most recent call last):
      File "/opt/airflow/airflow/models/taskinstance.py", line 743, in _execute_task
        result = _execute_callable(context=context, **execute_callable_kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/airflow/airflow/models/taskinstance.py", line 714, in _execute_callable
        return ExecutionCallableRunner(
               ^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/airflow/airflow/utils/operator_helpers.py", line 269, in run
        return func(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^
      File "/opt/airflow/airflow/models/baseoperator.py", line 378, in wrapper
        return func(self, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/airflow/providers/src/airflow/providers/standard/operators/python.py", line 195, in execute
        return_value = self.execute_callable()
                       ^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/airflow/providers/src/airflow/providers/standard/operators/python.py", line 221, in execute_callable
        return runner.run(*self.op_args, **self.op_kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/airflow/airflow/utils/operator_helpers.py", line 269, in run
        return func(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^
      File "/files/dags/core/example_logs_trace.py", line 14, in f
        time.sleep(3660)
      File "/opt/airflow/airflow/utils/timeout.py", line 69, in handle_timeout
        raise AirflowTaskTimeout(self.error_message)
    airflow.exceptions.AirflowTaskTimeout: Timeout, PID: 1255
   ```
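   The reason this traceback ends exactly at the `time.sleep()` line is that the SIGALRM handler raises the exception from inside the interrupted frame. A rough self-contained sketch of that mechanism, in the spirit of `airflow/utils/timeout.py` (the class and exception names here are invented, and restoring the previous handler is omitted for brevity):

```python
import signal


class TaskTimeout(Exception):
    """Stand-in for airflow.exceptions.AirflowTaskTimeout (name invented here)."""


class timeout:
    """Minimal SIGALRM-based timeout context manager (Unix only).

    The alarm handler raises from inside the running frame, so the
    resulting traceback naturally shows where the task was stuck.
    """

    def __init__(self, seconds, error_message="Timeout"):
        self.seconds = seconds
        self.error_message = error_message

    def handle_timeout(self, signum, frame):
        raise TaskTimeout(self.error_message)

    def __enter__(self):
        signal.signal(signal.SIGALRM, self.handle_timeout)
        signal.alarm(self.seconds)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        signal.alarm(0)  # cancel any pending alarm on normal exit
```

   Because the exception propagates like any other, the normal "Task failed with exception" logging produces the short, targeted traceback shown above with no extra work.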
   
   Maybe it makes sense to make the output as short as the one we get on a timeout error. We can still see in this example that it shows the place where the task failed.