dstandish opened a new pull request, #38989:
URL: https://github.com/apache/airflow/pull/38989

   For security reasons, we don't present the user with tracebacks when there's 
a webserver error.  If we similarly don't want to provide tracebacks in task 
execution logs, we could provide a UUID that an admin can use to find the error 
in the server logs.
   
   I'm not 100% sure that we need to hide this from the user in this context, 
because the DAG writer uses the task API in a way that's different from how 
they use the webserver.
   
   But... my guess is the same logic would apply.  WDYT?
   
   With this PR, here's what the task logs look like:
   ```
   [2024-04-13, 16:53:53 UTC] {standard_task_runner.py:112} ERROR - Failed to execute job 29 for task d_1_source (Got 500:INTERNAL SERVER ERROR when sending the internal api request: Error executing method 'airflow.models.taskinstance.TaskInstance.save_to_db'; error_id=88463b9d-4280-47b4-94a4-94836ce1da2d; 153)
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/task/task_runner/standard_task_runner.py", line 105, in _start_by_fork
       ret = args.func(args, dag=self.dag)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/cli/cli_config.py", line 49, in command
       return func(*args, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/cli.py", line 115, in wrapper
       return f(*args, **kwargs)
              ^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/cli/commands/task_command.py", line 476, in task_run
       task_return_code = _run_task_by_selected_method(args, _dag, ti)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/cli/commands/task_command.py", line 253, in _run_task_by_selected_method
       return _run_raw_task(args, ti)
              ^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/cli/commands/task_command.py", line 335, in _run_raw_task
       return ti._run_raw_task(
              ^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/serialization/pydantic/taskinstance.py", line 138, in _run_raw_task
       _run_raw_task_internal(
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 252, in _run_raw_task_internal
       TaskInstance.save_to_db(ti=ti, session=session)
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/api_internal/internal_api_call.py", line 141, in wrapper
       result = make_jsonrpc_request(method_name, args_dict)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 289, in wrapped_f
       return self(f, *args, **kw)
              ^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 379, in __call__
       do = self.iter(retry_state=retry_state)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 314, in iter
       return fut.result()
              ^^^^^^^^^^^^
     File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
       return self.__get_result()
              ^^^^^^^^^^^^^^^^^^^
     File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
       raise self._exception
     File "/home/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 382, in __call__
       result = fn(*args, **kwargs)
                ^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/api_internal/internal_api_call.py", line 118, in make_jsonrpc_request
       raise AirflowException(
   airflow.exceptions.AirflowException: Got 500:INTERNAL SERVER ERROR when sending the internal api request: Error executing method 'airflow.models.taskinstance.TaskInstance.save_to_db'; error_id=88463b9d-4280-47b4-94a4-94836ce1da2d
   ```
   Notice that the client-side traceback is shown but the server-side one is 
not.  That was already the case, but I've now added an `error_id` UUID that can 
be used to trace the error in the server logs.
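
   For illustration, the general pattern can be sketched as below. This is a 
minimal sketch, not the code in this PR: `call_with_error_id` and 
`failing_handler` are hypothetical names, and the `(body, status)` tuple only 
stands in for a real HTTP response. The assumption is that the server logs the 
full traceback under a fresh UUID and returns only that UUID to the caller.

```python
import logging
import uuid

log = logging.getLogger("rpc_sketch")  # hypothetical logger name


def call_with_error_id(handler, method_name, **params):
    """Run handler; on failure, log the traceback under a fresh error_id.

    Returns a (body, status_code) pair that stands in for an HTTP response.
    """
    try:
        return handler(**params), 200
    except Exception:
        error_id = str(uuid.uuid4())
        # The full traceback goes to the server log only.
        log.exception("Error executing method '%s'; error_id=%s.", method_name, error_id)
        # The caller sees the method name and the correlation id, nothing else.
        return f"Error executing method '{method_name}'; error_id={error_id}", 500


def failing_handler(**params):
    # Hypothetical handler that always fails, to exercise the error path.
    raise NotImplementedError


body, status = call_with_error_id(failing_handler, "TaskInstance.save_to_db")
print(status, body)
```

   The response body carries no traceback, so whatever the client logs from it 
can only point an admin at the matching server log entry.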
   
   And here's what you see in the server logs:
   
   ```
   [2024-04-13T16:53:53.857+0000] {rpc_api_endpoint.py:153} ERROR - Error executing method 'airflow.models.taskinstance.TaskInstance.save_to_db'; error_id=88463b9d-4280-47b4-94a4-94836ce1da2d.
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/api_internal/endpoints/rpc_api_endpoint.py", line 147, in internal_airflow_api
       output = handler(**params, session=session)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/api_internal/internal_api_call.py", line 128, in wrapper
       return func(*args, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/session.py", line 81, in wrapper
       return func(*args, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 3251, in save_to_db
       ti = _coalesce_to_orm_ti(ti=ti, session=session)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 1509, in _coalesce_to_orm_ti
       raise NotImplementedError
   NotImplementedError
   ```
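
   To make the lookup step concrete, here is a rough sketch of how an admin 
might go from the client-side message to the server log entry. The helper 
names `extract_error_id` and `find_in_server_log` are hypothetical, and the 
`server_log` string is an abbreviated copy of the excerpt above rather than a 
real log file.

```python
import re

# Matches the canonical 8-4-4-4-12 hex form of a UUID after "error_id=".
ERROR_ID_RE = re.compile(
    r"error_id=([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def extract_error_id(message):
    """Return the first error_id found in a client-side message, or None."""
    match = ERROR_ID_RE.search(message)
    return match.group(1) if match else None


def find_in_server_log(log_text, error_id):
    """Return the server log lines that mention the given error_id."""
    needle = f"error_id={error_id}"
    return [line for line in log_text.splitlines() if needle in line]


client_message = (
    "Got 500:INTERNAL SERVER ERROR when sending the internal api request: "
    "Error executing method 'airflow.models.taskinstance.TaskInstance.save_to_db'; "
    "error_id=88463b9d-4280-47b4-94a4-94836ce1da2d"
)
# Abbreviated server log: header line, then a shortened traceback.
server_log = (
    "[2024-04-13T16:53:53.857+0000] {rpc_api_endpoint.py:153} ERROR - "
    "Error executing method 'airflow.models.taskinstance.TaskInstance.save_to_db'; "
    "error_id=88463b9d-4280-47b4-94a4-94836ce1da2d.\n"
    "Traceback (most recent call last):\n"
    "NotImplementedError\n"
)

error_id = extract_error_id(client_message)
matches = find_in_server_log(server_log, error_id)
print(error_id, len(matches))
```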
   
   Now, as an Airflow developer, it would be more convenient if we just 
returned the traceback in the 500 response.  If there's actually no security 
concern here, then that would be the way to go.  
   

