Oduig opened a new issue, #29442:
URL: https://github.com/apache/airflow/issues/29442

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   When I run a DAG on Airflow, Python log lines from the DAG are visible as 
expected in the Airflow logs. However, when I log inside an 
`on_failure_callback`, the output is not visible anywhere. We are trying to use 
this logging to diagnose issues with our callbacks, and its absence makes 
that more difficult.
   
   ### What you think should happen instead
   
   I think the log output from `logging.info` should end up somewhere in a log 
system or file.
   
   ### How to reproduce
   
   The following DAG produces only 1 line of log output. 
   
   ```
   import pendulum
   from datetime import timedelta
   from airflow import DAG
   from airflow.operators.bash import BashOperator
   import logging
   import requests
   
   dag_settings = {
       "dag_id": "INT_tableau_others_recommendation_classifications",
       "max_active_runs": 1,
       "dagrun_timeout": timedelta(minutes=1),
       "start_date": pendulum.today("UTC"),
       "default_args": {
           "owner": "airflow",
           "catchup": False
       },
       "tags": ["env:johndoe"],
       "on_failure_callback": (
           lambda context: [
               logging.info(f"Minimal example - failure callback with context: {context}"),
               requests.post(
                   "https://<replace_this_with_your_requestbin_id>.m.pipedream.net",
                   json={"payload": f"Minimal example - failure callback with context: {context}"}
               )
           ]
       )
   }
   dag = DAG(**dag_settings)
   logging.info(f"Minimal example - Created DAG {dag.dag_id}.")
   
   BashOperator(task_id="sleep_a_while", bash_command="sleep 300", dag=dag)
   ```
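
For debugging, the lambda above can also be written as a named function that logs through a named logger, which makes its records easier to filter. This is a minimal sketch that assumes the callback only needs to log. The function name `log_failure_context` and the logger name `minimal_example.callbacks` are hypothetical, and the `requests.post` call is left out so the snippet has no third-party dependencies. It does not by itself change where Airflow routes callback logs.

```python
import logging

# Hypothetical logger name; any dotted name works.
logger = logging.getLogger("minimal_example.callbacks")


def log_failure_context(context):
    """Log the context passed to the callback and return the logged message."""
    message = f"Minimal example - failure callback with context: {context}"
    logger.info(message)
    return message
```

It could then be passed as `"on_failure_callback": log_failure_context` in `dag_settings`.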
   
   To find the logs, I checked any and all log output produced by the Airflow 
scheduler, workers, DAG processor, and other services. Here is a screenshot of 
what this looks like, based on the above DAG.
   
   
![image](https://user-images.githubusercontent.com/3661031/217868691-cd98b9f1-7a6d-4e93-ae0a-b45a773fa28d.png)
   
   ### Operating System
   
   composer-2.0.32
   
   ### Versions of Apache Airflow Providers
   
   Here is the full `requirements.txt`
   
   ```
   apache-airflow-providers-databricks==4.0.0
   databricks-sql-connector==2.1.0
   apache-beam~=2.43.0
   sqlalchemy-bigquery==1.5.0
   requests~=2.28.1
   apache-airflow-providers-tableau==4.0.0
   apache-airflow-providers-sendgrid==3.1.0
   python-dotenv==0.21.0
   urllib3~=1.26.8
   tableauserverclient==0.23
   apache-airflow-providers-http==4.1.0
   # time library in airflow
   pendulum==2.1.2
   ```
   
   ### Deployment
   
   Composer
   
   ### Deployment details
   
   We are running a Cloud Composer environment with image 
`composer-2.0.32-airflow-2.3.4`
   
   ### Anything else
   
   In addition to checking the Airflow logs, I also used a remote shell to 
inspect individual `Airflow` containers to find log files on disk, but it seems 
they are all passed to Composer. 
   
   Logs produced by individual tasks (e.g. the output of the bash command 
above) end up in a bucket together with the DAG files. This bucket has a 
`/logs` directory, but it only contains task logs. The task logs for the 
`BashOperator` are empty because the bash command itself produces no output, 
which makes sense to me.
   
   Finally, I checked for a `scheduler` folder in the logs directory and could 
not find any folders or files related to DAG-level callbacks or the scheduler.
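
One way to narrow down where callback log records would go is to list the handlers that a logger's records can reach. The following is a diagnostic sketch using only the standard `logging` module. The helper name `describe_handlers` is hypothetical, and nothing here is specific to Airflow or Composer. It walks from the given logger up through its parents, stopping where propagation is disabled, and reports each handler's class, level, and target file or stream.

```python
import logging


def describe_handlers(logger):
    """Return one description line per handler reachable from `logger`."""
    lines = []
    current = logger
    while current is not None:
        for handler in current.handlers:
            # File handlers expose `baseFilename`; stream handlers expose `stream`.
            target = getattr(handler, "baseFilename", None)
            if target is None:
                target = getattr(handler, "stream", None)
            lines.append(
                f"{current.name}: {type(handler).__name__} "
                f"level={logging.getLevelName(handler.level)} target={target!r}"
            )
        if not current.propagate:
            # Records are not passed further up once propagation is off.
            break
        current = current.parent
    return lines
```

Calling `describe_handlers(logging.getLogger())` inside the callback and sending the result through the existing `requests.post` call would show which handlers, if any, are attached in the process that runs the callback.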
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

