wolfier opened a new issue, #29458:
URL: https://github.com/apache/airflow/issues/29458

   ### Description
   
   I want to filter out logs that Airflow workers generate while parsing DAG 
files before task execution. These logs are printed when a file contains 
top-level logging statements or imports modules that contain logging statements.
   
   ### Use case/motivation
   
   To reduce log-ingestion costs, I want to be able to identify these 
top-level logging statements so that they can be filtered out.
   
   
   I am able to identify the logs emitted by a typical successful Celery task 
execution:
   
   ```
   [2023-02-07 18:48:39,101: INFO/MainProcess] Task airflow.executors.celery_executor.execute_command[5589d08f-4b4c-4a1b-9227-7c124abe0d24] received
   [2023-02-07 18:48:39,141: INFO/ForkPoolWorker-9] [5589d08f-4b4c-4a1b-9227-7c124abe0d24] Executing command in Celery: ['airflow', 'tasks', 'run', 'ae_cancellation_flows', 'dbt_run.f_loan_cancellation_requests_co_logs.model.lake_modeling.f_loan_cancellation_requests_co_logs', 'scheduled__2023-02-07T16:45:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/lake_modeling/templated_dags.py']
   [2023-02-07 18:48:39,252: INFO/ForkPoolWorker-9] Filling up the DagBag from /usr/local/airflow/dags/lake_modeling/templated_dags.py
   [2023-02-07 18:51:16,557: INFO/ForkPoolWorker-9] Running <TaskInstance: ae_cancellation_flows.dbt_run.f_loan_cancellation_requests_co_logs.model.lake_modeling.f_loan_cancellation_requests_co_logs scheduled__2023-02-07T16:45:00+00:00 [queued]> on host 172.20.19.199
   [2023-02-07 18:51:32,965: INFO/ForkPoolWorker-9] Using connection ID 'astro_s3_logging' for task execution.
   [2023-02-07 18:51:33,155: INFO/ForkPoolWorker-9] Using connection ID 'astro_s3_logging' for task execution.
   [2023-02-07 18:51:33,287: INFO/ForkPoolWorker-9] Task airflow.executors.celery_executor.execute_command[5589d08f-4b4c-4a1b-9227-7c124abe0d24] succeeded in 174.15662495000288s: None
   ```
   
   However, users may include arbitrary logging statements, which are hard to 
identify because they carry no distinguishing markers.
   
   ```
   [2023-02-07 18:48:56,754: INFO/ForkPoolWorker-9] Model: superduperjellymonster
   [2023-02-07 18:48:56,756: INFO/ForkPoolWorker-9] Model: slipperybones
   ```
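
   To illustrate the kind of filtering described above, here is a minimal 
sketch, not an existing Airflow feature. It assumes that the only lines worth 
keeping are those matching the message shapes in the sample successful run, and 
treats everything else as a candidate top-level logging statement. The names 
`PREFIX`, `KNOWN_MESSAGES`, `is_known_line`, and `split_logs` are hypothetical, 
and the patterns are derived only from the sample logs shown here.

```python
import re

# Hypothetical pattern for the Celery worker log prefix seen in the samples:
# "[<timestamp>: <LEVEL>/<process>] <message>"
PREFIX = re.compile(
    r"^\[(?P<ts>[^\]]+): (?P<level>[A-Z]+)/(?P<proc>[^\]]+)\] (?P<msg>.*)$"
)

# Message shapes taken from the sample successful run only (an assumption;
# a real deployment would likely need more patterns).
KNOWN_MESSAGES = [
    re.compile(
        r"^Task airflow\.executors\.celery_executor\.execute_command"
        r"\[[0-9a-f-]+\] (received|succeeded in .*)$"
    ),
    re.compile(r"^\[[0-9a-f-]+\] Executing command in Celery: "),
    re.compile(r"^Filling up the DagBag from "),
    re.compile(r"^Running <TaskInstance: "),
    re.compile(r"^Using connection ID "),
]


def is_known_line(line):
    """Return True if the line matches one of the known message shapes."""
    match = PREFIX.match(line.strip())
    if match is None:
        return False
    msg = match.group("msg")
    return any(pattern.search(msg) for pattern in KNOWN_MESSAGES)


def split_logs(lines):
    """Split lines into (known, unknown); unknown lines are filter candidates."""
    known, unknown = [], []
    for line in lines:
        (known if is_known_line(line) else unknown).append(line)
    return known, unknown
```

   This allowlist approach is fragile, since any change in message wording 
breaks it, which is why a distinguishing marker on parse-time logs would be 
preferable.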
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
