walter9388 opened a new issue, #45554: URL: https://github.com/apache/airflow/issues/45554
### Apache Airflow version

Other Airflow 2 version (please specify below)

### If "Other Airflow 2 version" selected, which one?

2.10.1

### What happened?

When using CloudWatch logging there seems to be a 60-second delay between updates to the logging output. Please see the video below and observe:

1. Initially, Airflow can't find the remote logs (as there are none).
2. Airflow detects local logs.
3. Nothing happens in the UI for 60 seconds.
4. After 60 seconds the logging appears, and the top of the printout states it is from CloudWatch logs.
5. The logs then continue to update every 60 seconds until the task is completed.

_Please skip ahead in the video as most of it is static!_

https://github.com/user-attachments/assets/d3abd82b-793a-475a-9488-746d640573c7

As a second, minor point, you can also see that grouping no longer works with the logs read from CloudWatch. However, this doesn't concern me as much.

### What you think should happen instead?

I'm not sure if I have configured something incorrectly, but I expected the same behaviour as local logging, i.e. tailing of the log file. I struggled to find the default behaviour documented, but what I expected to happen was that Airflow would use the local logs if they were available and only use the remote logs if no local logs were found. I found this logic in previous [documentation (<2.0)](https://airflow.apache.org/docs/apache-airflow/1.10.8/howto/write-logs.html), although this may now be outdated:

> In the Airflow Web UI, remote logs take precedence over local logs when remote logging is enabled. If remote logs can not be found or accessed, local logs will be displayed. Note that logs are only sent to remote storage once a task is complete (including failure); In other words, remote logs for running tasks are unavailable (but local logs are available).

Can you confirm whether this is the expected behaviour and whether what is shown in the video above is a bug?
Alternatively, I can see in the browser that a request is made every second to update the logs, and I can confirm that the logs are only being written to CloudWatch every 60 seconds or when the task is complete. Is this expected behaviour, or should logs be written to CloudWatch at a higher rate?

If the behaviour in the video is actually what is expected, I would like to suggest one of the following options, as we need a <60-second refresh window in our logging setup:

1. A configuration variable to prefer local logging when it is available (e.g. `local_logging_prefer = True`).
2. A configuration variable for the update frequency of the logging when using remote logging (e.g. `remote_logging_refresh_period = 60`).

Let me know your thoughts.

### How to reproduce

The remote logging config was copied from [here](https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/logging/cloud-watch-task-handlers.html):

```
[logging]
# Airflow can store logs remotely in AWS Cloudwatch. Users must supply a log group
# ARN (starting with 'cloudwatch://...') and an Airflow connection
# id that provides write and read access to the log location.
remote_logging = True
remote_base_log_folder = cloudwatch://arn:aws:logs:<region name>:<account id>:log-group:<group name>
remote_log_conn_id = MyCloudwatchConn
```

The demo DAG used in the video above prints to logging every 10 seconds and is as follows:

```python
import logging
from datetime import datetime
from time import sleep

from airflow.decorators import task
from airflow.models import DAG

with DAG(
    dag_id="dev__cloudwatch_logging_testing",
    start_date=datetime(2024, 1, 1),
    schedule=None,
):

    @task
    def task1():
        sleeptime = 10
        for i in range(0, 300, sleeptime):
            logging.info(i)
            sleep(sleeptime)

    task1()
```

### Operating System

```
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
```

### Versions of Apache Airflow Providers

```
apache-airflow==2.10.1
apache-airflow-providers-amazon==8.28.0
apache-airflow-providers-celery==3.8.1
apache-airflow-providers-cncf-kubernetes==8.4.1
apache-airflow-providers-common-compat==1.2.0
apache-airflow-providers-common-io==1.4.0
apache-airflow-providers-common-sql==1.16.0
apache-airflow-providers-docker==3.13.0
apache-airflow-providers-elasticsearch==5.5.0
apache-airflow-providers-fab==1.3.0
apache-airflow-providers-ftp==3.11.0
apache-airflow-providers-google==10.22.0
apache-airflow-providers-grpc==3.6.0
apache-airflow-providers-hashicorp==3.8.0
apache-airflow-providers-http==4.13.0
apache-airflow-providers-imap==3.7.0
apache-airflow-providers-microsoft-azure==10.4.0
apache-airflow-providers-mysql==5.7.0
apache-airflow-providers-odbc==4.7.0
apache-airflow-providers-openlineage==1.11.0
apache-airflow-providers-postgres==5.12.0
apache-airflow-providers-redis==3.8.0
apache-airflow-providers-sendgrid==3.6.0
apache-airflow-providers-sftp==4.11.0
apache-airflow-providers-slack==8.9.0
apache-airflow-providers-smtp==1.8.0
apache-airflow-providers-snowflake==5.7.0
apache-airflow-providers-sqlite==3.9.0
apache-airflow-providers-ssh==3.13.1
```

### Deployment

Other Docker-based deployment

### Deployment details

_No response_

### Anything else?

_No response_

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
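For what it's worth, the 60-second batching described under "What happened?" is consistent with a log handler that buffers records and only flushes them to the remote store on a fixed interval, as batching CloudWatch clients commonly do. The toy handler below illustrates that effect; it is an assumption-laden sketch, not Airflow's or its CloudWatch handler's actual code, and `IntervalFlushHandler` and `sent_batches` are made-up names.

```python
import logging
import time


class IntervalFlushHandler(logging.Handler):
    """Toy handler: buffer records, flush at most every `send_interval` seconds.

    Illustration of interval-based batching only; logs emitted between
    flushes are invisible downstream until the next flush, which would
    produce the 60-second chunks seen in the video.
    """

    def __init__(self, send_interval: float = 60.0):
        super().__init__()
        self.send_interval = send_interval
        self._buffer = []
        self._last_flush = time.monotonic()
        self.sent_batches = []  # stands in for remote write calls

    def emit(self, record):
        self._buffer.append(self.format(record))
        if time.monotonic() - self._last_flush >= self.send_interval:
            self.flush()

    def flush(self):
        if self._buffer:
            self.sent_batches.append(list(self._buffer))
            self._buffer.clear()
        self._last_flush = time.monotonic()
```

Under this model, option 2 above would simply expose `send_interval` as an Airflow configuration value, while option 1 would sidestep the interval entirely by reading the local file first.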
