AutomationDev85 opened a new pull request, #57531: URL: https://github.com/apache/airflow/pull/57531
# Overview

This PR moves the logic for reading pod logs from the worker to the Triggerer when running in deferred mode. The motivation is to reduce the overhead of bringing a task back to the worker solely to read log lines. Because the API interaction for reading logs is asynchronous, integrating this logic into the Triggerer should not introduce significant runtime overhead.

Previously, each move of execution from the Triggerer back to the worker produced many log lines related to TriggerEvent handling and task entry. These cluttered the logs and degraded the experience for users primarily interested in business-related log output. The new implementation focuses on displaying only the pod logs, with minimal overhead at pod completion.

Additionally, we introduced a feature to skip log lines from the current second during each log read. The Kubernetes API only allows specifying since_seconds (not an exact timestamp), so if log reading starts at, for example, 02:30:500 (500 ms into the second), the log lines from the remaining milliseconds of that second would be read again in the next cycle. By skipping the current second, we avoid duplicate log lines in subsequent reads.

This PR builds on the preparation PR https://github.com/apache/airflow/pull/56875, so it will be easier to review once that one is merged. Note that https://github.com/apache/airflow/pull/56872 is a prerequisite, as it grants the Triggerer the necessary permissions to access pod events.

As this is a significant update, we welcome feedback from the community!

# Details of change:

* Enabled the Triggerer to read pod logs and write them to the Airflow log.
* Removed the "running" event from both KubernetesPodTriggerer and KubernetesPodOperator.
* Skipped reading log lines from the current second to eliminate duplicate pod log entries in Airflow logs.
* Updated pytest suites to reflect these changes.
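To make the "skip the current second" idea concrete, here is a minimal sketch in plain Python. It is not the code from this PR: the helper names `line_second` and `select_new_lines` are hypothetical, the log lines are assumed to start with a UTC timestamp of the form `YYYY-MM-DDTHH:MM:SS...Z`, and the lower bound (`last_cutoff`) is an extra assumption of this sketch that the description above does not mention.

```python
from datetime import datetime, timezone
from typing import List, Optional, Tuple


def line_second(line: str) -> datetime:
    """Return the timestamp of a log line, truncated to whole seconds (UTC)."""
    stamp = line.split(" ", 1)[0]
    # Only the first 19 characters (YYYY-MM-DDTHH:MM:SS) are parsed; the
    # fractional part and the trailing "Z" are ignored on purpose.
    parsed = datetime.strptime(stamp[:19], "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def select_new_lines(
    lines: List[str],
    now: datetime,
    last_cutoff: Optional[datetime] = None,
) -> Tuple[List[str], datetime]:
    """Keep lines from completed seconds that were not emitted before.

    Hypothetical helper: a line is kept if its second lies in the
    half-open window [last_cutoff, cutoff), where cutoff is the start
    of the second in which this read happens.
    """
    cutoff = now.replace(microsecond=0)  # start of the current second
    kept = []
    for line in lines:
        second = line_second(line)
        if second >= cutoff:
            # Current second: more lines may still arrive, so defer it.
            continue
        if last_cutoff is not None and second < last_cutoff:
            # Already covered by an earlier read window.
            continue
        kept.append(line)
    return kept, cutoff
```

Because consecutive windows share only their boundary and each window is half-open, every line falls into exactly one read. For example, a read at 02:30:00.500 defers everything stamped 02:30:00, and a later read at 02:30:02.100, given the previous cutoff, picks those lines up once without re-emitting anything from 02:29:59.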
