jason810496 opened a new pull request, #49470: URL: https://github.com/apache/airflow/pull/49470
related issue: #45079
related PR: #45129
related discussion on Slack: https://apache-airflow.slack.com/archives/CCZRF2U5A/p1736767159693839

## Why

In short, this PR aims to eliminate OOM issues by:

- Replacing full log sorting with a **K-Way Merge**
- Making the entire log reading path **streamable** (using `yield` generators instead of returning a list of strings)

More detailed reasoning is already described in the linked issue.

Due to too many conflicts with the old PR (#45129), this PR reworks the changes on top of the latest `FileTaskHandler`.

## What

This PR ports the original changes from #45129 to the current version of `FileTaskHandler` with the following updates:

- Fixed a line-splitting error when reading in chunks, by using buffered line-splitting with a carry-over
- Adopted the new log metadata structure
- Introduced buffering for the log reader

## Note: Recent Changes in `FileTaskHandler`

- Introduced `StructuredLogMessage` to represent each log record #46827
- Added `RemoteLogIO` interface for remote log handling #48491
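For readers unfamiliar with the two techniques named above, here is a minimal sketch of how carry-over line splitting and a K-way merge can be combined into a fully streaming reader. It is not the code from this PR: `iter_lines`, `parse`, and `merge_logs` are hypothetical names, and the `<timestamp> <message>` line format is an assumption made for illustration (the real handler works with `StructuredLogMessage` records). Each source is assumed to be already sorted by timestamp.

```python
import heapq
from collections.abc import Iterable, Iterator


def iter_lines(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete lines from text chunks (hypothetical helper).

    A chunk boundary can fall in the middle of a line, so the trailing
    partial line is carried over and prepended to the next chunk.
    """
    carry = ""
    for chunk in chunks:
        parts = (carry + chunk).split("\n")
        carry = parts.pop()  # last piece is incomplete (or empty)
        yield from parts
    if carry:
        yield carry  # final line without a trailing newline


def parse(line: str) -> tuple[str, str]:
    """Split an assumed '<ISO timestamp> <message>' line into its parts."""
    timestamp, _, message = line.partition(" ")
    return timestamp, message


def merge_logs(*sources: Iterable[str]) -> Iterator[tuple[str, str]]:
    """K-way merge of chunked log sources, ordered by timestamp.

    heapq.merge keeps only one pending record per source in memory,
    instead of loading and sorting every line at once.
    """
    streams = [map(parse, iter_lines(src)) for src in sources]
    yield from heapq.merge(*streams, key=lambda record: record[0])


# Two chunked sources; note the chunk boundaries falling mid-line.
worker = ["2025-01-01T00:00:01 sta", "rt\n2025-01-01T00:00:03 done\n"]
trigger = ["2025-01-01T00:00:02 wait", "ing\n"]

for timestamp, message in merge_logs(worker, trigger):
    print(timestamp, message)
```

Because every stage is a generator, memory use in this sketch scales with the number of sources and the chunk size rather than with the total log size, which is the property the streaming approach relies on.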
