potiuk commented on issue #31105: URL: https://github.com/apache/airflow/issues/31105#issuecomment-1605387471
Yeah. Agree with @getaaron. This one is not **that** easy because of interleaving logs from different sources. I looked at it, and really what you would have to do is one of these:

* change the reading method to accept position ranges, which means knowing the log position to start from in every source (this would indeed be complex when it comes to matching positions across the different sources), or (probably easier)
* instead of storing the logs in in-memory lists, stream them to temporary files and read them from there (and then a k-way merge would indeed be better).

Maybe a better solution would be to introduce hard limits on the size of logs that can be loaded into memory and "hard-stop" if any of the sources attempts to return logs bigger than that size, returning "log too large to show" instead. Then the thing to add is an optional `max_length` or `max_size` parameter that could be passed to the methods returning arrays, implemented in all the implementations so that they raise a specific exception if the returned array would be too big.

The Airflow UI is not really usable for sifting through a 400 MB log anyway. And I don't think we have a good mechanism to keep such a log in the browser's memory, so I am not even sure we could show such a log in the Airflow UI at all, even if the backend could handle it.
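To make the hard-limit idea a bit more concrete, here is a minimal sketch. All names in it (`LogTooLargeError`, `read_log_lines`, `render_log`, the `max_size` parameter) are hypothetical and not existing Airflow APIs; it only shows the shape of the check each implementation would need, assuming a source can be iterated line by line.

```python
from typing import Iterable, List, Optional


class LogTooLargeError(Exception):
    """Hypothetical exception: a source tried to return more than max_size bytes."""


def read_log_lines(source: Iterable[str], max_size: Optional[int] = None) -> List[str]:
    """Collect lines from one source, stopping hard once max_size bytes is exceeded.

    max_size=None means no limit is applied.
    """
    lines: List[str] = []
    total = 0
    for line in source:
        total += len(line.encode("utf-8"))
        if max_size is not None and total > max_size:
            # Stop before the oversized list is ever fully built in memory.
            raise LogTooLargeError(f"log exceeds {max_size} bytes")
        lines.append(line)
    return lines


def render_log(source: Iterable[str], max_size: int) -> str:
    """Caller side: turn the exception into a short message instead of the log."""
    try:
        return "".join(read_log_lines(source, max_size=max_size))
    except LogTooLargeError:
        return "log too large to show"
```

The point of raising inside the reader (rather than checking the size afterwards) is that the big list never gets fully materialized.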

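For the temporary-file option, a rough sketch of the k-way merge could look like the following. It is only an illustration under loud assumptions: every line starts with a timestamp that sorts correctly as a string, each source is already ordered, and the helper names (`spool_to_tempfile`, `merge_sources`) are made up for this example rather than taken from Airflow.

```python
import heapq
import tempfile
from typing import IO, Iterable, Iterator, List


def spool_to_tempfile(lines: Iterable[str]) -> IO[str]:
    """Write one source's lines to a temporary file and rewind it for reading."""
    tmp = tempfile.TemporaryFile(mode="w+", encoding="utf-8")
    for line in lines:
        tmp.write(line if line.endswith("\n") else line + "\n")
    tmp.seek(0)
    return tmp


def merge_sources(sources: List[Iterable[str]]) -> Iterator[str]:
    """Lazily k-way merge the spooled sources.

    Assumption: lines compare correctly as plain strings (for example because
    they start with an ISO-8601 timestamp) and each source is already sorted.
    """
    files = [spool_to_tempfile(src) for src in sources]
    try:
        # heapq.merge pulls one line at a time from each file, so only about
        # one line per source is held in memory during the merge.
        yield from heapq.merge(*files)
    finally:
        for f in files:
            f.close()
```

This keeps memory usage roughly proportional to the number of sources instead of the total log size, at the cost of extra disk I/O.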