potiuk commented on issue #31105:
URL: https://github.com/apache/airflow/issues/31105#issuecomment-1605387471

   Yeah. Agree with @getaaron.
   
   This one is not **that** easy because of interleaving logs from different 
sources. I looked at it and really what you would have to do is do either 
of these:
   
   * change the reading method to accept ranges of positions; knowing the log 
position to start from in all of them would then be possible (but it would indeed 
be complex when it comes to matching the position with the different sources)
   
   or (probably easier)
   
   * instead of storing the logs in in-memory lists, stream them to temporary 
files and read them from there (and then a k-way merge would indeed be better)
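
   A minimal sketch of the second option, assuming each source yields lines 
already sorted by a leading ISO-8601 timestamp. The names `spool_to_tempfile` 
and `merge_log_sources` are hypothetical and are not existing Airflow APIs:

```python
import heapq
import tempfile
from typing import IO, Iterable, Iterator


def spool_to_tempfile(lines: Iterable[str]) -> IO[str]:
    """Write one source's log lines to a temporary file and rewind it."""
    tmp = tempfile.TemporaryFile(mode="w+", encoding="utf-8")
    for line in lines:
        # Normalise line endings so the file can be iterated line by line.
        tmp.write(line if line.endswith("\n") else line + "\n")
    tmp.seek(0)
    return tmp


def merge_log_sources(sources: Iterable[Iterable[str]]) -> Iterator[str]:
    """Lazily k-way merge log sources by their leading timestamp.

    Assumption: every line starts with a timestamp that sorts
    lexicographically, and each source is already in order.
    """
    files = [spool_to_tempfile(src) for src in sources]
    try:
        # heapq.merge keeps only one pending line per file in memory.
        yield from heapq.merge(*files, key=lambda line: line.split(" ", 1)[0])
    finally:
        for f in files:
            f.close()
```

   With this shape, memory use grows with the number of sources rather than 
with the total size of the logs, at the cost of extra disk I/O.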
   
   Maybe a better solution would be to introduce some hard limits on the size of 
logs that you can load into memory and "hard-stop" if any of the sources 
attempts to return logs bigger than that size, returning "log too large to show" 
instead. Then the thing to add is an optional max_length or max_size that 
could be passed to the methods returning arrays, implemented in all the 
implementations so that they raise a specific exception if the returned array 
would be too big.
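
   A rough sketch of that hard limit, with hypothetical names throughout 
(`LogTooLargeError`, `read_log_lines`, `render_log`); it assumes the limit is 
counted in UTF-8 bytes and is not taken from the Airflow code base:

```python
from typing import Iterable, List, Optional


class LogTooLargeError(Exception):
    """Hypothetical exception raised when a source exceeds the size limit."""


def read_log_lines(source: Iterable[str], max_size: Optional[int] = None) -> List[str]:
    """Collect lines from a source, stopping hard once max_size bytes is exceeded."""
    lines: List[str] = []
    total = 0
    for line in source:
        total += len(line.encode("utf-8"))
        if max_size is not None and total > max_size:
            # Stop before the oversized log is fully loaded into memory.
            raise LogTooLargeError(f"log exceeds {max_size} bytes")
        lines.append(line)
    return lines


def render_log(source: Iterable[str], max_size: Optional[int] = None) -> str:
    """Return the log text, or a placeholder when the limit is hit."""
    try:
        return "".join(read_log_lines(source, max_size))
    except LogTooLargeError:
        return "log too large to show"
```

   Raising from the reader rather than truncating silently lets each caller 
decide what to show when the limit is hit.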
   
   It is not really practical to sift through a 400 MB log in the Airflow UI 
anyway. And I don't think we have a good mechanism to keep such a log in the 
browser's memory, so I am not even sure if we could show such a log at all in 
the Airflow UI even if the backend could handle it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
