george-zubrienko commented on issue #21387: URL: https://github.com/apache/airflow/issues/21387#issuecomment-1040270851
> Note that in the future we are likely to integrate open-telemetry for logging (there is a work in-progress on that) and that will allow to stream logs to any external or custom open-telemetry-compatible log sink in real time. This is the ultimate goal.

I think this is definitely the way to go, and no PR I could offer would be better. For now, I've resolved our issue by adjusting the setup a bit:
- set up logging to a persistent volume, so the webserver can discover "local" logs while a task is running
- set up remote logging, so once a task is done, its log is shipped to remote storage
- set up a job to clean up the PV regularly, since local logs are of no use after that. A note on this one: we actually disabled the sidecar log groomer, because a) it is a sidecar container, and b) with >1 scheduler replica we have >1 log groomer running `find ...` on the whole PV, which is really unnecessary, plus they race against each other.

This way we have real-time logs served from the PV (fileshare), while logs from finished tasks are read from remote storage, which is also a cheaper setup since the read transaction cost is lower on blob storage.

Let me know if I should close this issue and link it to the one where the `open-telemetry` implementation is tracked!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
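For illustration, the regular PV cleanup job described in the comment could look roughly like the sketch below. This is not the commenter's actual job, which is not shown in the thread; the function name `remove_stale_logs`, the retention period, and the directory layout are all assumptions. It deletes log files older than a cutoff and then prunes directories left empty, which is what a single scheduled job would do in place of one `find ...` per scheduler replica.

```python
import time
from pathlib import Path
from typing import List, Optional


def remove_stale_logs(
    log_root: Path, max_age_days: float, now: Optional[float] = None
) -> List[Path]:
    """Delete files under log_root last modified more than max_age_days ago.

    Hypothetical helper: returns the list of deleted file paths.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day

    removed = []
    # Materialize the listing first so deletions do not disturb the traversal.
    for path in list(log_root.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)

    # Prune directories left empty, deepest first; log_root itself is kept.
    directories = sorted(
        (p for p in log_root.rglob("*") if p.is_dir()),
        key=lambda p: len(p.parts),
        reverse=True,
    )
    for directory in directories:
        try:
            directory.rmdir()
        except OSError:
            pass  # directory still contains files, leave it alone

    return removed
```

A scheduled job (for example a Kubernetes CronJob mounting the same PV) could call `remove_stale_logs(Path("/path/to/logs"), max_age_days=7)`, where both the path and the seven-day retention are placeholders. Running it as one job rather than per scheduler replica avoids the duplicated, racing scans mentioned above.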
