Hi Akash,

Thanks for starting the discussion.
Adding a bit to what Jens said here: a lot of the logging handler machinery might change for AF 3. That said, what you are describing seems very similar to what Fluent Bit / Fluentd offer. Have you explored those?

FluentBit: https://fluentbit.io/
FluentD: https://www.fluentd.org/

They also support Docker ecosystems (it needn't be the traditional K8s ecosystem). Check here: https://www.fluentd.org/guides/recipes/docker-logging

A question: which component are you adding the handler in? I think we would probably benefit from a separate service that shares a "common" volume with the AF components. (Although look out for the removal of direct DB access in AF 3.)

Thanks & Regards,
Amogh Desai

On Sun, Mar 9, 2025 at 12:53 AM Akash Sharma <2akash111...@gmail.com> wrote:

> Hi Jens,
>
> The point here is that the solution should be setup agnostic, i.e. it
> should work whether the tasks are run in CeleryExecutor, K8sExecutor,
> CeleryK8sOperator, etc., and whether or not the executors are reachable
> by the web server.
>
> Best regards,
> Akash
>
> On Sat, Mar 8, 2025 at 11:14 PM Jens Scheffler <j_scheff...@gmx.de.invalid>
> wrote:
>
> > Hi Akash,
> >
> > For remote logging, logs can still be sourced from the worker via the
> > web server if the endpoint hosted for this is reachable. The web server
> > attempts to source the logs from the worker, or from the local file
> > system if they are not found on remote storage. This is the standard
> > for Celery, for example. Alternatively, a shared log file system can be
> > used, and the web server can serve logs from there.
> >
> > In Airflow 3 (soon) there will be an enhanced way to ship logs while
> > in-flight.
> >
> > Otherwise, if you host your workers remotely and you don't get a
> > network connection from the web server to your worker, then you can
> > take a look at the new Edge Worker, which also streams logs in chunks
> > from the edge site to the central location.
> >
> > If you otherwise want to contribute, helping hands are always welcome.
> > The log handler structure, though, will probably change in Airflow 3
> > soon. Note that limitations of remote log storage apply for S3 / Azure
> > Blob: you cannot append to already-written objects.
> >
> > Jens
> >
> > On 08.03.25 16:49, Akash Sharma wrote:
> > > Hello everyone,
> > >
> > > Whenever remote logging is enabled, logs are only uploaded to the
> > > target path once the task has completed. This makes it harder to
> > > monitor long-running tasks, since there is no way to get their logs
> > > in the meantime.
> > >
> > > I was working on a Handler that saves logs in chunks, where a chunk
> > > is cut based on two factors:
> > >
> > > 1. Max time has elapsed since the last chunk was saved
> > > 2. Max bytes have arrived since the last chunk was saved
> > >
> > > So a chunk will be saved either when the max time has elapsed or when
> > > the size limit has been surpassed. The chunked files can then be
> > > uploaded whenever they are created and served by stitching them back
> > > together.
> > >
> > > Do let me know your thoughts.
> > >
> > > Best regards,
> > > Akash
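
For illustration, here is a minimal sketch of the kind of handler Akash
describes above, built on Python's standard logging module. This is only a
sketch under assumptions: upload_chunk is a hypothetical callback (e.g. a
put to S3 / Azure Blob), and the thresholds are placeholder defaults, not
an existing Airflow API.

    import io
    import logging
    import time

    class ChunkedUploadHandler(logging.Handler):
        """Buffer formatted records and cut a numbered chunk when either
        the byte limit or the time limit since the last flush is hit."""

        def __init__(self, upload_chunk, max_bytes=1_000_000, max_seconds=30.0):
            super().__init__()
            self.upload_chunk = upload_chunk  # hypothetical: callable(index, data)
            self.max_bytes = max_bytes
            self.max_seconds = max_seconds
            self.buffer = io.BytesIO()
            self.chunk_index = 0
            self.last_flush = time.monotonic()

        def emit(self, record):
            try:
                self.buffer.write((self.format(record) + "\n").encode("utf-8"))
                if (self.buffer.tell() >= self.max_bytes
                        or time.monotonic() - self.last_flush >= self.max_seconds):
                    self.flush()
            except Exception:
                self.handleError(record)

        def flush(self):
            data = self.buffer.getvalue()
            if data:
                # Each chunk becomes a *new* object, so this also works on
                # stores like S3 / Azure Blob that cannot append in place.
                self.upload_chunk(self.chunk_index, data)
                self.chunk_index += 1
                self.buffer = io.BytesIO()
            self.last_flush = time.monotonic()

        def close(self):
            self.flush()
            super().close()

One caveat the sketch makes visible: the time-based flush only fires when a
new record arrives, so a real implementation would likely need a background
timer to ship the final chunk of a task that has gone quiet.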
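
On the serving side, stitching the chunks back together (as Akash proposes)
is then just a read in index order. Again a sketch, with hypothetical
list_chunk_keys / read_chunk callables standing in for the remote store:

    def read_stitched_log(list_chunk_keys, read_chunk) -> bytes:
        # Assumes chunk keys embed a zero-padded index, e.g.
        # "attempt=1/chunk-000042.log", so lexicographic order == write order.
        return b"".join(read_chunk(key) for key in sorted(list_chunk_keys()))

Writing each chunk as a new, index-keyed object is what sidesteps the
no-append limitation Jens mentions for S3 / Azure Blob.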