Because an Airflow worker picks up the latest config on each restart, the
scheduler and worker can already end up running with different configs.
We therefore assumed Airflow was designed to tolerate non-transactional
config updates. But I agree with your point: a complete code audit would be
a huge overhead.

Thank you both, we'll forgo this idea.

On Fri, Dec 8, 2017 at 8:21 AM, Grant Nicholas <
[email protected]> wrote:

> While I think it's a good idea in theory, +1 with Daniel on this as the
> complexity is actually quite high.
>
> 1. It would require a complete code audit/refactoring for all config access
> to make sure no one is caching values. To me that seems very hard to do in
> an effective way.
> 2. Some config values can be dynamically reloaded safely while others
> cannot. We would get race conditions for those reloaded configs and it
> would be hard to hunt down and document all the edge cases.
>
> Focusing on making scheduler restarts safe and fast would (mostly) get rid
> of the need for dynamic config reloads.
>
> On Fri, Dec 8, 2017 at 9:07 AM, Daniel Imberman <[email protected]>
> wrote:
>
> > At least for the k8s executor it would be easier to just restart the
> > scheduler pod. I can see some potential issues if some configs are stored
> > in local variables.
> > > On Wed, Dec 6, 2017 at 11:45 AM Feng Lu <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > > It's probably well known that Airflow only loads the config file (i.e.,
> > > > airflow.cfg) at instance creation time. If one needs to change the
> > > > config file, all Airflow instances have to be restarted (I understand
> > > > that the Airflow worker does restart itself for each task execution and
> > > > therefore picks up the latest config updates).
> > >
> > > > Are people on this mailing list interested in adding dynamic config
> > > > loading support inside AirflowConfigParser
> > > > (https://github.com/apache/incubator-airflow/blob/master/airflow/configuration.py#L110)?
> > >
> > > > Initial implementation ideas:
> > > > - introduce threading.RLock to AirflowConfigParser and guard all method
> > > > access
> > > > - add a periodic timer task that re-reads the config file (it of course
> > > > needs to acquire the lock beforehand).
> > >
> > > > Since config data is always accessed via the AirflowConfigParser object,
> > > > this essentially gives us dynamic config updates without restarting the
> > > > Airflow scheduler/webserver.
> > >
> > > Thoughts?
> > > Thank you.
> > >
> > > Feng
> > >
> >
>

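For reference, the lock-plus-timer idea from the original message could be
sketched roughly as below. This is only a minimal illustration under stated
assumptions: ReloadingConfig is a hypothetical wrapper around the
standard-library configparser, not Airflow's AirflowConfigParser, and the
class name, the interval parameter, and the example.cfg file are made up for
the example. The demo at the end also shows the caching concern raised in the
replies: a value copied out before a reload stays stale.

```python
import configparser
import os
import tempfile
import threading


class ReloadingConfig:
    """Hypothetical wrapper; NOT Airflow's AirflowConfigParser."""

    def __init__(self, path, interval=30.0):
        self._path = path
        self._interval = interval          # seconds between reloads (assumed)
        self._lock = threading.RLock()     # guards the parser and timer state
        self._parser = configparser.ConfigParser()
        self._timer = None
        self._running = False
        self.reload()

    def reload(self):
        # Parse into a fresh parser, then swap it in under the lock so that
        # readers never observe a half-loaded file.
        fresh = configparser.ConfigParser()
        fresh.read(self._path)
        with self._lock:
            self._parser = fresh

    def get(self, section, key):
        with self._lock:
            return self._parser.get(section, key)

    def start(self):
        with self._lock:
            self._running = True
            self._schedule()

    def _schedule(self):
        # Daemon timer, so it never keeps the process alive on its own.
        timer = threading.Timer(self._interval, self._tick)
        timer.daemon = True
        timer.start()
        self._timer = timer

    def _tick(self):
        self.reload()
        with self._lock:
            if self._running:
                self._schedule()

    def stop(self):
        with self._lock:
            self._running = False
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.cfg")
        with open(path, "w") as f:
            f.write("[core]\nparallelism = 32\n")
        conf = ReloadingConfig(path)
        cached = conf.get("core", "parallelism")  # copied out at "startup"
        with open(path, "w") as f:
            f.write("[core]\nparallelism = 64\n")
        conf.reload()  # what the periodic timer would do
        # The cached copy is stale; only a fresh lookup sees the new value.
        return cached, conf.get("core", "parallelism")


if __name__ == "__main__":
    print(demo())
```

The lock only makes each individual lookup consistent. Any caller that stored
a value in a local variable or module-level constant keeps the old one, which
is why the replies above point to a full audit of config access as the real
cost.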