It's very reassuring to hear it's been running in production on your side
for a while. This should be a requirement for risky PRs touching the core.
I like the introduction of the new "SCHEDULED" state as well.

Max

On Wed, Jun 1, 2016 at 7:45 AM, Chris Riccomini <[email protected]>
wrote:

> Hey Bolke,
>
> Thanks for being so diligent with this. I think this work is critical for
> the project. Looking forward to a much more stable scheduler.
>
> Cheers,
> Chris
>
> On Wed, Jun 1, 2016 at 3:13 AM, Jeremiah Lowin <[email protected]> wrote:
>
> > Just to be clear, this is a highly unlikely event. I used to have a unit
> > test for it but got rid of it when we closed bugs that made it impossible
> > to cause such a crash deterministically. So this situation is possible but
> > almost certainly won't manifest.
> >
> > On Wed, Jun 1, 2016 at 4:00 AM Bolke de Bruin <[email protected]> wrote:
> >
> > > Hey,
> > >
> > > This is to give a heads up that I am planning to merge #1514, the refactor
> > > of process_dag, today. This is the second step in executing on the
> > > scheduler roadmap. It has been running in our production for a week now
> > > with no functional differences. Scheduler loop times start a bit higher,
> > > but have a lower max. The number of connections to the database is around
> > > 1/3 of that of the previous scheduler (test dag went from 150 connections
> > > to 50). Database load is slightly lower.
> > >
> > > While this fixes many issues (race conditions), a corner case mentioned by
> > > Jeremiah is now present. A TI is sent in SCHEDULED state to the executor.
> > > If the executor fails to load the TI, the TI might be orphaned forever.
> > > As fixing the corner case will require further fundamental changes, we
> > > discussed it and concluded it should be addressed in a follow-up patch.
> > >
> > > My planned next steps are: 1) reduce scheduler loop time to around 1s by
> > > making task reporting “event driven”; 2) auto-align start date; 3) add a
> > > notion of “previous” to dagrun; 4) fix the corner case mentioned above.
> > >
> > > - Bolke
> > >
> > >
> > >
> > >
> >
>
