Hey Bolke,

Thanks for writing this up. I don't have a ton of feedback, as I'm not
terribly familiar with the internals of the scheduler, but two notes:

1. A major +1 for the celery/local executor discussion. IMO, Celery is a
net-negative on this project, and should be fully removed in favor of the
LocalExecutor. Splitting the scheduler from the executor in the
LocalExecutor would basically give parity with Celery, AFAICT, and sounds
much easier to operate to me.
2. If we are moving towards Docker as a container for DAG execution in the
future, it's probably worth considering how these changes are going to
affect the Docker implementation. If we do pursue (1), how does this look
in a Dockerized world? Is the executor going to still exist? Would the
scheduler interact directly with Kubernetes/Mesos instead?

Cheers,
Chris

On Fri, May 13, 2016 at 3:41 AM, Bolke de Bruin <[email protected]> wrote:

> Hi,
>
> We did a video conference on the scheduler with a couple of the committers
> yesterday. The meeting was not there to finalize any roadmap but more to
> get a general understanding of each other's work. To keep it as transparent
> as possible, here is a summary:
>
> Attendees:
> Max, Paul, Arthur, Dan, Sid, Bolke
>
> The discussion centered on the scheduler, sometimes diving into
> connected topics such as pooling and executors. Paul discussed his work on
> making the scheduler more robust against faulty Dags and on making the
> scheduler faster by not making it dependent on the slowest parsed Dag. PR
> work will be provided shortly to open it up to the community, as the aim is
> to have this in by end of Q2 (no promises ;-)).
>
> Continuing the train of thought of making the scheduler faster, the
> separation of executor and scheduler was also discussed. It was remarked by
> Max that doing this separation would essentially create the equivalent of
> the celery workers. Sid mentioned that celery seemed to be a culprit of
> setup issues and people tend to use the local executor instead. The
> discussion was parked as it needs to be discussed with a wider audience
> (mailing list, community) and is not something that we think is required in
> the near term (obviously PRs are welcome).
>
> Next, we discussed some of the scheduler issues that are marked in the
> attached document (
> https://drive.google.com/open?id=0B_Y7S4YFVWvYM1o0aDhKMjJhNzg). Core
> issues discussed were 1) TaskInstances can be created without a DagRun, 2)
> non-intuitive behavior with start_date and also depends_on_past and 3)
> Lineage. It was agreed that the proposal to add a previous field to the DagRun
> model and to make backfills (among others) use DagRun makes sense. More
> discussion was around the lineage part, as that involves more in-depth changes,
> specifically to TaskInstances. Still, the consensus in the group was that it is
> necessary to make steps here and that they are long overdue.
>
> Lastly, we discussed the draft scheduler roadmap (see doc) to see if there
> were any misalignments. While there are some differences in details we
> think the steps are quite compatible and the differences can be worked out.
>
> So that was it. In case I missed anything, please correct me. In case of
> questions, suggestions, etc., don’t hesitate to put them on the list.
> Cheers
> Bolke
>
