Cool. Was this a public meeting? Will the next one be?
On 13 May 2016 at 08:20, Chris Riccomini <[email protected]> wrote:
> Hey Bolke,
>
> Thanks for writing this up. I don't have a ton of feedback, as I'm not
> terribly familiar with the internals of the scheduler, but two notes:
>
> 1. A major +1 for the celery/local executor discussion. IMO, Celery is a
> net-negative on this project, and should be fully removed in favor of the
> LocalExecutor. Splitting the scheduler from the executor in the
> LocalExecutor would basically give parity with Celery, AFAICT, and sounds
> much easier to operate to me.
> 2. If we are moving towards Docker as a container for DAG execution in the
> future, it's probably worth considering how these changes are going to
> affect the Docker implementation. If we do pursue (1), how does this look
> in a Dockerized world? Is the executor going to still exist? Would the
> scheduler interact directly with Kubernetes/Mesos instead?
>
> Cheers,
> Chris
>
> On Fri, May 13, 2016 at 3:41 AM, Bolke de Bruin <[email protected]> wrote:
>
>> Hi,
>>
>> We did a video conference on the scheduler with a couple of the committers
>> yesterday. The meeting was not there to finalize any roadmap but more to
>> get a general understanding of each other's work. To keep it as transparent
>> as possible, here is a summary:
>>
>> Who attended:
>> Max, Paul, Arthur, Dan, Sid, Bolke
>>
>> The discussion centered around the scheduler, sometimes diving into
>> connected topics such as pooling and executors. Paul discussed his work on
>> making the scheduler more robust against faulty DAGs and also on making the
>> scheduler faster by not making it dependent on the slowest parsed DAG. PR
>> work will be provided shortly to open it up to the community, as the aim is
>> to have this in by end of Q2 (no promises ;-)).
>>
>> Continuing the train of thought of making the scheduler faster, the
>> separation of executor and scheduler was also discussed. It was remarked by
>> Max that doing this separation would essentially create the equivalent of
>> the celery workers. Sid mentioned that celery seemed to be a culprit of
>> setup issues and people tend to use the local executor instead. The
>> discussion was parked as it needs to be discussed with a wider audience
>> (mailing list, community) and is not something that we think is required in
>> the near term (obviously PRs are welcome).
>>
>> Next, we discussed some of the scheduler issues that are marked in the
>> attached document
>> (https://drive.google.com/open?id=0B_Y7S4YFVWvYM1o0aDhKMjJhNzg). Core
>> issues discussed were 1) TaskInstances can be created without a DagRun, 2)
>> non-intuitive behavior with start_date and also depends_on_past and 3)
>> Lineage. It was agreed that the proposals to add a previous field to the
>> DagRun model and to make backfills (a.o.) use DagRun make sense. More
>> discussion was around the lineage part, as that involves more in-depth
>> changes to specifically TaskInstances. Still, the consensus in the group was
>> that it is necessary to make steps here and that they are long overdue.
>>
>> Lastly, we discussed the draft scheduler roadmap (see doc) to see if there
>> were any misalignments. While there are some differences in details, we
>> think the steps are quite compatible and the differences can be worked out.
>>
>> So that was it; in case I missed anything, correct me. In case of questions,
>> suggestions, etc., don't hesitate to put them on the list.
>>
>> Cheers
>> Bolke
>>
