A few thoughts on moving away from Celery and on the Executor interface. To me, LocalExecutor means local as in "in-process": it's implemented as a local multiprocess pool/queue, so making it remote or "out of process" changes its definition and premise. Let's then refer to what we're really talking about as "creating a RemoteExecutor that doesn't depend on Celery".
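For reference, here's a toy version of what "local as in-process" means to me. This is not Airflow's actual LocalExecutor code, just its shape: a pool of child processes fed from a queue that lives inside the scheduler process.

    import multiprocessing
    import subprocess


    def _worker(task_queue, result_queue):
        # Pull shell commands off the shared queue until we get the
        # poison pill (None), then exit.
        while True:
            command = task_queue.get()
            if command is None:
                break
            returncode = subprocess.call(command, shell=True)
            result_queue.put((command, returncode))


    class InProcessExecutor:
        # Everything lives inside the parent (scheduler) process: the
        # queues are in-process objects and the workers are its children.
        def __init__(self, parallelism=4):
            self.task_queue = multiprocessing.Queue()
            self.result_queue = multiprocessing.Queue()
            self.workers = [
                multiprocessing.Process(
                    target=_worker,
                    args=(self.task_queue, self.result_queue))
                for _ in range(parallelism)
            ]
            for worker in self.workers:
                worker.start()

        def execute(self, command):
            self.task_queue.put(command)

        def end(self):
            for _ in self.workers:
                self.task_queue.put(None)  # one poison pill per worker
            for worker in self.workers:
                worker.join()


    if __name__ == "__main__":
        executor = InProcessExecutor(parallelism=2)
        executor.execute("echo hello")
        executor.end()

The moment you move those workers to other machines, the in-process queue has to become a network transport, and that's where the premise changes.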
Now about this RemoteExecutor idea. If you boil it down to its essence, a remote executor isn't that far from Celery itself: each worker process listens for messages, is parameterized to have N slots, manages and returns its state, and maybe listens only to certain types of messages (let's call those queues). Now how do we circulate messages around? We have a database, so maybe we use the database as a message queue? Databases don't make for scalable message queues, so should we support a proper message queue? What about Redis? RabbitMQ? Kafka? SQS? Let's write an interface for that. Wait, am I talking about RemoteExecutor or CeleryExecutor at this point? Maybe RemoteExecutor is CeleryExecutor.

Side note: Celery supports using SqlAlchemy (any database) as a message queue. Maybe we make that the default setup: then we don't point people to LocalExecutor, but to CeleryExecutor with the DB as a backend. I'll sketch both of these below.
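To make the "essence" part concrete, each worker would boil down to something like the following. All the broker-facing names here (broker, consume, publish_state, message.command) are made up for illustration, since picking that transport interface is exactly the open question:

    import subprocess
    import threading


    def run_worker(broker, queues, slots=4):
        # Consume task messages from the given queues, run at most
        # `slots` commands concurrently, and report state back through
        # the broker (a stand-in for whatever transport we pick:
        # DB table, Redis, RabbitMQ, ...).
        free_slots = threading.Semaphore(slots)

        def run_task(message):
            broker.publish_state(message.task_id, "running")
            try:
                returncode = subprocess.call(message.command, shell=True)
                state = "success" if returncode == 0 else "failed"
                broker.publish_state(message.task_id, state)
            finally:
                free_slots.release()

        while True:
            free_slots.acquire()              # wait for a free slot
            message = broker.consume(queues)  # blocks until a message arrives
            threading.Thread(target=run_task, args=(message,)).start()

And as for the side note, the "DB as a backend" setup already exists today via kombu's SqlAlchemy transport; it would look something like this (exact key names depend on the Airflow version, and the URLs are placeholders):

    # airflow.cfg
    [core]
    executor = CeleryExecutor

    [celery]
    # the "sqla+" prefix lets any SqlAlchemy-supported DB act as the broker
    broker_url = sqla+mysql://airflow:airflow@localhost:3306/airflow
    celery_result_backend = db+mysql://airflow:airflow@localhost:3306/airflow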
Max

On Fri, May 13, 2016 at 10:05 AM, Bolke de Bruin <[email protected]> wrote:

> It was, but it wasn't broadly communicated. We will repeat it, with an
> open invitation, every week or two weeks.
>
> Now to figure out how to share a video link that works continuously
> without me or someone else being there every time...
>
> B.
>
> Sent from my iPhone
>
> > On 13 mei 2016, at 18:55, Jakob Homan <[email protected]> wrote:
> >
> > Cool. Was this a public meeting? Will the next one be?
> >
> >> On 13 May 2016 at 08:20, Chris Riccomini <[email protected]> wrote:
> >> Hey Bolke,
> >>
> >> Thanks for writing this up. I don't have a ton of feedback, as I'm not
> >> terribly familiar with the internals of the scheduler, but two notes:
> >>
> >> 1. A major +1 for the celery/local executor discussion. IMO, Celery is
> >> a net negative on this project, and should be fully removed in favor
> >> of the LocalExecutor. Splitting the scheduler from the executor in the
> >> LocalExecutor would basically give parity with Celery, AFAICT, and
> >> sounds much easier to operate to me.
> >> 2. If we are moving towards Docker as a container for DAG execution in
> >> the future, it's probably worth considering how these changes are
> >> going to affect the Docker implementation. If we do pursue (1), how
> >> does this look in a Dockerized world? Is the executor still going to
> >> exist? Would the scheduler interact directly with Kubernetes/Mesos
> >> instead?
> >>
> >> Cheers,
> >> Chris
> >>
> >>> On Fri, May 13, 2016 at 3:41 AM, Bolke de Bruin <[email protected]> wrote:
> >>>
> >>> Hi,
> >>>
> >>> We did a video conference on the scheduler with a couple of the
> >>> committers yesterday. The meeting was not meant to finalize any
> >>> roadmap, but more to get a general understanding of each other's
> >>> work. To keep it as transparent as possible, here is a summary.
> >>>
> >>> Attending: Max, Paul, Arthur, Dan, Sid, Bolke
> >>>
> >>> The discussion centered around the scheduler, sometimes diving into
> >>> connected topics such as pooling and executors. Paul discussed his
> >>> work on making the scheduler more robust against faulty Dags and on
> >>> making the scheduler faster by not making it dependent on the slowest
> >>> parsed Dag. PR work will be provided shortly to open it up to the
> >>> community, as the aim is to have this in by the end of Q2 (no
> >>> promises ;-)).
> >>>
> >>> Continuing the train of thought of making the scheduler faster, the
> >>> separation of executor and scheduler was also discussed. It was
> >>> remarked by Max that doing this separation would essentially create
> >>> the equivalent of the celery workers. Sid mentioned that celery
> >>> seemed to be the culprit behind many setup issues and that people
> >>> tend to use the local executor instead. The discussion was parked as
> >>> it needs to be discussed with a wider audience (mailing list,
> >>> community) and is not something that we think is required in the near
> >>> term (obviously PRs are welcome).
> >>>
> >>> Next, we discussed some of the scheduler issues that are marked in
> >>> the attached document (
> >>> https://drive.google.com/open?id=0B_Y7S4YFVWvYM1o0aDhKMjJhNzg). Core
> >>> issues discussed were 1) TaskInstances can be created without a
> >>> DagRun, 2) non-intuitive behavior with start_date and also
> >>> depends_on_past, and 3) lineage. It was agreed that the proposals to
> >>> add a "previous" field to the DagRun model and to make backfills
> >>> (among others) use DagRun make sense. More discussion was around the
> >>> lineage part, as that involves more in-depth changes, specifically to
> >>> TaskInstances. Still, the consensus in the group was that it is
> >>> necessary to make steps here and that they are long overdue.
> >>>
> >>> Lastly, we discussed the draft scheduler roadmap (see doc) to see if
> >>> there were any misalignments. While there are some differences in
> >>> details, we think the steps are quite compatible and the differences
> >>> can be worked out.
> >>>
> >>> So that was it; in case I missed anything, correct me. In case of
> >>> questions, suggestions, etc., don't hesitate to put them on the list.
> >>>
> >>> Cheers,
> >>> Bolke
