I mentioned this on the call yesterday as well. Going forward, all meetings
will be community-inclusive. We can follow what Apache Beam is doing (they
have 10-15+ video windows at a time) in this respect. We will need a topic and
agenda for each meeting, so that they are not misconstrued as "office-hours"
or free discussion meetings. We can hold those as well, but the topic and
agenda of each meeting will help effectively manage larger meetings.
We can use Gitter, Twitter, Confluence, and the dev list to announce the
meetings and share the agenda ahead of time.
Also, I feel there is no need for committers to be the sole initiators of these
meetings. We should make that clear to the community. However, if some
users/contributors do set up a meeting, it may be a good idea for some
committers to attend to help answer any questions, etc.
-s
On Friday, May 13, 2016 4:55 PM, Jakob Homan <[email protected]> wrote:
Cool. Was this a public meeting? Will the next one be?
On 13 May 2016 at 08:20, Chris Riccomini <[email protected]> wrote:
> Hey Bolke,
>
> Thanks for writing this up. I don't have a ton of feedback, as I'm not
> terribly familiar with the internals of the scheduler, but two notes:
>
> 1. A major +1 for the celery/local executor discussion. IMO, Celery is a
> net-negative on this project, and should be fully removed in favor of the
> LocalExecutor. Splitting the scheduler from the executor in the
> LocalExecutor would basically give parity with Celery, AFAICT, and sounds
> much easier to operate to me.
> 2. If we are moving towards Docker as a container for DAG execution in the
> future, it's probably worth considering how these changes are going to
> affect the Docker implementation. If we do pursue (1), how does this look
> in a Dockerized world? Is the executor going to still exist? Would the
> scheduler interact directly with Kubernetes/Mesos instead?
>
> Cheers,
> Chris
>
> On Fri, May 13, 2016 at 3:41 AM, Bolke de Bruin <[email protected]> wrote:
>
>> Hi,
>>
>> We did a video conference on the scheduler with a couple of the committers
>> yesterday. The meeting was not there to finalize any roadmap but more to
>> get a general understanding of each other's work. To keep it as transparent
>> as possible, here is a summary:
>>
>> Who attended:
>> Max, Paul, Arthur, Dan, Sid, Bolke
>>
>> The discussion centered on the scheduler, sometimes diving into
>> connected topics such as pooling and executors. Paul discussed his work on
>> making the scheduler more robust against faulty DAGs and on making the
>> scheduler faster by not making it dependent on the slowest parsed DAG. PR
>> work will be provided shortly to open it up to the community as the aim is
>> to have this in by end of Q2 (no promises ;-)).
>>
>> Continuing the train of thought of making the scheduler faster, the
>> separation of executor and scheduler was also discussed. It was remarked by
>> Max that doing this separation would essentially create the equivalent of
>> the celery workers. Sid mentioned that celery seemed to be a culprit of
>> setup issues and people tend to use the local executor instead. The
>> discussion was parked as it needs to be discussed with a wider audience
>> (mailing list, community) and is not something that we think is required in
>> the near term (obviously PRs are welcome).
>>
>> Next, we discussed some of the scheduler issues that are marked in the
>> attached document (
>> https://drive.google.com/open?id=0B_Y7S4YFVWvYM1o0aDhKMjJhNzg). Core
>> issues discussed were 1) TaskInstances can be created without a DagRun, 2)
>> non-intuitive behavior with start_date and depends_on_past, and 3)
>> lineage. It was agreed that the proposals to add a previous field to the
>> DagRun model and to make backfills (among others) use DagRun make sense.
>> More discussion was around the lineage part, as that involves more in-depth
>> changes, specifically to TaskInstances. Still, the consensus in the group
>> was that it is necessary to take steps here and that they are long overdue.
>>
>> Lastly, we discussed the draft scheduler roadmap (see doc) to see if there
>> were any misalignments. While there are some differences in details, we
>> think the steps are quite compatible and the differences can be worked out.
>>
>> So that was it. If I missed anything, please correct me. If you have
>> questions, suggestions, etc., don’t hesitate to put them on the list.
>> Cheers
>> Bolke
>>