Hey all, I have updated our meeting notes document to summarize the discussion from our 27th June dev call for Airflow 3.0.
Link: https://cwiki.apache.org/confluence/x/8ApeEg#Airflow3Devcall:MeetingNotes-27June2024

To all those who attended, can you please double-check and add if I have missed anything? To all those who didn't join, if you disagree with anything in the Summary, please voice your opinion.

I will send a separate email for the agenda for the next meeting on 11th July.

Regards,
Kaxil

------

Including the Summary here too (might break formatting):

Catch-up on action items from last call

- Kaxil Naik updated the AIP template <https://cwiki.apache.org/confluence/pages/templates2/viewpagetemplate.action?entityId=90210323&key=AIRFLOW> to include the "Migration effort" section. If the template is used to create a new AIP doc (click "Create" → "Airflow Improvement Proposal" on the wiki) it should pre-populate sections. The access to editing the template itself is limited to PMC members & ASF Confluence Admin group. Need to retroactively add that section in existing AIPs planned for 3.0.
  [image: Screenshot of How to create a page using template]
- SLA PR <https://github.com/apache/airflow/pull/36639>: This is still pending review.
- Marked AIP-51 <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-51+Removing+Executor+Coupling+from+Core+Airflow?src=contextnavpagetreemode> as completed, and the pending UI issue <https://github.com/apache/airflow/issues/27933> will be covered in the UI refactor.
- AIP-61: Niko Oliveira is still targeting 2.10 for completion. PRs are raised & pending reviews: #40472 (Backfill part) <https://github.com/apache/airflow/pull/40472> & #40017 (Scheduler part) <https://github.com/apache/airflow/pull/40017>.

Discuss: Workstreams & workstream owners (Airflow 3 Workstreams <https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3+Workstreams>)

Updates on the *existing AIPs for the 3.0* list:

- AIP-66 (DAG Versioning - Execution side): Jedidiah Cunningham will have the draft by the next dev call.
- AIP-72 (Task Execution SDK): No updates this week. Ash Berlin-Taylor aims to have the VOTE open for it by the next dev call.
- AIP-38 (UI Modernization): Brent is focussing on Airflow 2.10 deliverables; post that, he will focus on replacing this AIP with a new Umbrella AIP. There was a consensus on the call that we should split this AIP into multiple AIPs (with an Umbrella AIP) as below:
   - React'ifying the UI & changing the UX required for a modern Webserver
   - FAB removal: The removal of Flask Appbuilder will have backend implications such as changes needed to Authentication (currently, it is done via FAB Auth backends <https://flask-appbuilder.readthedocs.io/en/latest/security.html>), changes to the Plugin interface, custom API endpoints etc.
   - New Features like DAG Folders/DAG Groups
- AIP-57 Refactor SLA Feature: The initial AIP was created with 2.x in mind, so it should be revised. Some things might be easier with 3.0, as we can make breaking changes. One of the contention points last time was around whether the SLA duration time should start from DAG/task start or should be based on absolute time, as reflected in this PR comment <https://github.com/apache/airflow/pull/36639#issuecomment-2021080626>. Shubham Mehta and Kaxil Naik will reach out to Sungwon Yun to see if he is interested in being part of this effort.
- AIP-65 (DAG Versioning - UI side): Before the next dev call, Jedidiah Cunningham will decide whether to keep this AIP or make it part of any existing DAG Versioning or UI Modernization AIP.
- AIP-67 (Multi-team): Jarek Potiuk will revise this AIP based on all the discussions and new AIPs to present the overview in the next dev call. Shubham Mehta and his team are planning to co-own/contribute to this AIP with Jarek.
- AIP-68 (Extending plugin interface): Jens Scheffler & Brent will create a POC PR to determine the direction to take. Jens will do it before the next dev call to then get the final feedback on 2.10.x vs 3.0.
- AIP-69 (Remote Executor): Jens Scheffler organized a call with interested contributors a day after the dev call to get feedback. A summary of it is posted here <https://lists.apache.org/thread/h2nxkto0lxgjnqj8yps0qsh7ppbccx6g>.

Updates on the "*other candidates for 3.0*" list:

- *Enhanced Data Awareness*: Tzu-ping Chung and Constance Martineau created draft AIPs (AIP-74 <https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-74+Introducing+Data+Assets>, AIP-75 <https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-75+New+Asset-Centric+Syntax>, AIP-76 <https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-76+Asset+Partitions>, AIP-77 <https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-77+Asset+Validations>) for this workstream (under Umbrella AIP: AIP-73 <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-73+Expanded+Data+Awareness>). These AIPs are marked for Airflow 3.0, but AIP-77 might move to Airflow 3.1; TBD for now.
- *Scheduler-Managed Backfills* (aka Backfills at scale): Daniel Standish has an internal draft for it that will be published around next week.
- *Poll external Datasets to have event-based DAG scheduling*: Vincent BECK volunteered to own this epic for *3.0*. He and Shubham Mehta plan to publish a draft AIP in the next 2 weeks.
- Dennis Ferruzzi has volunteered to own the "*Inspect & Simplify Airflow Configurations*" & "*Inspect & Revamp the cardinality of Metrics*" streams.
- *Synchronous DAG Execution*: No update this week, but targeting a proposal after the next dev call.
- *Make Execution Date non-unique for a DAG*: No update for now; TP is focussing on the Data Awareness AIPs.
- *Scheduler Performance Improvements*: Some areas in the scheduler have been identified for improvements. If anyone has any specific areas of the Scheduler loop that they would like to own, please add them to the list on Airflow 3 Workstreams <https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3+Workstreams>. The performance goals would be tied to the specific area of optimization rather than the scheduler as a whole.
- *Consider Developing Airflow and Providers separately*: In the coming weeks, Kaxil Naik will draft a pros-and-cons document on this epic and send it to the mailing list.
- *Respect permissions in CLI*: Buğra Öztürk has volunteered to own this stream and, ideally, will start working on a proposal next.
- *Improve security of Airflow Supply Chain*: Once things are more defined, Jarek Potiuk will send a document in the coming month describing this stream of work and its impact on dependencies.
- *Observability of Callbacks on UI*: Some of this might be covered in AIP-72 & AIP-69, but we will keep this item in the table so we don't forget about it.
- *Remove StatsD and replace it with Prometheus as a first-class citizen*: We agreed that we should replace StatsD with OpenTelemetry and not Prometheus. It needs an owner who could figure out the impact of breaking changes since StatsD metrics are utilized by all users, including Airflow service providers, to monitor and alert on the health of Airflow deployments.
- *Overhaul Operator Templating behaviour*: Tzu-ping Chung mentioned he would be happy to write the design document & asked for someone to work on the implementation. Shahar Epstein expressed his interest in working on this effort.
- Shubham Mehta expressed interest in working on a native DAG Factory that is currently on the 3.1 list. We would keep this for 3.1 for now and reevaluate based on the progress of other epics.
- There have been no noteworthy updates on any other streams. Kaxil Naik will create GitHub issues for items in that table with no owners and post them on the mailing list to see if anyone is interested in leading them. If there are no takers by the end of August, they will be moved to the 3.1+ list.

The next dev call will be on 11 Jul 2024.