Hey everyone,

I updated our meeting notes document in the Airflow wiki to capture the
notes from our dev call on Thursday, the 5th of December. The link for
those notes is here
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=308153072#Airflow3Devcall:MeetingNotes-Summary.14>

Loved the progress on the FAB compatibility project, DAG Bundles and
Versioning, Data Assets, and the discussion around Data completeness. Great
work team!

To everyone who attended the meeting, please check the summary and add
anything that I may have missed.
For those who could not join, please let us know if you disagree with
anything discussed and agreed upon in the meeting. Also, please do ask
questions if something is unclear.
There's already an initial agenda for our next dev call, which is scheduled
for 19th Dec. If you would like something to be added to the proposed
agenda for that meeting, please add it here
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=308153072#Airflow3Devcall:MeetingNotes-(Proposed)Agenda.4>
or let me know.

Best regards and talk to you all soon,
Vikram
--


Below is the summary from the call on Thursday:
--

   - Follow-up on action items from the last call:
      - Update on the FAB provider for backwards compatibility project (Jed
      Cunningham and Vincent Beck):
         - Jed and Vincent shared the progress to date including a PR
         <https://github.com/apache/airflow/pull/44464> that already
         implements plug-in backwards compatibility.
         - Vikram Koka expressed appreciation for the progress and asked
         about the expected timing of the remaining items. Their response
         was that, apart from the New UI completeness blocker, the other
         items could be done by mid-Jan.
         - Jens Scheffler suggested that the PR to validate dependencies
         without the new UI be created as a draft and validated with the
         existing functionality of the new UI, rather than waiting for the
         new UI to be completed.
      - Update on Performance benchmark scenarios (Michal Modras):
         - Augusto shared the thinking around performance benchmark
         scenarios and metrics
         <https://docs.google.com/document/d/1kyKXkILkHSrkXYCnje-Lev4I983szjfI1tFKhqimj_8/>,
         with a focus on DAG performance and resource consumption.
         - Augusto shared that this was a follow-up on the work already
         done on AIP-59 and would be based on the existing performance
         framework.
         - There was significant discussion around the task timings and
         whether those were relevant for realistic performance benchmarks.
         - Jens asked if this would cover different executors and Augusto
         responded that this would be Celery first and possibly the
         Kubernetes executor later.
         - Jens and Vikram brought up comparing the performance of Airflow
         2.10 vs. Airflow 3 to identify performance differences. Augusto
         confirmed that all the tests would be run on both Airflow 2 and 3
         to confirm performance changes.
   - Development updates and presentations:
      - Update on AIP-75 New Asset-Centric Syntax
      <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-75+New+Asset-Centric+Syntax>
      (TP Chung):
         - TP shared a recording of the new syntax for asset creation.
         - TP also showed the demo of a new Airflow CLI command to list all
         the Data Assets and to show the details of a specified Data Asset.
         - Finally, TP also introduced the "materialized" command for a
         data asset, which ensures that the asset is created by running the
         DAG which outputs that asset.
      - Update on AIP-66: DAG Bundles & Parsing
      <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=294816356>
      (Jed Cunningham):
         - Jed demonstrated the process of defining DAG bundles and how DAG
         bundles would be parsed by the DAG processor.
         - He mentioned how some of the changes are happening in
         conjunction with the changes being done in AIP-72.
         - He also showed bundle IDs and bundle versions. He then showed how
         a new version is parsed and reprocessed.
         - He mentioned that there is much more work to be done, but the
         core of bundle definitions and DAGs being processed from those
         bundles is now in place.
         - In response to questions, he clarified that DAG Bundles currently
         pull down the entire Git clone into a temporary folder, so that all
         DAGs and their friends/dependencies can be processed, and that
         further optimization is very feasible.
      - Update on AIP-78 Scheduler-managed backfill
      <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-78+Scheduler-managed+backfill>
      (Daniel Standish):
         - Daniel said that all the back-end server work for this AIP as
         scoped has been complete for a while. He added that the front-end
         UI work will be done as part of AIP-38.
         - He added, however, that there is a Data completeness conversation
         to be had as a result, which led to the discussion below.
   - Discussion topics:
      - Data completeness discussion (Daniel Standish):
         - Daniel brought up the concept of implicit data partitioning
         already in Airflow with the concept of execution date, when catchup is
         defined to be True.
         - Daniel advocated making this implicit data partitioning an
         explicit concept in Airflow, arguing that the existing grid view is
         already an incarnation of the same.
         - At a high level, users could declare that a DAG is
         partition-driven, based on the timetable. Going forward, Backfills or
         catchup would only be supported for partition-driven DAGs.
         - For backwards compatibility, old DAGs would be assumed to be
         partition-driven.
         - The immediate reaction from the team was that this is a big
         change, and there was significant discussion about whether it is
         absolutely required.
         - Daniel said that the trigger for this was AIP-78
         Scheduler-managed backfill
         <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-78+Scheduler-managed+backfill>
         and AIP-83 Remove Execution Date Unique Constraint from DAG run
         <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-83+Rename+execution_date+-%3E+logical_date+and+remove+unique+constraint>,
         which left a bit of a vacuum between them.
         - The follow-up action item after the discussion was for Daniel to
         share thoughts async and everyone to think about the need for this.
      - Milestone and scope update (Vikram Koka):
         - Vikram shared that at a high level development was on track
         towards the plan shared earlier.
         - However, there would be one scope change, with AIP-80 Explicit
         Template Fields in Operator Arguments
         <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-80+Explicit+Template+Fields+in+Operator+Arguments>
         being deferred from 3.0 to a future 3.x release.
   - Action items on/before next dev call:
      - Daniel Standish to post a document regarding explicit vs. implicit
      partitioning and its need as a result of the removal of execution date,
      especially with an eye towards backwards compatibility. Team to consider
      the introduction of a partition concept in Airflow.



