Hey everyone,

Thank you for attending the dev call on Thursday. I updated our meeting
notes document in the Airflow 3.x wiki
<https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3.x> to capture
the notes. The link for those notes is here:
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=373886699#Airflow3.xDevCall:Meetingnotes-Summary.11>

The meeting continued the focus on user feedback regarding Airflow 3 and
solving adoption issues. I have also updated the Airflow 3.x wiki page
<https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3.x> with a
specific "Airflow 3 adoption focus" section.

To everyone who attended the meeting, please check the summary and add
anything that I may have missed. For those who could not join, please let
us know if you disagree with anything discussed and agreed upon in
the meeting. Also, please do ask questions if something is unclear.

Our next meeting is scheduled for the 20th of November at the same time. Please
let me know if you would like to add anything to the agenda
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=373886699#Airflow3.xDevCall:Meetingnotes-ProposedAgenda.12>
.

Best regards,
Vikram
--
Below is the summary from the call:

   - Catch-up on action items from last call:
      - DAG import issues (Dheeraj)
         - Dheeraj said that he had re-tested the upgrade process and that
         the Ruff-based utilities, when run with autofix, had significantly
         improved DAG compatibility from Airflow 2 to Airflow 3: over 50% of
         all the DAGs parsed successfully with Airflow 3 without needing
         manual changes.
         - Dheeraj went on to say that the remaining issues requiring
         manual fixes were with:
            - the airflow.utils days_ago method,
            - DB create session no longer being available, because of
            the removal of direct database access,
            - the Simple HTTP Operator deprecation and the Bash Operator
            being moved.
         - Dheeraj's summary was that the migration timeline after using
         the utilities would be about 3-4 days to achieve 80-90% DAG
         compatibility.
         - The only remaining issue in his mind was UI performance, where
         there seemed to be a noticeable slowdown compared to Airflow 2.x.
         - This report raised a fair number of questions and much
         discussion in the meeting itself. It was very helpful for the rest
         of the team to hear Dheeraj's feedback!
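
As an illustration of the days_ago item above, here is a minimal sketch of a
stand-in helper that uses only the Python standard library. It assumes the
intent is "midnight UTC, n days before today". The function name days_ago_utc
and its optional now parameter are hypothetical and are not part of Airflow,
so please treat this as a sketch and check the upgrade documentation for the
recommended replacement.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def days_ago_utc(n: int, now: Optional[datetime] = None) -> datetime:
    """Return midnight UTC, n days before the current (or supplied) time.

    Hypothetical helper for illustration only; not an Airflow API.
    """
    # `now` can be injected so the helper is deterministic in tests.
    current = now if now is not None else datetime.now(timezone.utc)
    # Normalize to UTC (a naive datetime is interpreted as local time).
    current = current.astimezone(timezone.utc)
    # Truncate to midnight, then step back n whole days.
    midnight = current.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=n)
```

A DAG that previously relied on a relative start date could call such a
helper instead, although a fixed start date is often easier to reason about.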


   - Development Updates and Presentations:
      - Airflow 3.1.x patch release update (Ephraim Anierobi)
         - Ephraim said that 3.1.2 had been released successfully.
         - Jarek said that one issue had been reported right after the
         release about disappearing logs, which may be critical and require
         a follow-on patch release.
         - Rahul chimed in to say that this log issue was reproducible and
         a fix had also been identified and tested.
         - There was agreement that this may require a 3.1.3 release very
         soon, instead of waiting for the two-week release cycle.
         - This is currently scheduled for this week and added to the Airflow
         3.x wiki page
         <https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3.x>
      - UI performance issues (Pierre Jeambrun)
         - Pierre reported that a number of N+1 query problems had been
         identified and were being resolved, and that guard rails were being
         put in place. The root cause was the serialization layer
         lazy-loading relationships in loops.
         - Pierre also referenced an issue that had identified missing
         indexes as a source of slowness; this was being resolved by
         creating new indexes. Vikram raised his concern that new index
         creation could cause issues in the "DB migration" part of an
         Airflow upgrade. Ash concurred with the concern and proposed
         making index creation part of the API server / Scheduler startup
         rather than part of the migration.
         - Brent added that the Grid view performance remains challenging
         and that additional optimization work was being planned after the
         N+1 fixes were complete.
         - There was also discussion about FastAPI configuration changes
         because of scaling differences from the Flask approach. This
         triggered a need to update the documentation with scaling
         recommendations.
      - Auth issues (Vincent Beck)
         - Vincent reported that issues related to Auth were being resolved
         and that he had taken this on at Vikram's request.
      - Expanding Task SDK Integration test framework with more tests
      (Amogh)
         - There was a quick ask for help from Amogh requesting community
         contributions to the Task SDK integration test framework.
         - Amogh said that the complexity was higher than in previous
         efforts and may require a SIG on Slack for coordination.
   - Discussion topics:
      - Issue triage process (Vikram)
         - Vikram followed up on his email summary of issues sent to the
         dev list earlier, saying that the "needs triage" label was applied
         to 73 of the 284 open issues related to Airflow 3, and that the
         label still seemed to be applied even after a PR had been created
         to address the issue.
         - Jarek chimed in to say that this was unintentional and that at
         least he himself often forgot to assign or remove labels during
         the review process.
         - Vikram proposed organizing issues into logical swim lanes, with
         volunteer owners for those lanes, such as:
            - Auth issues: Vincent leading
            - UI / API issues: Pierre and Brent leading
            - Data aware scheduling: TP and Wei leading
            - Edge Worker: Jens leading
         - There was some discussion around this, with senior contributors
         such as Ash saying that they look at everything rather than at
         individual areas. However, Vincent and Jens chimed in saying that
         this would help them focus their attention. Brent added that
         issues were sometimes mislabeled with the UI tag, but that this
         was solvable by reassigning the UI-labelled issues after initial
         triage.
         - An aspiring contributor commented that these swim lane labels
         would also be useful for issues tagged with "Good first issue", so
         that they could pick something to work on based on their own
         skills and interests.
         - Elad pointed out that this needs to be tried out in practice and,
         if it works, also applied to PRs, since there are many PRs sitting
         waiting for approval. At this point, we have hit a record of 343
         open PRs in the project!
      - Thoughts on how to document DB access options in Airflow 3 upgrade
      docs (Amogh)
         - Amogh said that the Database access options topic had raised a
         lot of discussion on the PR.
         - As a result, Amogh started a dev list discussion and was looking
         for input. Based on that, a lazy consensus would be started in the
         middle of next week.
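
For anyone less familiar with the N+1 query pattern mentioned under the UI
performance item, here is a small self-contained sketch using Python's
built-in sqlite3 module. The table names, columns, and helper functions are
made up for illustration and are not Airflow's actual schema or the actual
fix; the sketch only contrasts one query per parent row (as happens when
relationships are lazy-loaded in a loop) with a single joined query.

```python
import sqlite3


class CountingDB:
    """Thin wrapper that counts how many queries are issued."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.count = 0

    def query(self, sql: str, params: tuple = ()) -> list:
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


def build_db(num_runs: int = 3) -> CountingDB:
    """Create an in-memory database with hypothetical run/task tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dag_run (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE task_instance "
        "(id INTEGER PRIMARY KEY, run_id INTEGER, state TEXT)"
    )
    for run_id in range(1, num_runs + 1):
        conn.execute("INSERT INTO dag_run (id) VALUES (?)", (run_id,))
        for state in ("success", "failed"):
            conn.execute(
                "INSERT INTO task_instance (run_id, state) VALUES (?, ?)",
                (run_id, state),
            )
    return CountingDB(conn)


def states_n_plus_one(db: CountingDB) -> dict:
    """One query for the runs, then one more query per run (N+1 total)."""
    result = {}
    for (run_id,) in db.query("SELECT id FROM dag_run ORDER BY id"):
        rows = db.query(
            "SELECT state FROM task_instance WHERE run_id = ? ORDER BY id",
            (run_id,),
        )
        result[run_id] = [state for (state,) in rows]
    return result


def states_single_query(db: CountingDB) -> dict:
    """Fetch the same data with a single JOIN, grouping in Python."""
    rows = db.query(
        "SELECT r.id, t.state FROM dag_run r "
        "JOIN task_instance t ON t.run_id = r.id "
        "ORDER BY r.id, t.id"
    )
    result = {}
    for run_id, state in rows:
        result.setdefault(run_id, []).append(state)
    return result
```

With N runs, the first function issues N+1 queries and the second issues one,
while both return the same mapping in this example. Note that the inner join
would omit runs that have no task rows.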

--

Vikram Koka
Chief Strategy Officer
Email: [email protected]


<https://www.astronomer.io/>
