Hey everyone,

Thank you for attending the dev call on Thursday. I updated
our meeting notes on the Airflow wiki and the link for those notes is here
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=373886699#Airflow3.xDevCall:Meetingnotes-Summary.31>

To everyone who attended the meeting, please check the summary and add
anything that I may have missed. For those who could not join, please let
us know if you disagree with anything discussed and agreed upon in
the meeting. Also, please do ask questions if something is unclear.

Our next meeting is scheduled for the 26th of February at the same time.
The agenda is already populated, primarily with swim lane updates and
Airflow 3.2 AIP updates. If you would like to discuss a particular topic on
this call, please let me know so that it can be added to the agenda
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=373886699#Airflow3.xDevCall:Meetingnotes-ProposedAgenda.33>.

Best regards,
Vikram
--
Below is the summary from the call:

   - Swim lane updates:
      - UI Test framework (Rahul Vats):
         - Rahul shared that we now have good coverage on the E2E UI tests,
         with more tests added over the last two weeks, bringing the total
         to 83.
         - He said that some of the end-to-end tests have been converted to
         unit tests to reduce CI times, and that this work was ongoing,
         since CI was still taking around 14-15 minutes despite the move
         because of the additional tests.
      - UI / API update (Pierre):
         - Pierre confirmed that API issues were only for those supporting
         the UIs. The API issues supporting task execution were now flagged
         under TaskSDK.
         - Pierre shared that the team was using the test deployment created
         by Rahul, which contains millions of task instance records, to
         identify performance issues, and that there was great community
         participation in this process, including endpoint improvements
         with caching.
   - Airflow 3.2 development updates:
      - AIP-76 Asset Partitions (Wei Lee):
         - Wei Lee shared a recorded demo showcasing the progress made to
         date on Asset Partitions, which showed support for date-based
         partitions, leveraging timetables.
         - The concepts of flexible date-based partitions such as Hourly,
         Daily, Weekly, etc., were clearly demonstrated, and the overall
         demo was very well received by the team.
         - There were some questions around the partition keys and the
         visibility of those partition keys in the UI, which the team agreed
         to take offline with Wei. Vikram also requested community feedback
         on potential issues with mismatched partition keys and conditions
         for subsequent triggering.
      - AIP-86 Deadline alerts (Dennis):
         - Dennis said that the last two PRs were ready for review, with
         synchronous callbacks working in the local executor and abstracted
         in the base executor.
         - Dennis shared that the Celery implementation was nearly complete
         and that he would be asking for community help on the other
         executors.
         - Dennis said that they were well positioned to hit the code
         freeze target date of the week of Feb 26th.
         - Vikram to follow up with Dennis async regarding the
         configuration tradeoffs around sync callback execution, concurrency
         controls, and timeliness of alerts.
      - AIP-67 Multi-team (Niko / Vincent):
         - Vincent said that they were working on adding minimal multi-team
         functionality to the simple auth manager for testing.
         - Rajesh said that Niko had asked for community help on the
         Kubernetes executor for multi-team, but that there wasn't much
         progress here yet.
      - AIP-98 Async Python Operator (David Blain):
         - Vikram enquired if there was any progress on the documentation
         around the Async Python Operator, specifically the usage guidance as
         compared to Deferrable operators.
         - There didn't seem to be, so Vikram to follow up async with David
         on this.
   - Discussion topics:
      - AIP-99 Common data access patterns (Pavan):
         - Pavan presented a comprehensive overview of the work planned for
         this AIP, specifically including: SQL query generation using DB
         schemas, human-in-the-loop review before execution, data transfer
         operators via DataFusion, and support built on all existing Airflow
         database hooks.
         - Pavan also showed a quick demo which covered: automatic schema
         fetching from Postgres, SQL generation with validation, and XCom
         integration for query results.
         - Pavan said that the implementation approach would be to start
         with basic SQL operators and then expand to multiple databases.
         - The overview and demo were very well received by the team.
         - Vikram asked for broader interfaces to be defined first, before
         going broad with database support and Pavan agreed with that guidance.
      - AIP-100 Task Priorities (Natanel and Theo S):
         - Natanel shared the analysis and draft design approaches written
         up by the two of them (Theo could not make the call) regarding
         priority-based scheduling within Airflow.
         - Natanel said that the current priority-based scheduling causes
         starvation at significant scale, when concurrency limits are hit
         with worker saturation. Based on their research, the proposed
         solutions are built on a combination of priority + aging.
         - Jens shared his feedback (through Vikram) that he very much
         appreciated the analysis, but was unconvinced by any of the
         currently proposed solutions.
         - Vikram also commended the team on their research into the
         problem, and added that he was surprised that task priorities still
         existed in Airflow, saying that he had thought we had deprecated
         them a long time ago (since Airflow 2).
         - The general consensus was that the topic needed more in-depth
         offline thought before reaching a conclusion on an algorithm and a
         migration strategy.
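
For readers unfamiliar with the "priority + aging" idea mentioned under
AIP-100, here is a minimal, generic sketch of how aging can counter
starvation. It is not the design proposed in the AIP and does not use any
Airflow API: the QueuedTask class, the linear aging formula, and the
aging_rate parameter are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueuedTask:
    """Hypothetical stand-in for a queued task; not an Airflow class."""
    name: str
    priority: int       # higher value = more important
    enqueued_at: float  # seconds since an arbitrary reference point


def effective_priority(task: QueuedTask, now: float, aging_rate: float) -> float:
    """Base priority plus a bonus that grows linearly with waiting time."""
    waited = max(0.0, now - task.enqueued_at)
    return task.priority + aging_rate * waited


def pick_next(queue: list[QueuedTask], now: float, aging_rate: float) -> QueuedTask:
    """Return the task with the highest effective priority (ties: oldest first)."""
    return max(
        queue,
        key=lambda t: (effective_priority(t, now, aging_rate), -t.enqueued_at),
    )


queue = [
    QueuedTask("low_priority_old", priority=1, enqueued_at=0.0),
    QueuedTask("high_priority_new", priority=10, enqueued_at=95.0),
]

# With aging disabled, the high-priority task wins no matter how long the
# other task has waited, which is how starvation can arise under load.
strict_choice = pick_next(queue, now=100.0, aging_rate=0.0)

# With aging enabled, the long-waiting task eventually overtakes it.
aged_choice = pick_next(queue, now=100.0, aging_rate=0.1)
```

In this toy setup the aging rate controls the tradeoff: a rate of zero gives
strict priority ordering, while a larger rate moves the behavior closer to
first-in, first-out.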


-- 

Vikram Koka
Chief Strategy Officer


<https://www.astronomer.io/>
