Thanks everyone for participating today. Once again, quite a number of people were interested.
Meeting notes here: https://docs.google.com/document/d/14mzVkvm5GheCCAcMUzOBN9hw-2aWZKQgQYDzTJIiGFg/edit#
Video recording here: https://drive.google.com/file/d/13cRteIuB-6NJm8lKWACiO_VMTWOkEHCR/view?usp=drive_web
Chat transcript here: https://drive.google.com/file/d/1i4vnwB_2tcQHkaywcse6FsWpufp3kj9f/view?usp=drive_web

I came up with a nice way to quickly list all attendees :). Look for yourself in the notes :)

Summary of today's Multi-Tenancy discussion #4 (this covered a bit more than Multi-Tenancy, but everything discussed was related to the work we do in this area, with some cross-dependencies on existing/proposed AIPs):

Notes:

- AIP-43 (DAG Processor separation) -> Mateusz: In progress. Things are progressing without surprises.
- AIP-44 (Internal API) -> Jarek: came with a variation on the approach to implementing the logic of the replaced function. Speculatively:
  - Using RPC in-memory or over a local TCP/Unix domain socket might improve maintainability and make it easier to implement DB isolation (benchmarking is needed to check that the overhead is not a problem).
  - Apache Thrift and gRPC have been proposed as viable implementations.
  - The discussion around scalability, hops, SSL and deployment scenarios led to the conclusion that we need to describe this in the documentation (and link to the deployment documentation of the chosen RPC technology).
  - Before voting, some more benchmarking and testing is needed (Jarek, with the engagement/help of Evgeni and Giorgio). Evgeni has experience from Databand with similar approaches, and this can be reused.
- Ping: presented AIP-45 (Remove Double Dag Parsing):
  - Ash: Potential problem with deps when they are set dynamically (they are not serializable). Possibly this can be replaced by the Scheduler doing extra work.
  - The savings matter mostly for big DAGs and the run_as_user scenario only.
  - General consensus: the idea is good and needs clarification of the deps case, but it seems everyone likes the solution, especially since it shifts some code in a good direction (airflow local without DAG parsing at all, airflow run doing the heavy lifting, scheduler doing a bit more with "deps").
- Ping: presented AIP-46 (Docker Runtime Isolation):
  - Good idea.
  - We all agree that this should be an optional add-on rather than an Airflow feature. Instead of implementing it in the core of Airflow, Airflow should be extended with the necessary hooks to make it possible to provide "matching" runtimes for Parsing and Scheduling.
  - Rather than implementing Docker Runtime code that the Airflow community would have to maintain, this way Airbnb or others can provide their own Parse/Execute "runtime" implementation.

Action items:

- Mateusz: continues AIP-43
- Jarek (+Evgeni/Giorgio): benchmarking/analysis of implementation details for AIP-44
- Ping: AIP-45 - deeper dive on what to do with deps
- Ping: AIP-46 - look at updating AIP-46 with details and a description of how to modify Airflow to allow pluggable runtimes for parsing and execution (which might get an Airbnb example implementation using Docker Runtime).

J.
