> 1. How does this impact work of release managers? I see them
already working pretty hard to release providers and Airflow core. I
wonder if this makes their task even harder, especially as these
require more rigorous testing as compared to providers.

Good point. I think it is less of a problem for release managers and
more for those who contributed related changes. Contrary to what you
might know from elsewhere, release managers DO NOT test Airflow when
releasing. Release managers are merely executing the mechanics of the
release preparation and can make decisions in case someone finds issues
that are potentially blocking:
https://infra.apache.org/release-publishing.html#releasemanager

> Most projects designate a committer to be the release manager who takes 
> responsibility for the mechanics of a release. It is a good idea to let 
> several committers take this role on different releases so that more than one 
> person is comfortable doing a release. Release managers shepherd a release 
> from an initial community consensus to getting the compiled code package to 
> final distribution, and may be involved in publicizing the release to the 
> project's community and the ASF in general.

Our current process for providers already involves asking the
community to test the release candidate, including the automated
pinging on the GitHub issue and asking those who contributed the
changes to test them: https://github.com/apache/airflow/issues/29103.
None of that changes. If anything, the executors have the potential
to be MUCH better tested and safer to use. The usual saying that I
keep on repeating - if something is painful when it comes to
releases, do it more often.

Moving the K8S executor to the provider has two side effects: changes
to the K8S executor will come in smaller increments, and they will
never be split between the cncf.kubernetes provider and Airflow core
(which on its own is a recipe for much more stable behaviour). More
importantly, users will be able to go BACK to a working K8S executor
when they upgrade cncf.kubernetes. Let this sink in for a while - if
you (as a deployment manager) find a problem in the K8S executor
after upgrading the provider, you will be able to fix it by
selectively downgrading the provider to the previous version, without
having to downgrade Airflow. You will even be able to keep using the
executor that you used and loved in Airflow 2.6 (say) even after you
upgrade to Airflow 2.7 (provided, of course, that our BaseExecutor
API is stable). We can also add explicit version limits in Airflow if
we want to make backwards-incompatible changes - by adding a
minimum-version requirement for the cncf.kubernetes provider. That
gives both sides a lot of flexibility and control - maintainers can
impose certain requirements, while users are free to move within the
limits the maintainers set.
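
To make the minimum-requirement idea concrete, here is a rough sketch
(the list name and the version numbers are placeholders I made up,
not the actual contents of Airflow's setup.py) of how core could pin
lower bounds for the pre-installed providers:

    # Illustrative sketch only - the list name and the versions are
    # placeholders, not Airflow's real dependency definitions.
    PREINSTALLED_PROVIDERS = [
        "apache-airflow-providers-celery>=3.3.0",           # made-up minimum
        "apache-airflow-providers-cncf-kubernetes>=7.0.0",  # made-up minimum
        "apache-airflow-providers-common-sql",
        "apache-airflow-providers-http",
        "apache-airflow-providers-sqlite",
    ]

With such lower bounds in place, Airflow 2.6+ would simply refuse to
install with a provider that predates the executor move, while users
remain free to pick any newer provider release.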

I'd say all those effects are only positive.

> 2. As part of AIP-44, should we create an internal API for executors? I 
> recall a discussion in Slack (related to ECS) where we aspire to remove all 
> database access from providers.

This is an excellent point that has some interesting consequences.
Glad that you raised it.

We do not need an Internal API for executors. In the case of AIP-44
it does not matter where the code comes from; what matters is the
context in which the code is executed (i.e. scheduler/task/triggerer,
etc.). Executors always run in the scheduler context, which means
they should use direct DB access - the same as all other
scheduler-run code.
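
As a minimal sketch of what "direct DB access in the scheduler
context" means in practice (the executor name and body below are
hypothetical, not actual Airflow code):

    # Hypothetical provider executor. It runs inside the scheduler
    # process, so it may open a metadata DB session directly - unlike
    # task-context code, which goes through the AIP-44 internal API.
    from airflow.executors.base_executor import BaseExecutor
    from airflow.utils.session import provide_session


    class MyProviderExecutor(BaseExecutor):

        @provide_session
        def sync(self, session=None):
            # Called from the scheduler's heartbeat loop; using a direct
            # DB session is fine here because this code never runs inside
            # user-supplied task code.
            ...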

But that indeed raises another question for the future "multi-tenant"
setup. Will tenants be able to upgrade their cncf.kubernetes provider
and thereby impact the code run by the scheduler? If they could, that
would be a security problem. But this is not how the multi-tenant
setup will work.

By definition, those who run the future multi-tenant Airflow should
not allow tenants to install anything in the scheduler. It is already
a given that the scheduler has a different set of dependencies (so
possibly a different set of providers) installed than the tenant's
"workers" (and that they are managed by different people - Airflow
admins, not tenant admins). Each tenant should have (however the
deployment is managed) their own set of dependencies for "workers" +
"dag file processors", but the scheduler should have its own set,
managed independently. Such a scheduler could even have no "regular"
providers installed - just plugins that extend its capabilities with
timetables and the like. So we are already in a situation where the
scheduler has a different set of dependencies than the processors and
workers.

And it is perfectly fine for the scheduler to be installed with a
different kubernetes provider - acting as a plugin that extends the
scheduler's capabilities (managed by the central deployment team) -
while each tenant has the same or a different provider, as long as it
is compatible with the Airflow version used. The scheduler will only
use the "executor" part of the provider, while workers and dag file
processors will only use the KPO part of theirs. No overlap or
dependency there.

I think, however (and this is something we might consider), that
executors could only be usable if exposed by providers via the plugin
mechanism/entrypoint, similarly to Timetables. And for security, the
scheduler might only accept executors that are configured via the
plugin mechanism. This would be slightly backwards incompatible and
is not strictly necessary (because someone would have to have access
to the configuration to use an executor that does not come via a
plugin) - but we might consider it to enhance security a bit and add
another layer of it. We already have a precedent where the hive
provider exposes both a provider entrypoint and a plugin entrypoint,
and we could do the same for cncf.kubernetes and celery. Worth
considering for sure, but not strictly necessary.
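
For illustration, a provider could ship a plugin roughly like the
sketch below. The "timetables" field is the mechanism plugins already
use today; the "executors" field is purely hypothetical - it does not
exist and is exactly the thing we would have to add:

    # Sketch only - the module paths are made up, and the "executors"
    # plugin field is hypothetical (it does not exist in Airflow today).
    from airflow.plugins_manager import AirflowPlugin

    from my_provider.executors import MyExecutor      # hypothetical
    from my_provider.timetables import MyTimetable    # hypothetical


    class MyProviderPlugin(AirflowPlugin):
        name = "my_provider_plugin"
        timetables = [MyTimetable]  # existing mechanism, same as today
        executors = [MyExecutor]    # hypothetical new field for executors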

>
> Best,
> Shubham
>
>
>
> On 2023-01-29, 1:21 AM, "Jarek Potiuk" <[email protected]> wrote:
>
>     Hello Everyone,
>
>     As a follow-up to AIP-51 - when it is completed (with a few more quirks
>     like the one described by Andrey in the "Rendered Task Instance
>     Fields" discussion) - it should now be entirely possible to move
>     Kubernetes Executor, Celery Executor and related "joined" executors to
>     respective providers.
>
>     There is of course a question of where to move CeleryKubernetesExecutor,
>     but we have prior art with Transfer Operators and we might choose to
>     add it to "cncf.kubernetes" and make the celery provider an optional
>     dependency of K8S provider.
>
>     This has multiple advantages, and one of the biggest ones I can think
>     of is that we could evolve the K8S executor faster than Airflow itself
>     and we would avoid quite a lot of complexities involved in parallel
>     modifications of K8S code in the provider and the executor (it happened
>     in the past that we had to add a very high min-airflow version for the
>     "cncf.kubernetes" provider in order to make this happen). We could
>     likely avoid similar problems by making sure the K8S executor is used
>     as "yet another executor" - compliant with the AIP-51 interface - and
>     we could evolve it much faster.
>
>     That would also give the celery provider some "real" meaning - so far
>     the celery provider has merely exposed a celery queue sensor, but when
>     we move CeleryExecutor to the provider, it would finally become a
>     useful one.
>
>     Doing this requires a few small things:
>     * we likely need to add dynamic imports in the old "executors" package
>     following PEP 562 (the way we've done it for operators) and a
>     deprecation notice in case someone uses them from there.
>     * we likely need to add "cncf.kubernetes" and "celery" packages as
>     pre-installed providers - alongside http, ftp, common.sql, sqlite,
>     imap.
>
>     I **think**, after AIP-51 gets implemented, this would be a fully
>     backwards-compatible change - it's just a matter of proper management
>     of the dependencies. We could add min-cncf.kubernetes/celery versions
>     in Airflow 2.6, so that we are sure 2.6+ airflow uses only newer
>     providers and kubernetes/celery code and this would be fully
>     transparent for the users, I believe.
>
>     J.
>
