1. I find working with connections in a multi-region environment very hard:
We have a DAG that works with some cloud resources (for example S3).
If the DAG hits an API limit or error, I cannot fall back to another
connection, because I don't know whether the issue is with the cloud
provider or with the data itself.

So we need to orchestrate the creation of the connections from outside,
based on errors, while the error handling itself happens inside the DAG.
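
To make point 1 concrete, here is a rough sketch of the fallback I would like to be able to write inside a task. Everything in it is an assumption for illustration: `ProviderError`, `DataError` and `run_with_fallback` are hypothetical names, not Airflow APIs, and the hard part (reliably classifying a failure as a provider problem or a data problem) is exactly what is missing today.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class ProviderError(Exception):
    """Hypothetical: API limit, throttling or regional outage on the provider side."""


class DataError(Exception):
    """Hypothetical: the data itself is bad, so another connection will not help."""


def run_with_fallback(conn_ids: Sequence[str], action: Callable[[str], T]) -> T:
    """Run `action` with each connection ID in order until one succeeds.

    Only a ProviderError triggers a fallback to the next connection.
    A DataError (or any other exception) propagates immediately.
    """
    last_error = None
    for conn_id in conn_ids:
        try:
            return action(conn_id)
        except ProviderError as exc:
            # Provider-side failure: remember it and try the next connection.
            last_error = exc
    if last_error is None:
        raise ValueError("conn_ids must not be empty")
    # Every connection failed on the provider side: surface the last error.
    raise last_error
```

The list of connection IDs would be one connection per region, tried in order of preference.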

2. I find implementing security in connections very hard:

Airflow generates a new AWS STS session *(1 hour long)* that we pass to the
Docker container, where we assume the IAM role from the STS session.
Apache Spark then uses the assumed role to read from and write to the S3
bucket *(approximately 2 hours)* and fails in the middle because the session
has expired.
Likewise, if we use HashiCorp Vault to get a temporary password/token in
order to interact with an external API, we cannot use it *because it will
expire (after 2 hours)*.
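
As an illustration of point 2, this is roughly the "refresh before expiry" behaviour I would like to have available to long-running work. It is only a sketch under assumptions: `Credentials` and `RefreshingCredentials` are hypothetical names, and `fetch` stands in for whatever actually issues the secret (an STS call, a Vault request), which is not shown here.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Credentials:
    token: str
    expires_at: float  # Unix timestamp in seconds


class RefreshingCredentials:
    """Hands out credentials and re-fetches them shortly before they expire."""

    def __init__(
        self,
        fetch: Callable[[], Credentials],
        margin_seconds: float = 300.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch            # hypothetical: wraps the real STS/Vault call
        self._margin = margin_seconds  # refresh this long before expiry
        self._clock = clock            # injectable so the logic can be tested
        self._current: Optional[Credentials] = None

    def get(self) -> Credentials:
        # Fetch on first use, or when we are inside the safety margin.
        if (
            self._current is None
            or self._clock() >= self._current.expires_at - self._margin
        ):
            self._current = self._fetch()
        return self._current
```

This only helps if the consumer asks for credentials on each use instead of receiving them once at start-up, which is not the case when a one-hour session is handed to a container up front.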



On Tue, Jun 15, 2021 at 9:34 PM Canapathy, Subash <[email protected]>
wrote:

> My 2 cents. Roughly based on conversations with Airflow users internal and
> external.
>
>
>
>    - Platformization and multi-tenancy have been hard to solve on Airflow.
>    Large deployments of Airflow (multi-cluster on-prem or cloud) have a need
>    to platformize, manage/govern and vend Airflow as a service to their data
>    engineering communities internally. The feature/means to achieving this
>    will be through 1/ DAG manifests and making DAGs a top level
>    tracked/guid-ed entity that has permissions and context models associated
>    with it so that multiple tenants can operate out of the same Airflow
>    environment.
>    - Expanding DAG folders into workspaces/profiles. Users will benefit
>    from a high level construct to group DAGs. This will unlock more
>    opportunities on permission scoping and templatization of access across all
>    DAGs in that profile/folder.  This might also have some UX benefits as a
>    side effect – users dislike seeing a thousand DAGs on the list view.
>    - Modernizing the DAGProcessing->Executor loop in order to support
>    remote DAG fetcher, synced DAGs and similar features. This will lessen the
>    reliance on on-disk files as the source of truth, while DAG processing can
>    become faster purely based on DAG Manifest + serialized version of DAG. The
>    secondary process can asynchronously work to update the manifest and
>    serialized DAG based on either filesystem (default) or any configured
>    remote DAG fetcher.
>
>
>
> *From: *Ash Berlin-Taylor <[email protected]>
> *Reply-To: *"[email protected]" <[email protected]>
> *Date: *Tuesday, June 15, 2021 at 2:55 AM
> *To: *"[email protected]" <[email protected]>
> *Subject: *[EXTERNAL] Roadmap ideas for Airflow 2.2 and beyond
>
>
>
> Hi everyone,
>
>
>
> As I'm sure many of you are aware I (along with Aizhamal) am giving the
> opening keynote at this year's Airflow summit, and I'm covering "what's
> next after 2.0" -- essentially what is the roadmap for Airflow for the next
> 12-18 months.
>
>
>
> Since Airflow is a community project first and foremost I'd like to get
> all your ideas, no matter how off the wall :)
>
>
>
> I've got my own ideas, and 2.2 is fairly firm already (AIPs 39 and 40),
> but 2.3 and beyond starts to get less clear, so if you have something that
> you'd like to see Airflow be able to do or do better, now is the time to
> speak up.
>
>
>
> You don't have to have a solution, just "I find doing X
> hard/annoying/difficult" is enough.
>
>
>
> (And a general reminder: the roadmap is a statement of intent, not a
> promise of timeline or even that a feature will actually be implemented)
>
>
>
> To keep this thread manageable, please can we avoid discussions _*in this
> thread*_ about ideas and keep +1/me too's to a minimum.
>
>
>
> Cheers,
>
> -ash
>
