I like this proposal a lot.

I think we are talking about almost the same thing; the only open question
is the technicality of where to automatically generate the summary of the
options.

I think it's a good idea to have a separate summary of all "secrets",
"logs" and "authentication" options which can be directly linked from
the "apache-airflow" docs - I will be super happy with that.
As long as in the "Secrets documentation" we have a section "Here we have
all Airflow Community Secret Backends" - pointing to the separate "Summary
of all Community-provided secrets" - I am perfectly fine :)

What is really missing now is the "Discovery" part: a "Direct Link" from
the place where general "Secrets" are described.
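
To make the "generate stuff automatically" part concrete, here is a rough
sketch of how such a summary could be rendered from provider metadata. It is
only an illustration: it assumes each provider.yaml has already been parsed
into a dict, and that the dict carries a hypothetical "secrets-backends" key
listing class paths. The function name and the sample data are made up for
the example; this is not how the docs build works today.

```python
from typing import Dict, List


def render_secrets_backend_list(providers: List[Dict]) -> str:
    """Render an RST-style bullet list of secret backends.

    `providers` is a list of already-parsed provider.yaml contents.
    The "secrets-backends" key is an assumption made for this sketch.
    """
    lines = []
    for provider in sorted(providers, key=lambda p: p["package-name"]):
        # Providers that do not declare any secret backend are skipped.
        for class_path in sorted(provider.get("secrets-backends", [])):
            lines.append(f" * ``{class_path}`` ({provider['package-name']})")
    return "\n".join(lines)


# Illustrative sample data, not read from real provider.yaml files.
providers = [
    {
        "package-name": "apache-airflow-providers-hashicorp",
        "secrets-backends": [
            "airflow.providers.hashicorp.secrets.vault.VaultBackend",
        ],
    },
    {"package-name": "apache-airflow-providers-sqlite"},
]

print(render_secrets_backend_list(providers))
```

Because the list would be derived from the metadata at docs build time, it
could live in the providers documentation and only be linked from the
"Secrets" page in the core docs.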

J.


On Sun, Aug 15, 2021 at 1:52 AM .... <[email protected]> wrote:

> I understand the user's perspective and that it is currently difficult
> to discover the list of secret backends/task handlers that are
> distributed in provider packages. I just want to point out that
> including this list directly in the apache-airflow documentation
> package will have consequences. I would prefer to explain the
> difference between the two types of integration and redirect the user
> to another page where they can get detailed information.
>
> There are a few problems that I can see from putting this listing
> directly on this page:
> 1. The apache-airflow has a different publishing cycle than the
> provider packages, so it will be out of date.
> 2. Packages for the old version of apache-airflow will contain
> information on the integration set that is known only at the time of
> the release of that version. We can release integrations that will
> still be compatible, but will not be known at the time of the release
> of the apache-airflow version.
> 3. We do not have *.py files on the v2-*-test branch, so we cannot
> verify that the documentation is correct.
> 4. We mix two types of documentation - guides and references. This can
> make this page difficult to understand as well as to find.
>
> What I have in mind is this kind of formula (it shows how the
> secrets page should look, but it should be applied to task handlers in
> similar cases):
>
> apache-airflow/security/secrets/secrets-backend/index.rst
> ############################################
>
> Secret Backends:
> ================
>
> <Paragraph Describe Secret backends in general>
>
> # Available Secret backends
>
> Airflow has a built-in backend, but most of the secret backends are
> distributed independently of it. That means you need to install them
> separately, but it's very easy with pip. This also means that you
> can update a secret backend independently of the Airflow core, or
> use a secret backend that was released after this Airflow version
> was released.
>
> ## Core Airflow Secret backends:
>  * <File backend> - link pointing to it
>
> ## Backends Provided by community-managed providers:
>
> The list of secret backends managed by the community is available in
> the provider packages documentation: :doc:`Secret backend reference
> <apache-airflow-providers: providers>`__
>
>
> ##########################################
>
> apache-airflow-providers/secrets-backend-ref.rst
> ############################################
>
> Secret backends reference
> =========================
>
> Here’s the list of the secret backends which are available in this
> release in provider packages. For general information on secret
> backends, or on the built-in secret backend, see: <LINK TO SECRET BACKEND>
>
>  * <VaultBackend>
>  * <AWSSecretBackend>
>  * <KMSBackend>
>
> ###########################################
>
> The existing page describing the operators is similar to my proposal,
> so you can see it in the wild:
>
> http://airflow.apache.org/docs/apache-airflow/stable/concepts/operators.html#operators
>
> On Sat, Aug 14, 2021 at 19:19 Jarek Potiuk <[email protected]> wrote:
> >
> > > I am concerned about adding information about the content of provider
> > packages in the core documentation as it is very easy to get obsolete
> >
> > I agree we should not put any provider details in "core" . But we should
> at the very least (I think) put links to all the "community" providers that
> implement certain features.
> >
> > This is really a "discoverability" problem, nothing more. I think we -
> long-term committers who know all about Airflow, providers, etc. - are
> overestimating users' knowledge about Airflow internals, and the
> documentation should be there to guide them to learn.
> > There was this - very relevant - comic from XKCD the day before yesterday
> https://xkcd.com/2501/ that shows the mechanism very well.
> >
> > I tried to put myself in the shoes of a new user. Kamil, please try to
> do it as well.
> >
> > When you look at the "logging" or "secrets" section, you are completely
> unaware that you can get AWS, GCP and other integrations provided by the
> community. And there is NOTHING to tell you otherwise. You need to know
> that you should start looking elsewhere - and I want to help the people who
> are looking at the page by giving them the links where they can find it.
> > Essentially, when you do not know Airflow, do not realise that there are
> providers, and do not realise that those providers implement those features,
> you leave with the impression that a lot of stuff is missing.
> >
> > With the current documentation structure, I am afraid people simply do
> not even know that there are community-managed implementations out there.
> >
> > What I have in mind is this kind of formula (it shows how the
> secrets page should look, but it should be applied across the board in
> similar cases):
> >
> >
> > ############################################
> >
> > Secret Backends Page:
> >
> > Paragraph Describe Secret backends in general
> >
> > # Available Secret backends
> >
> > ## Core Airflow Secret backends:
> >  * <File backend> - link pointing to it
> >
> > ## Backends Provided by community-managed providers:
> >  * <VaultBackend>
> >  * <AWSSecretBackend>
> >  * <KMSBackend>
> >
> > ##########################################
> >
> > I am thinking of just links to the appropriate documentation available in
> providers. No more, no less. This could be applied (automatically) to all
> functionalities provided by providers.
> > I think this is safe, can be automated and solves the discoverability
> problem. It does not require extra maintenance.
> >
> >
> > J.
> >
> >
> >
> >
> > On Sat, Aug 14, 2021 at 6:40 PM .... <[email protected]> wrote:
> >>
> >> Commented above
> >>
> >> On Fri, Aug 13, 2021 at 03:48 Jarek Potiuk <[email protected]> wrote:
> >> >
> >>
> >> > * List (and link) available logging options at
> https://airflow.apache.org/docs/apache-airflow/stable/logging-monitoring/logging-tasks.html?highlight=remote%20log#advanced-configuration
> You will not find a list of implemented integrations on this page - you
> should look for details of advanced logging in providers (but it's not at
> all obvious where they are or that they exist at all). There are no links
> to S3/GCS logging configuration/handling and it's not easy to find out where
> you should look for them. Better examples would also be useful.
> >> >
> >> > * Secret Backends page is a bit better -
> https://airflow.apache.org/docs/apache-airflow/stable/security/secrets/secrets-backend/index.html.
> At least it mentions GCP/Hashicorp as "examples" but it misses the AWS one and
> when you go to "Supported Backends" you see only the "Local Filesystem" one.
> I think it is really misleading that you do not have a full list of the
> secret backends available in the community-managed providers.
> >> >
> >>
> >> I am concerned about adding information about the content of provider
> >> packages in the core documentation as it is very easy to get obsolete:
> >> Airflow and the packages have different release cycles, and new
> >> packages are compatible with old Airflow versions. It may therefore
> >> not be obvious that you should look at the latest Airflow
> >> documentation to know the full list of providers even if you are
> >> using a non-latest version of Airflow.
> >>
> >> I think it's worth taking an approach similar to operators, where the
> >> core documentation does not contain the full list of operators from
> >> the provider packages, but only contains a list of operators in the
> >> core, and includes references to the documentation for providers that
> >> includes this list of operators in provider packages.
> >> Here is a reference of all core operators:
> >>
> https://airflow.apache.org/docs/apache-airflow/stable/operators-and-hooks-ref.html
> >> Here is a reference of all operators in providers packages:
> >>
> https://airflow.apache.org/docs/apache-airflow-providers/operators-and-hooks-ref/index.html
> >>
> >> The list of operators in the provider packages is automatically
> >> generated on the basis of provider.yaml files and the correctness of
> >> the files is automatically verified, so we can be sure that the
> >> reference is up-to-date and complete. This also reduces the
> >> maintenance burden of this documentation.
> >>
> >> Adding the secret backends and task handlers to provider.yaml also
> >> means that information about them will be available on the main page
> >> of the project in the "Integrations" section.
> >
> >
> >
> > --
> > +48 660 796 129
>


-- 
+48 660 796 129
