Preparatory PR here: https://github.com/apache/airflow/pull/17625  - this
way we get a list of all secrets/logging handlers in provider.yaml and we
can use them to generate the doc (and provider info will show them too).
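
To illustrate the idea, here is a minimal sketch of generating such a list from provider metadata. It is not the code from the PR: the function name `render_reference`, the sample package names and class paths, and the assumption that each provider.yaml is already parsed into a dict with `package-name`, `secrets-backends` and `logging` keys are all hypothetical and used only for illustration.

```python
from typing import Dict, List

# Hypothetical, already-parsed provider.yaml contents (illustrative only;
# the key names and entries are assumptions, not taken from real providers).
PROVIDERS: List[Dict] = [
    {
        "package-name": "apache-airflow-providers-example-a",
        "secrets-backends": ["example_a.secrets.ExampleABackend"],
        "logging": ["example_a.log.example_a_task_handler"],
    },
    {
        "package-name": "apache-airflow-providers-example-b",
        # This provider ships no secrets backends, so it is skipped below.
        "secrets-backends": [],
    },
]


def render_reference(providers: List[Dict], key: str, title: str) -> str:
    """Render an RST-style page listing every entry found under `key`."""
    lines = [title, "=" * len(title), ""]
    # Sort by package name so the generated page is stable between builds.
    for provider in sorted(providers, key=lambda p: p["package-name"]):
        for entry in provider.get(key) or []:
            lines.append(f"* ``{entry}`` ({provider['package-name']})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_reference(PROVIDERS, "secrets-backends", "Secret backends reference"))
    print()
    print(render_reference(PROVIDERS, "logging", "Task handlers reference"))
```

Because the list is derived from the metadata at build time, the same function can serve both secrets backends and logging handlers, and nothing has to be maintained by hand.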

On Sun, Aug 15, 2021 at 6:00 PM Kaxil Naik <[email protected]> wrote:

> 100% agree with Kamil -- They are fundamentally separate and can get out
> of date as they are published separately.
>
> Kamil's proposal looks good to me.
>
> On Sun, Aug 15, 2021 at 12:52 AM .... <[email protected]> wrote:
>
>> I understand the user's perspective and that it is currently difficult
>> to discover the list of backend secrets/task handlers that are
>> distributed in providers packages. I just want to point out that
>> including this list directly in the apache-airflow documentation
>> package will have consequences. I would prefer to explain the
>> difference between the two types of integration and redirect the user
>> to another page where they can get detailed information.
>>
>> There are a few problems that I can see from putting this listing
>> directly on this page:
>> 1. The apache-airflow has a different publishing cycle than the
>> provider packages, so it will be out of date.
>> 2. Packages for the old version of apache-airflow will contain
>> information on the integration set that is known only at the time of
>> the release of that version. We can release integrations that will
>> still be compatible, but will not be known at the time of the release
>> of the apache-airflow version.
>> 3. We do not have *.py files on the v2-*-test branch, so we cannot
>> verify that the documentation is correct.
>> 4. We mix two types of documentation - guides and references. This can
>> make this page difficult to understand, and also difficult to find.
>>
>> What I have in mind is this kind of structure (it shows how the
>> secrets pages should look, but it should be applied to task handlers
>> in similar cases):
>>
>> apache-airflow/security/secrets/secrets-backend/index.rst
>> ############################################
>>
>> Secret Backends:
>> =============
>>
>> <Paragraph Describe Secret backends in general>
>>
>> # Available Secret backends
>>
>> Airflow has a built-in backend, but most of the secret backends are
>> distributed independently of it. That means you need to install them
>> separately, but this is very easy with pip. This also means that you
>> can update a secret backend independently of the Airflow core, or
>> use a secret backend that was released after this Airflow version
>> was released.
>>
>> ## Core Airflow Secret backends:
>>  * <File backend> - link pointing to it
>>
>> ## Backends Provided by community-managed providers:
>>
>> The list of secret backends managed by the community is available in
>> the providers packages documentation: :doc:`Secret backend reference
>> <apache-airflow-providers: providers>`__
>>
>>
>> ##########################################
>>
>> apache-airflow-providers/secrets-backend-ref.rst
>> ############################################
>>
>> Secret backends reference
>> =====================
>>
>> Here’s the list of the secret backends which are available in this
>> release in providers packages. For general information on secret
>> backends, or the built-in secret backend, see: <LINK TO SECRET BACKEND>
>>
>>  * <VaultBackend>
>>  * <AWSSecretBackend>
>>  * <KMSBackend>
>>
>> ###########################################
>>
>> The existing page describing the operators is similar to my proposal,
>> so you can see it in the wild
>>
>> http://airflow.apache.org/docs/apache-airflow/stable/concepts/operators.html#operators
>>
>> On Sat, Aug 14, 2021 at 19:19 Jarek Potiuk <[email protected]> wrote:
>> >
>> > > I am concerned about adding information about the content of provider
>> > packages in the core documentation as it is very easy to get obsolete
>> >
>> > I agree we should not put any provider details in "core". But we
>> should at the very least (I think) put links to all the "community"
>> providers that implement certain features.
>> >
>> > This is really a "discoverability" problem, nothing more. I think we -
>> long-term committers who know all about Airflow, providers, etc. - are
>> overestimating users' knowledge of Airflow internals - and the
>> documentation should be there to guide them to learn.
>> > There was this - very relevant - comic from XKCD the day before yesterday
>> https://xkcd.com/2501/ that shows the mechanism very well.
>> >
>> > I tried to put myself in the shoes of a new user. Kamil, try to do it
>> as well.
>> >
>> > When you look at the "logging" or "secrets" section, you are completely
>> unaware that you can get AWS, GCP and other integrations provided by the
>> community. And there is NOTHING to tell you otherwise. You need to know
>> that you should start looking elsewhere - and I want to help the people who
>> are looking at the page by giving them links to where they can find it.
>> > Essentially, when you do not know Airflow, do not realise that there are
>> providers, and do not realise that those providers implement those features,
>> you leave with the impression that a lot of stuff is missing.
>> >
>> > With the current documentation structure, I am afraid people simply do
>> not even know that there are community-managed implementations out there.
>> >
>> > What I have in mind is this kind of structure (it shows how the
>> secrets pages should look, but it should be applied across the board in
>> similar cases):
>> >
>> >
>> > ############################################
>> >
>> > Secret Backends Page:
>> >
>> > Paragraph Describe Secret backends in general
>> >
>> > # Available Secret backends
>> >
>> > ## Core Airflow Secret backends:
>> >  * <File backend> - link pointing to it
>> >
>> > ## Backends Provided by community-managed providers:
>> >  * <VaultBackend>
>> >  * <AWSSecretBackend>
>> >  * <KMSBackend>
>> >
>> > ##########################################
>> >
>> > I am thinking of just links to the appropriate documentation available in
>> providers. No more, no less. This could be applied (automatically) to all
>> functionalities provided by providers.
>> > I think this is safe, can be automated, and solves the discoverability
>> problem. It does not require extra maintenance.
>> >
>> >
>> > J.
>> >
>> >
>> >
>> >
>> > On Sat, Aug 14, 2021 at 6:40 PM .... <[email protected]> wrote:
>> >>
>> >> Commented above
>> >>
>> >> On Fri, Aug 13, 2021 at 03:48 Jarek Potiuk <[email protected]> wrote:
>> >> >
>> >>
>> >> > * List (and link) available logging options at
>> https://airflow.apache.org/docs/apache-airflow/stable/logging-monitoring/logging-tasks.html?highlight=remote%20log#advanced-configuration
>> . You will not find a list of implemented integrations on this page - you
>> should look for details of advanced logging in providers (but it's not at
>> all obvious where they are, or that they exist at all). There are no links to S3/GCS
>> logging configuration/handling and it's not easy to find out where you
>> should look for them. Better examples would also be useful.
>> >> >
>> >> > * Secret Backends page is a bit better -
>> https://airflow.apache.org/docs/apache-airflow/stable/security/secrets/secrets-backend/index.html.
>> At least it mentions GCP/HashiCorp as "examples", but it misses the AWS one, and
>> when you go to "Supported Backends" you see only the "Local Filesystem" one.
>> I think it is really misleading that you do not have a full list of secret
>> backends in the community-managed providers.
>> >> >
>> >>
>> >> I am concerned about adding information about the content of provider
>> >> packages in the core documentation as it is very easy to get obsolete
>> >> as Airflow and the packages have a different release cycle and the new
>> >> packages are compatible with the old Airflow versions so it may not be
>> >> obvious that you should be looking at the latest documentation for
>> >> Airflow to know the full list of providers even if you are using a
>> >> non-latest version of Airflow.
>> >>
>> >> I think it's worth taking an approach similar to operators, where the
>> >> core documentation does not contain the full list of operators from
>> >> the provider packages, but only contains a list of operators in the
>> >> core, and includes references to the documentation for providers that
>> >> includes this list of operators in provider packages.
>> >> Here is a reference of all core operators:
>> >>
>> https://airflow.apache.org/docs/apache-airflow/stable/operators-and-hooks-ref.html
>> >> Here is a reference of all operators in providers packages:
>> >>
>> https://airflow.apache.org/docs/apache-airflow-providers/operators-and-hooks-ref/index.html
>> >>
>> >> The list of operators in the providers packages is automatically
>> >> generated on the basis of provider.yaml files, and the correctness of
>> >> those files is automatically verified, so we can be sure that the
>> >> reference is up-to-date and complete. This also reduces the
>> >> maintenance burden of this documentation.
>> >>
>> >> Adding the secrets backends and task handlers to provider.yaml also
>> >> means that information about them will be available on the main page
>> >> of the project in the "Integrations" section.
>> >
>> >
>> >
>> > --
>> > +48 660 796 129
>>
>

-- 
+48 660 796 129
