Good point. Why not :)

On Sun, Aug 15, 2021 at 8:18 PM Kaxil Naik <[email protected]> wrote:
> We should probably add API Auth Backends too?
>
> On Sun, Aug 15, 2021 at 6:59 PM Jarek Potiuk <[email protected]> wrote:
>
>> Preparatory PR here: https://github.com/apache/airflow/pull/17625 -
>> this way we get a list of all secrets/logging handlers in provider.yaml and
>> we can use them to generate the doc (and provider info will show them too).
>>
>> On Sun, Aug 15, 2021 at 6:00 PM Kaxil Naik <[email protected]> wrote:
>>
>>> 100% agree with Kamil -- They are fundamentally separate and can get out
>>> of date as they are published separately.
>>>
>>> Kamil's proposal looks good to me.
>>>
>>> On Sun, Aug 15, 2021 at 12:52 AM .... <[email protected]> wrote:
>>>
>>>> I understand the user's perspective and that it is currently difficult
>>>> to discover the list of secrets backends/task handlers that are
>>>> distributed in provider packages. I just want to point out that
>>>> including this list directly in the apache-airflow documentation
>>>> package will have consequences. I would prefer to explain the
>>>> difference between the two types of integration and redirect the user
>>>> to another page where they can get detailed information.
>>>>
>>>> There are a few problems that I can see with putting this listing
>>>> directly on this page:
>>>> 1. apache-airflow has a different publishing cycle than the
>>>> provider packages, so the list will get out of date.
>>>> 2. Packages for an old version of apache-airflow will contain
>>>> information only on the set of integrations known at the time of
>>>> the release of that version. We can release integrations that will
>>>> still be compatible, but will not be known at the time of the release
>>>> of that apache-airflow version.
>>>> 3. We do not have *.py files on the v2-*-test branch, so we cannot
>>>> verify that the documentation is correct.
>>>> 4. We mix two types of documentation - guides and references. This can
>>>> make the page difficult to understand and difficult to find.
>>>>
>>>> What I have in mind is this kind of structure (it shows how the
>>>> secrets page should look, but the same should be applied to task
>>>> handlers in similar cases):
>>>>
>>>> apache-airflow/security/secrets/secrets-backend/index.rst
>>>> ############################################
>>>>
>>>> Secret Backends:
>>>> ================
>>>>
>>>> <Paragraph Describe Secret backends in general>
>>>>
>>>> # Available Secret backends
>>>>
>>>> Airflow has a built-in backend, but most of the secrets backends are
>>>> distributed independently of it. That means you need to install them
>>>> separately, but that is very easy with pip. This also means that you
>>>> can update a secret backend independently of the Airflow core, or
>>>> use a secret backend that was released after this Airflow version
>>>> was released.
>>>>
>>>> ## Core Airflow Secret backends:
>>>> * <File backend> - link pointing to it
>>>>
>>>> ## Backends Provided by community-managed providers:
>>>>
>>>> The list of secrets backends managed by the community is available in
>>>> the providers packages documentation: :doc:`Secret backend reference
>>>> <apache-airflow-providers: providers>`
>>>>
>>>>
>>>> ##########################################
>>>>
>>>> apache-airflow-providers/secrets-backend-ref.rst
>>>> ############################################
>>>>
>>>> Secret backends reference
>>>> =========================
>>>>
>>>> Here’s the list of the secret backends which are available in this
>>>> release in providers packages.
>>>> For general information on secret
>>>> backends, or the built-in secret backend, see: <LINK TO SECRET BACKEND>
>>>>
>>>> * <VaultBackend>
>>>> * <AWSSecretBackend>
>>>> * <KMSBackend>
>>>>
>>>> ###########################################
>>>>
>>>> The existing page describing the operators is similar to my proposal,
>>>> so you can see it in the wild:
>>>>
>>>> http://airflow.apache.org/docs/apache-airflow/stable/concepts/operators.html#operators
>>>>
>>>> On Sat, Aug 14, 2021 at 7:19 PM Jarek Potiuk <[email protected]> wrote:
>>>> >
>>>> > > I am concerned about adding information about the content of
>>>> > > provider packages in the core documentation as it is very easy
>>>> > > to get obsolete
>>>> >
>>>> > I agree we should not put any provider details in "core". But we
>>>> > should at the very least (I think) put links to all the "community"
>>>> > providers that implement certain features.
>>>> >
>>>> > This is really a "discoverability" problem, nothing more. I think we
>>>> > - long-term committers who know all about Airflow, providers, etc. -
>>>> > are overestimating users' knowledge about Airflow internals - and the
>>>> > documentation should be there to guide them to learn.
>>>> > There was this - very relevant - comic from XKCD the day before
>>>> > yesterday, https://xkcd.com/2501/, that shows the mechanism very well.
>>>> >
>>>> > I tried to put myself in the shoes of a new user. Try to do it as
>>>> > well, Kamil.
>>>> >
>>>> > When you look at the "logging" or "secrets" section, you are
>>>> > completely unaware that you can get AWS, GCP and other integrations
>>>> > provided by the community. And there is NOTHING to tell you otherwise.
>>>> > You need to know that you should start looking elsewhere - and I want
>>>> > to help the people who are looking at the page by giving them links to
>>>> > where they can find it.
>>>> > Essentially, when you do not know Airflow, do not realise that there
>>>> > are providers, and do not realise that those providers implement those
>>>> > features, you leave with the impression that a lot of stuff is missing.
>>>> >
>>>> > With the current documentation structure, I am afraid people simply
>>>> > do not even know that there are community-managed implementations out
>>>> > there.
>>>> >
>>>> > What I have in mind is this kind of structure (it shows how the
>>>> > secrets page should look, but the same should be applied across the
>>>> > board in similar cases):
>>>> >
>>>> >
>>>> > ############################################
>>>> >
>>>> > Secret Backends Page:
>>>> >
>>>> > Paragraph Describe Secret backends in general
>>>> >
>>>> > # Available Secret backends
>>>> >
>>>> > ## Core Airflow Secret backends:
>>>> > * <File backend> - link pointing to it
>>>> >
>>>> > ## Backends Provided by community-managed providers:
>>>> > * <VaultBackend>
>>>> > * <AWSSecretBackend>
>>>> > * <KMSBackend>
>>>> >
>>>> > ##########################################
>>>> >
>>>> > I am thinking of just links to the appropriate documentation available
>>>> > in providers. No more, no less. This could be applied (automatically)
>>>> > to all functionalities provided by providers.
>>>> > I think this is safe, can be automated and solves the discoverability
>>>> > problem. It does not require extra maintenance.
>>>> >
>>>> >
>>>> > J.
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Sat, Aug 14, 2021 at 6:40 PM ....
>>>> > <[email protected]> wrote:
>>>> >>
>>>> >> Commented above
>>>> >>
>>>> >> On Fri, Aug 13, 2021 at 3:48 AM Jarek Potiuk <[email protected]> wrote:
>>>> >> >
>>>> >> > * List (and link) available logging options at
>>>> >> > https://airflow.apache.org/docs/apache-airflow/stable/logging-monitoring/logging-tasks.html?highlight=remote%20log#advanced-configuration.
>>>> >> > You will not find a list of implemented integrations on this page -
>>>> >> > you should look for details of advanced logging in providers (but it
>>>> >> > is not at all obvious where, or that they exist at all). There are no
>>>> >> > links to S3/GCS logging configuration/handling and it's not easy to
>>>> >> > find out where you should look for them. Better examples would also
>>>> >> > be useful.
>>>> >> >
>>>> >> > * Secret Backends page is a bit better -
>>>> >> > https://airflow.apache.org/docs/apache-airflow/stable/security/secrets/secrets-backend/index.html.
>>>> >> > At least it mentions GCP/Hashicorp as "examples", but it misses the
>>>> >> > AWS one, and when you go to "Supported Backends" you see only the
>>>> >> > "Local Filesystem" one. I think it is really misleading that you do
>>>> >> > not have a full list of secret backends in the community-managed
>>>> >> > providers.
>>>> >> >
>>>> >>
>>>> >> I am concerned about adding information about the content of provider
>>>> >> packages in the core documentation, as it can very easily become
>>>> >> obsolete: Airflow and the packages have different release cycles, and
>>>> >> new packages are compatible with old Airflow versions, so it may not
>>>> >> be obvious that you should be looking at the latest documentation for
>>>> >> Airflow to know the full list of providers even if you are using a
>>>> >> non-latest version of Airflow.
>>>> >>
>>>> >> I think it's worth taking an approach similar to operators, where the
>>>> >> core documentation does not contain the full list of operators from
>>>> >> the provider packages, but only contains a list of operators in the
>>>> >> core, and includes references to the documentation for providers that
>>>> >> includes the list of operators in provider packages.
>>>> >> Here is a reference of all core operators:
>>>> >> https://airflow.apache.org/docs/apache-airflow/stable/operators-and-hooks-ref.html
>>>> >> Here is a reference of all operators in providers packages:
>>>> >> https://airflow.apache.org/docs/apache-airflow-providers/operators-and-hooks-ref/index.html
>>>> >>
>>>> >> The list of operators in the provider packages is automatically
>>>> >> generated on the basis of provider.yaml files, and the correctness of
>>>> >> these files is automatically verified, so we can be sure that the
>>>> >> reference is up-to-date and complete. This also reduces the
>>>> >> maintenance burden of this documentation.
>>>> >>
>>>> >> Adding the secrets backends and task handlers to provider.yaml also
>>>> >> means that information about them will be available on the main page
>>>> >> of the project in the "Integrations" section.
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > +48 660 796 129
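
For readers following the thread, here is a minimal sketch of the generation step discussed above: build the provider-side reference page from per-provider metadata, so the core docs only need to link to it. Everything in it is an assumption for illustration. The `PROVIDERS` list stands in for already-parsed provider.yaml files, the `secrets-backends` and `logging` key names are guesses at how such entries might be named, the package and class paths are made up, and `render_reference` is a hypothetical helper, not part of Airflow's docs tooling.

```python
"""Sketch: render an RST reference list from provider metadata."""

# Hypothetical stand-in for parsed provider.yaml files. The key names
# ("secrets-backends", "logging") and every package/class path below are
# assumptions for illustration, not real Airflow provider contents.
PROVIDERS = [
    {
        "package-name": "apache-airflow-providers-example-cloud",
        "name": "Example Cloud",
        "secrets-backends": ["example_cloud.secrets.vault.ExampleVaultBackend"],
        "logging": ["example_cloud.log.bucket_task_handler"],
    },
    {
        # A provider that ships no secrets backends or log handlers.
        "package-name": "apache-airflow-providers-other",
        "name": "Other",
    },
]


def render_reference(providers, key, title):
    """Return RST text listing every entry stored under `key`, per provider."""
    lines = [title, "=" * len(title), ""]
    for provider in sorted(providers, key=lambda p: p["name"]):
        entries = provider.get(key, [])
        if not entries:
            # Providers without this kind of integration are skipped.
            continue
        lines.append(provider["name"])
        lines.append("-" * len(provider["name"]))
        lines.append("")
        for entry in sorted(entries):
            lines.append(f"* ``{entry}`` (package ``{provider['package-name']}``)")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_reference(PROVIDERS, "secrets-backends", "Secret backends reference"))
```

Because the page is rendered from provider metadata at docs build time, the same helper could be called with `"logging"` for task handlers, and the core documentation would only need to carry a link to the generated page.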
