Re: [DISCUSSION] AIP-47 New design of Airflow System Tests

Mateusz Nojek Tue, 01 Feb 2022 01:34:28 -0800

Hello everyone,

Thanks for the questions, Niko! I will try to address them now...

   1. The proposed solution is to have these provider-specific projects for
   running DAGs being set up for system tests execution and use GitHub CI just
   as a trigger for them. This way we will unload the GitHub CI workers and
   stream execution outside of GitHub. AFAIK Google is going to transfer some
   credits to GCP project dedicated for system tests in community. If other
   companies go the same way, we will be able to execute all system tests
   regularly and thus improve quality of Airflow.
   2. SYSTEM_TESTS_ENV_ID is a variable that needs to be set up by the user
   running system tests. It can be a name of the user, some random value or
   whatever. It can be used by scripts (e.g. in the CI process) that will
   create multiple environments for testing specific operators on different
   versions of Airflow, and thanks to this variable we can avoid potential
   collisions of data structures.
   The design purposely avoids setting it automatically because different
   users can have different needs of using it. If it's not the case, then user
   can just put anything there (e.g. user name) and run the tests.
   The instruction will be documented inside the README file for system
   tests.
   3. The main idea of this AIP is that the system tests are self-contained
   and independent of any provider-specific configuration. Each provider may
   require different configuration for the system tests to work and we don't
   want to enforce any general setup that will need to be followed by others.
   Tests need to remain simple so that one can write them without deep diving
   into the complex configuration (like it was before when some specific
   credential files needed to be injected before pytest test). In general,
   tests are independent of the configuration but it's up to the provider to
   prepare configuration instruction for their system tests to work
   (preferably in some README file). It's up to the provider how simple (or
   complex) configuration will be.
   4. The simplicity of system tests is based on the fact that they are
   just DAGs and they look like (old) example dags. Thanks to that we don't
   need to separatly maintain these DAGs (that serve also as part of the
   automatically generated documentation) and system tests, but we have it all
   as a one. We are aware that nobody's perfect and people do mistakes but at
   the same time we believe that users have the best knowledge how they want
   to use system tests. That's why it's important to have short but very
   specific documentation with instructions how to run these tests.
   Regarding the parts that sound more like best practices instead of
   strict rules... Enforcing users to use one-and-only way to configure tests
   will be very limiting. It is very hard to predict all possible users'
   behaviors, thus we cannot predict e.g. which task will be teardown and
   which not - it's definitely up to the test designer. The good news is that
   we can support users in a way that we will create pre-commit hooks that
   will check if the most significant parts are included in the test file
   (e.g. pytest wrapper or DebugExecutor declaration). Also we can prepare a
   script that will generate dummy system test file with all these parts and
   some comments so that user will just need to fill these parts according to
   his/her needs (something like invoke <https://www.pyinvoke.org/>script).

Thanks again for the questions and I hope it clarifies the things. If not,
we are happy to answer further comments and questions.

Kind regards,
Mateusz Nojek

śr., 26 sty 2022 o 00:05 Oliveira, Niko <[email protected]>
napisał(a):

> Hey folks,
>
>
> Very excited about this AIP!
>
>
> I work on the AWS OSS Airflow team. Getting the AWS system tests running
> has been a pet-project of mine for the past little while. I come from a
> test automation background, so this is dear to my heart :)
>
>
> Currently I have a branch that contains the implementation of the various
> methods (mostly around environment and credential configuration) for AWS
> system tests, but I was running into very obscure runtime issues. So I'm
> glad to see one of the goals of this AIP is to simplify this process, since
> it is convoluted and clunky.
>
>
> Here are some high level thoughts after my read through the AIP:
>
>
> 1. “*CI integration can be built using GitHub CI or provider-related
> solution*“ - I'm happy (and excited) to collaborate on this front! I
> already have a proof of concept running (internally to AWS) built with AWS
> Code Pipeline and Code Build, which runs various tests each time my
> personal fork is updated from upstream. But we could easily make something
> like this public and then connect it to the Airflow repo to verify commits,
> PRs, etc.
>
>
> 2. After reading through the AIP, I still don't think I truly grok the
> SYSTEM_TESTS_ENV_ID environment variable. I understand its *use*,
> particularly for uniqueness, but who is the owner for setting this? What
> precisely does it represent? Including an example of an actual value for
> that env variable in the AIP would be helpful!
>
> 3. The AIP reads: “*Maintaining system tests requires knowledge about
> breeze, pytest and maintenance of special files like variables.env or
> credential files. Without those, we will simplify the architecture and
> improve management over tests.*”
>     The AIP talks a lot about removing this type of setup from the Airflow
> system test platform to simplify things and that "*All needed permissions
> to external services for execution of DAGs (tests) should be provided to
> the Airflow instance in advance.*" and that "*Each provider should create
> an instruction explaining how to prepare the environment to run related
> system tests so that users can do it on their own*".
>    In general the AIP reads as if it's solved this problem, but it's more
> like it has absolved itself from solving this problem, which is much
> different. I think this approach could possibly make things even worse as
> now there is no contract or interface for how to plumb configuration and
> credentials to the system test dags. The current set of methods and files
> to plumb credentials through aren't great (and as of now are quite Google
> specific) but I think this interface can be simplified and improved rather
> than just exported wholesale for each provider to re-invent a new approach.
>
>
> 4. Lastly, regarding the actual construction of the tests themselves, the
> proposed design is full of statements like:
>    - "*If a teardown task(s) has been defined, remember to add
> trigger_rule="all_done" parameter to the operator call. This will make sure
> that this task will always run even if the upstream fails*"
>    - “*Make sure to include these parameters into DAG call: ...*“
>
>    - “*Change Airflow Executor to DebugExecutor by placing this line at
> the top of the file: ...*”
>
>    - “*Try to keep tasks in the DAG body defined in an order of
> execution.*”
>
>
>    A system that relies on good intentions like "be sure to remember to do
> X otherwise bad thing Y will happen" certainly guaranties that bad thing Y
> will happen, frequently. Humans are fallible and not great at sticking to a
> contract or interface that isn't codified. And this AIP is littered with
> statements like this. We need test infrastructure that's easier to use and
> will also enforce these best practices/requirements which are needed for
> the tests to run.
> In general, it reads much more like a guideline on best practices rather
> than a new and improved system test engine.
>
>
>
> Thanks for taking the time to create this AIP I'm very eager to get system
> testing up and running in Airflow and I'd love to collaborate further on it!
>
> Cheers,
> Niko
>
>
> ------------------------------
> *From:* Jarek Potiuk <[email protected]>
> *Sent:* Tuesday, January 25, 2022 8:57 AM
> *To:* [email protected]
> *Subject:* RE: [EXTERNAL] [DISCUSSION] AIP-47 New design of Airflow
> System Tests
>
>
> *CAUTION*: This email originated from outside of the organization. Do not
> click links or open attachments unless you can confirm the sender and know
> the content is safe.
>
> Just let me add a bit more context as I see it.
>
> The AIP-47 design (following the AIP-4 idea) is targeted to handle those
> things:
>
> * streamline the execution of the System Tests (which are basically
> executable "example_dags" of ours which we already have in our
> documentation for providers)
> * make them integrated and automatically execute in a "cloud" environment
> - with real cloud services behind
> * make use of the 100s of System Tests already written in Google Provider
> (and many others which we have  in in other providers - for example AWS
> should be close to follow)
> * test automation of those using Google Cloud Platform donated credits
> (from what I understand Google is willing to sponsor credits to run those
> tests regularly).
> * I am quite sure when it works also AWS will donate credits for our CI
> that we could use for that.
> * and we should be able to extend it to other providers as well once the
> "CI environment" is in place.
>
> When this one is implemented, we should be finally able not only to run
> "unit" tests with all PRs for a lot of provider code, but also we will be
> able to regularly (not with all PR but likely daily) run the example dags
> with the real services and make sure that at least big part of our
> Providers code (Hooks/Operators) are still "working as expected" - at least
> that they pass "smoke tests".
>
> Currently we rely mostly on the manual tests done when we release
> providers (and our community members are actually great here - as every
> month we have a great deal of help from within the community), but when we
> have system tests automation, the confidence of releasing new versions of
> providers will go up significantly.
>
> I am really looking forward to discussion on this one and implementation
> and deployment for a number of providers.
>
> j.
>
>
> On Tue, Jan 25, 2022 at 2:13 PM Mateusz Nojek <[email protected]> wrote:
>
>> Hello everyone,
>>
>> I created a new AIP draft titled AIP-47 New design of Airflow System
>> Tests
>> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-47+New+design+of+Airflow+System+Tests>
>> .
>> It is related to the new design of system tests that improves how they
>> are written and run which should contribute to higher quality of tests and
>> thus to lower quantity of bugs related to the operators. The AIP-47 is
>> located under AIP-4
>> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+Automation+of+System+Tests+for+external+systems>
>> as its direct successor (it more deeply describes some of the concepts).
>> The basic target was to simplify tests execution and enable running them
>> more regularly.
>>
>> I was working on it with Eugene Kostieiev and Bartłomiej Hirsz for some
>> time and soon we are also going to publish a PR with changes based on the
>> content of this AIP.
>>
>> For now, we would like to start a discussion with the developers and
>> contributors of the community of Apache Airflow. The idea was already
>> discussed with Jarek Potiuk, the original author of AIP-4, who cooperated
>> with us on some of the design details.
>>
>> Please, tell me what you think about it and share your ideas, concerns
>> and comments.
>>
>> Kind regards,
>> Mateusz Nojek
>>
>

Re: [DISCUSSION] AIP-47 New design of Airflow System Tests

Reply via email to