Re: [DISCUSSION] AIP-47 New design of Airflow System Tests

Oliveira, Niko Tue, 25 Jan 2022 15:05:37 -0800

Hey folks,


Very excited about this AIP!

I work on the AWS OSS Airflow team. Getting the AWS system tests running has 
been a pet-project of mine for the past little while. I come from a test 
automation background, so this is dear to my heart :)


Currently I have a branch that contains the implementation of the various 
methods (mostly around environment and credential configuration) for AWS system 
tests, but I was running into very obscure runtime issues. So I'm glad to see 
one of the goals of this AIP is to simplify this process, since it is 
convoluted and clunky.


Here are some high level thoughts after my read through the AIP:


1. “CI integration can be built using GitHub CI or provider-related solution“ - 
I'm happy (and excited) to collaborate on this front! I already have a proof of 
concept running (internally to AWS) built with AWS Code Pipeline and Code 
Build, which runs various tests each time my personal fork is updated from 
upstream. But we could easily make something like this public and then connect 
it to the Airflow repo to verify commits, PRs, etc.


2. After reading through the AIP, I still don't think I truly grok the 
SYSTEM_TESTS_ENV_ID environment variable. I understand its use, particularly 
for uniqueness, but who is the owner for setting this? What precisely does it 
represent? Including an example of an actual value for that env variable in the 
AIP would be helpful!

3. The AIP reads: “Maintaining system tests requires knowledge about breeze, 
pytest and maintenance of special files like variables.env or credential files. 
Without those, we will simplify the architecture and improve management over 
tests.”
    The AIP talks a lot about removing this type of setup from the Airflow 
system test platform to simplify things and that "All needed permissions to 
external services for execution of DAGs (tests) should be provided to the 
Airflow instance in advance." and that "Each provider should create an 
instruction explaining how to prepare the environment to run related system 
tests so that users can do it on their own".
   In general the AIP reads as if it's solved this problem, but it's more like 
it has absolved itself from solving this problem, which is much different. I 
think this approach could possibly make things even worse as now there is no 
contract or interface for how to plumb configuration and credentials to the 
system test dags. The current set of methods and files to plumb credentials 
through aren't great (and as of now are quite Google specific) but I think this 
interface can be simplified and improved rather than just exported wholesale 
for each provider to re-invent a new approach.


4. Lastly, regarding the actual construction of the tests themselves, the 
proposed design is full of statements like:
   - "If a teardown task(s) has been defined, remember to add 
trigger_rule="all_done" parameter to the operator call. This will make sure 
that this task will always run even if the upstream fails"
   - “Make sure to include these parameters into DAG call: ...“

   - “Change Airflow Executor to DebugExecutor by placing this line at the top 
of the file: ...”

   - “Try to keep tasks in the DAG body defined in an order of execution.”


   A system that relies on good intentions like "be sure to remember to do X 
otherwise bad thing Y will happen" certainly guaranties that bad thing Y will 
happen, frequently. Humans are fallible and not great at sticking to a contract 
or interface that isn't codified. And this AIP is littered with statements like 
this. We need test infrastructure that's easier to use and will also enforce 
these best practices/requirements which are needed for the tests to run.
In general, it reads much more like a guideline on best practices rather than a 
new and improved system test engine.


Thanks for taking the time to create this AIP I'm very eager to get system 
testing up and running in Airflow and I'd love to collaborate further on it!

Cheers,
Niko


________________________________
From: Jarek Potiuk <[email protected]>
Sent: Tuesday, January 25, 2022 8:57 AM
To: [email protected]
Subject: RE: [EXTERNAL] [DISCUSSION] AIP-47 New design of Airflow System Tests


CAUTION: This email originated from outside of the organization. Do not click 
links or open attachments unless you can confirm the sender and know the 
content is safe.


Just let me add a bit more context as I see it.

The AIP-47 design (following the AIP-4 idea) is targeted to handle those things:

* streamline the execution of the System Tests (which are basically executable 
"example_dags" of ours which we already have in our documentation for providers)
* make them integrated and automatically execute in a "cloud" environment - 
with real cloud services behind
* make use of the 100s of System Tests already written in Google Provider (and 
many others which we have  in in other providers - for example AWS should be 
close to follow)
* test automation of those using Google Cloud Platform donated credits (from 
what I understand Google is willing to sponsor credits to run those tests 
regularly).
* I am quite sure when it works also AWS will donate credits for our CI that we 
could use for that.
* and we should be able to extend it to other providers as well once the "CI 
environment" is in place.

When this one is implemented, we should be finally able not only to run "unit" 
tests with all PRs for a lot of provider code, but also we will be able to 
regularly (not with all PR but likely daily) run the example dags with the real 
services and make sure that at least big part of our Providers code 
(Hooks/Operators) are still "working as expected" - at least that they pass 
"smoke tests".

Currently we rely mostly on the manual tests done when we release providers 
(and our community members are actually great here - as every month we have a 
great deal of help from within the community), but when we have system tests 
automation, the confidence of releasing new versions of providers will go up 
significantly.

I am really looking forward to discussion on this one and implementation and 
deployment for a number of providers.

j.


On Tue, Jan 25, 2022 at 2:13 PM Mateusz Nojek 
<[email protected]<mailto:[email protected]>> wrote:
Hello everyone,

I created a new AIP draft titled AIP-47 New design of Airflow System 
Tests<https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-47+New+design+of+Airflow+System+Tests>.
It is related to the new design of system tests that improves how they are 
written and run which should contribute to higher quality of tests and thus to 
lower quantity of bugs related to the operators. The AIP-47 is located under 
AIP-4<https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+Automation+of+System+Tests+for+external+systems>
 as its direct successor (it more deeply describes some of the concepts). The 
basic target was to simplify tests execution and enable running them more 
regularly.

I was working on it with Eugene Kostieiev and Bartlomiej Hirsz for some time 
and soon we are also going to publish a PR with changes based on the content of 
this AIP.

For now, we would like to start a discussion with the developers and 
contributors of the community of Apache Airflow. The idea was already discussed 
with Jarek Potiuk, the original author of AIP-4, who cooperated with us on some 
of the design details.

Please, tell me what you think about it and share your ideas, concerns and 
comments.

Kind regards,
Mateusz Nojek

Re: [DISCUSSION] AIP-47 New design of Airflow System Tests

Reply via email to