Hey folks,
Very excited about this AIP!
I work on the AWS OSS Airflow team. Getting the AWS system tests running has
been a pet-project of mine for the past little while. I come from a test
automation background, so this is dear to my heart :)
Currently I have a branch that contains the implementation of the various
methods (mostly around environment and credential configuration) for AWS system
tests, but I was running into very obscure runtime issues. So I'm glad to see
one of the goals of this AIP is to simplify this process, since it is
convoluted and clunky.
Here are some high level thoughts after my read through the AIP:
1. “CI integration can be built using GitHub CI or provider-related solution“ -
I'm happy (and excited) to collaborate on this front! I already have a proof of
concept running (internally to AWS) built with AWS Code Pipeline and Code
Build, which runs various tests each time my personal fork is updated from
upstream. But we could easily make something like this public and then connect
it to the Airflow repo to verify commits, PRs, etc.
2. After reading through the AIP, I still don't think I truly grok the
SYSTEM_TESTS_ENV_ID environment variable. I understand its use, particularly
for uniqueness, but who is the owner for setting this? What precisely does it
represent? Including an example of an actual value for that env variable in the
AIP would be helpful!
3. The AIP reads: “Maintaining system tests requires knowledge about breeze,
pytest and maintenance of special files like variables.env or credential files.
Without those, we will simplify the architecture and improve management over
tests.”
The AIP talks a lot about removing this type of setup from the Airflow
system test platform to simplify things and that "All needed permissions to
external services for execution of DAGs (tests) should be provided to the
Airflow instance in advance." and that "Each provider should create an
instruction explaining how to prepare the environment to run related system
tests so that users can do it on their own".
In general the AIP reads as if it's solved this problem, but it's more like
it has absolved itself from solving this problem, which is much different. I
think this approach could possibly make things even worse as now there is no
contract or interface for how to plumb configuration and credentials to the
system test dags. The current set of methods and files to plumb credentials
through aren't great (and as of now are quite Google specific) but I think this
interface can be simplified and improved rather than just exported wholesale
for each provider to re-invent a new approach.
4. Lastly, regarding the actual construction of the tests themselves, the
proposed design is full of statements like:
- "If a teardown task(s) has been defined, remember to add
trigger_rule="all_done" parameter to the operator call. This will make sure
that this task will always run even if the upstream fails"
- “Make sure to include these parameters into DAG call: ...“
- “Change Airflow Executor to DebugExecutor by placing this line at the top
of the file: ...”
- “Try to keep tasks in the DAG body defined in an order of execution.”
A system that relies on good intentions like "be sure to remember to do X
otherwise bad thing Y will happen" certainly guaranties that bad thing Y will
happen, frequently. Humans are fallible and not great at sticking to a contract
or interface that isn't codified. And this AIP is littered with statements like
this. We need test infrastructure that's easier to use and will also enforce
these best practices/requirements which are needed for the tests to run.
In general, it reads much more like a guideline on best practices rather than a
new and improved system test engine.
Thanks for taking the time to create this AIP I'm very eager to get system
testing up and running in Airflow and I'd love to collaborate further on it!
Cheers,
Niko
________________________________
From: Jarek Potiuk <[email protected]>
Sent: Tuesday, January 25, 2022 8:57 AM
To: [email protected]
Subject: RE: [EXTERNAL] [DISCUSSION] AIP-47 New design of Airflow System Tests
CAUTION: This email originated from outside of the organization. Do not click
links or open attachments unless you can confirm the sender and know the
content is safe.
Just let me add a bit more context as I see it.
The AIP-47 design (following the AIP-4 idea) is targeted to handle those things:
* streamline the execution of the System Tests (which are basically executable
"example_dags" of ours which we already have in our documentation for providers)
* make them integrated and automatically execute in a "cloud" environment -
with real cloud services behind
* make use of the 100s of System Tests already written in Google Provider (and
many others which we have in in other providers - for example AWS should be
close to follow)
* test automation of those using Google Cloud Platform donated credits (from
what I understand Google is willing to sponsor credits to run those tests
regularly).
* I am quite sure when it works also AWS will donate credits for our CI that we
could use for that.
* and we should be able to extend it to other providers as well once the "CI
environment" is in place.
When this one is implemented, we should be finally able not only to run "unit"
tests with all PRs for a lot of provider code, but also we will be able to
regularly (not with all PR but likely daily) run the example dags with the real
services and make sure that at least big part of our Providers code
(Hooks/Operators) are still "working as expected" - at least that they pass
"smoke tests".
Currently we rely mostly on the manual tests done when we release providers
(and our community members are actually great here - as every month we have a
great deal of help from within the community), but when we have system tests
automation, the confidence of releasing new versions of providers will go up
significantly.
I am really looking forward to discussion on this one and implementation and
deployment for a number of providers.
j.
On Tue, Jan 25, 2022 at 2:13 PM Mateusz Nojek
<[email protected]<mailto:[email protected]>> wrote:
Hello everyone,
I created a new AIP draft titled AIP-47 New design of Airflow System
Tests<https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-47+New+design+of+Airflow+System+Tests>.
It is related to the new design of system tests that improves how they are
written and run which should contribute to higher quality of tests and thus to
lower quantity of bugs related to the operators. The AIP-47 is located under
AIP-4<https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+Automation+of+System+Tests+for+external+systems>
as its direct successor (it more deeply describes some of the concepts). The
basic target was to simplify tests execution and enable running them more
regularly.
I was working on it with Eugene Kostieiev and Bartlomiej Hirsz for some time
and soon we are also going to publish a PR with changes based on the content of
this AIP.
For now, we would like to start a discussion with the developers and
contributors of the community of Apache Airflow. The idea was already discussed
with Jarek Potiuk, the original author of AIP-4, who cooperated with us on some
of the design details.
Please, tell me what you think about it and share your ideas, concerns and
comments.
Kind regards,
Mateusz Nojek