TL;DR: I have a green, ready-for-review PR that should vastly improve
the stability of CI and make our development environment much more
enjoyable: https://github.com/apache/airflow/pull/7091

From the first experiences, it is much more stable on Travis, and we also
have a working version with GitHub Actions, so we should be able to move to
GA shortly.

It brings huge improvements to the CI environment as well as to the local
development one. It will be much easier and more convenient to reproduce
failing CI tests.

This is all documented in TESTING.rst and BREEZE.rst, but here is a summary
of what I came up with.

*Using pytest markers to separate Integration/Kubernetes tests:*

1) Appropriate tests are now marked with the
*pytest.mark.integration("<integration>")* marker (one of *cassandra,
kerberos, mongo, openldap, rabbitmq, redis*). If you do not have an
integration enabled, they will be skipped and the message will clearly say
which integration is needed. You can also run only the tests for a given
integration, for example with *pytest --integration cassandra*, or for all
integrations with *pytest --integration all*.

2) There are some tests that require/work with certain backends only (*sqlite,
mysql, postgres*). They are marked with *pytest.mark.backend("mysql")* or
even *pytest.mark.backend("mysql", "postgres")* if they are supposed to
work with more than one backend. There are ~30 such tests. I converted all
custom skips/skipifs/skipunless to those new markers. As with
integrations, you can run only the tests for a given backend, for example
with *pytest --backend mysql*.

3) There are Kubernetes tests that are marked with
*pytest.mark.runtime("kubernetes")*. I chose a different marker (runtime)
for those because they are completely independent and require a more
fundamental environment change: starting and deploying a Kubernetes
cluster. There are ~30 such tests. Again, you can run only those tests with
*pytest --runtime kubernetes*.

4) By default, Breeze no longer runs any additional
containers/integrations - only base airflow. There are *~4400* tests that
should run in that environment without requiring any integrations (i.e. other
containers) or runtime (i.e. kubernetes). So the vast majority of the tests
will run with this default setup. This makes the whole "Breeze" experience
a lot more enjoyable, as the default setting requires far fewer resources on
your local machine.

5) You can also start Breeze with the *--integration* flag (for example *./breeze
--integration cassandra*) so that the integration is available and you can
run the corresponding integration tests (for example the cassandra tests).

6) You can also start Breeze with *--start-kind-cluster*. This will
start a "Kubernetes in Docker" cluster. With two scripts you can then deploy
Airflow to this cluster and run the Kubernetes tests there.

7) Last but not least - we have more jobs in Travis (soon moving to GitHub
Actions). These jobs should be far more stable, because either they do not
require integrations to be started (so much less memory is needed) or they run
only the integration/kubernetes tests (they run ~30 tests each, so the memory
requirement is also much smaller).
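
To illustrate the three markers from points 1-3, here is a minimal sketch of
how marked tests could look. The test names and bodies are made up for
illustration; only the marker names and their arguments come from the
description above.

```python
import pytest


# Hypothetical test: needs the cassandra integration (container) to be enabled.
@pytest.mark.integration("cassandra")
def test_cassandra_connection():
    pass  # placeholder body; a real test would talk to Cassandra here


# Hypothetical test: supposed to work with more than one backend.
@pytest.mark.backend("mysql", "postgres")
def test_sql_feature():
    pass  # placeholder body; a real test would use backend-specific SQL


# Hypothetical test: needs a deployed Kubernetes cluster.
@pytest.mark.runtime("kubernetes")
def test_pod_launch():
    pass  # placeholder body; a real test would launch a pod
```

With markers like these, *pytest --integration cassandra* would pick the
first test, *pytest --backend mysql* the second and *pytest --runtime
kubernetes* the third.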

We now see much more clearly which tests are skipped and why, and this will
allow us to further clean up our test base.
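
For those curious how such "skipped, and here is why" messages can be
produced, below is a rough sketch of a conftest.py. It is not the code from
the PR: the ENABLED_INTEGRATIONS variable and the helper names are
assumptions made up for this example, and the PR may detect enabled
integrations quite differently.

```python
import os


def enabled_integrations(environ=None):
    """Return the set of integrations available in this environment.

    Assumption: a hypothetical ENABLED_INTEGRATIONS variable holding a
    comma-separated list such as "cassandra,redis".
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("ENABLED_INTEGRATIONS", "")
    return {name.strip() for name in raw.split(",") if name.strip()}


def skip_reason(required, enabled):
    """Return a skip message naming the missing integrations, or None."""
    missing = sorted(set(required) - set(enabled))
    if missing:
        return "Integration(s) not enabled: " + ", ".join(missing)
    return None


def pytest_configure(config):
    # Register the marker so pytest does not complain about an unknown mark.
    config.addinivalue_line(
        "markers", "integration(name): test needs the named integration"
    )


def pytest_runtest_setup(item):
    # Imported here so the helpers above can be used without pytest installed.
    import pytest

    enabled = enabled_integrations()
    for marker in item.iter_markers(name="integration"):
        reason = skip_reason(marker.args, enabled)
        if reason:
            pytest.skip(reason)
```

Running pytest with *-rs* then lists every skipped test together with its
reason in the summary.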

You can see a successful build here:
https://travis-ci.org/apache/airflow/builds/636205535

Looking forward to a prompt review and a quick merge.

J.



-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>