Hi Jarek,

I like what you've done with the separate integrations. That, coupled with pytest markers and better "import error" handling in the tests, would make it easier to run a subset of the tests without having to install everything (for instance, not having to install the mysql client libs - admittedly less of a worry with Breeze/docker, but it would still be nice to skip/deselect tests when the deps aren't there).

Most of these PRs are merged now. I've glanced over #7091 and like the look of it - good work! You'll let us know when we should take a deeper look?

For the cassandra tests specifically, I'm not sure there is a huge amount of value in actually running them against Cassandra. We are using the official python module for it, and the test basically runs DROP TABLE IF EXISTS, CREATE TABLE, and INSERT INTO TABLE queries and then calls hook.record_exists - that seems like it's testing Cassandra itself, when I think all we should do is test that hook.record_exists calls the execute method on the connection with the right string. I'll knock up a PR for this (rough sketch in the P.S. below). Do we think it's worth keeping the non-mocked/integration tests too?

-ash
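P.S. A minimal sketch of the mocked test I have in mind - untested; the import path, the availability of the stock cassandra_default connection, and the exact CQL that record_exists renders are all assumptions, so check the hook before pinning down the query text:

    from unittest import mock

    # 1.10-era contrib path; the module may have moved on master.
    from airflow.contrib.hooks.cassandra_hook import CassandraHook


    def test_record_exists_calls_execute():
        # Patch out the driver session so no Cassandra server is needed.
        with mock.patch.object(CassandraHook, "get_conn") as mock_get_conn:
            mock_session = mock_get_conn.return_value
            # Pretend execute() found a matching row.
            mock_session.execute.return_value.one.return_value = object()

            # Assumes the default cassandra_default connection from the test db.
            hook = CassandraHook(cassandra_conn_id="cassandra_default")
            assert hook.record_exists("mykeyspace.mytable", {"pk": "foo"})

            # The assertion that matters: the hook delegated to the driver
            # with a query against the right table, without us standing up
            # a live Cassandra server to prove it.
            mock_session.execute.assert_called_once()
            query = mock_session.execute.call_args[0][0]
            assert "mytable" in query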
On Jan 7 2020, at 9:50 am, Jarek Potiuk <[email protected]> wrote:

> *TL;DR; I have a proposal for how we can remedy the often-failing CI tests,
> and I have a kind request to the other committers to help me fix it in a
> good way.*
>
> As we have all noticed, we have recently had some often (far too often)
> failing tests in Travis. The situation is not very good and we have to
> remedy it fairly quickly.
>
> I think we can do it without compromising quality and without temporarily
> disabling some of the tests.
>
> *Root cause of the problem*
> The root cause of the problem seems to be the memory used during tests.
> After adding instafail we know that the tests often fail because there is
> not enough memory on the Travis machines. This is a combination of the way
> our virtual machines are allocated in the Travis infrastructure, the
> growing number of tests, the fact that our tests require a lot of
> "integrations" (running as separate images: cassandra, rabbitmq,
> postgres/mysql databases, etc.), and the fact that we now run them with
> pytest (which apparently uses more memory).
>
> *Proposal*
> One of the proposals on Slack was to get rid of the Cassandra tests and
> disable cassandra temporarily - but I think we can do better, and I can
> get it merged in a day or two and get it sorted out for now (and for the
> future).
>
> I recently wrote an integration test proposal
> <https://lists.apache.org/thread.html/120af497f4adf482162be9583a93651fd206b71db213255f52ad8b7a%40%3Cdev.airflow.apache.org%3E>
> (I will resurrect that thread now) on how we can split our integration
> tests into separate integrations using pytest markers, with support from
> Breeze and our CI testing framework. I already have working code for that
> (a result of my resumed work on the Production Image); most of it is
> already green in Travis and it just needs a review from the other
> committers. At the end of this message I copy an excerpt from the docs on
> how this will work.
>
> Once we have that in, we will have a very easy (and maintainable) approach
> that both saves CI resources and makes Breeze far more usable (and less
> resource-hungry):
>
> - add pytest markers so that we know which tests are "integration" ones
> - start Breeze locally without any external integrations (right now only
> Kerberos is needed) - most of the tests work there, with far less
> resource usage
> - start Breeze easily with *--integration mongo --integration cassandra*
> etc. whenever we need to run tests for that integration
> - run all the "non-integration" tests in CI without the integrations
> started
> - run only the "integration-related" tests in CI with the integrations
> started
> - we will have more jobs in CI, but they should run much more reliably
> and faster in general
> - one of the changes also improves the way we build the kind/kubernetes
> tests, in order to unblock the migration to GitHub Actions that Tomek is
> working on - that might be our ultimate speedup/stabilisation
> - for the curious, there is updated documentation in my PR: "Launching
> Breeze integrations"
> <https://github.com/PolideaInternal/airflow/blob/separate-integrations/BREEZE.rst#launching-breeze-integrations>
> and "Running tests with Kubernetes in Breeze"
> <https://github.com/PolideaInternal/airflow/blob/separate-integrations/BREEZE.rst#running-tests-with-kubernetes-in-breeze>
>
> *PRs - kind request to the other committers*
> I have a series of PRs that already implement almost all of it (I needed
> that in order to implement Production Image support). They depend on each
> other - I added unit-test support for Bash scripts plus several
> improvements and simplifications:
>
> - [AIRFLOW-6489] Add BATS support for Bash unit testing
> <https://github.com/apache/airflow/pull/7081> [ready for review] -
> needed to get more control over the other changes
> - [AIRFLOW-6475] Remove duplication of volume mount specs in Breeze
> <https://github.com/apache/airflow/pull/7065> [ready for review] -
> improves the consistency of how we run Breeze/CI
> - [AIRFLOW-6491] Improve parameter handling in Breeze
> <https://github.com/apache/airflow/pull/7084> [ready for review] - a
> tested and improved way of handling --options in Breeze (needed for the
> Kubernetes improvements)
> - [AIRFLOW-5704] Improve Kind Kubernetes scripts for local testing
> <https://github.com/apache/airflow/pull/6516> [testing] - improves the
> handling of Kubernetes Kind testing (fixes issues with loading images,
> upgrades kind to the latest version)
> - [AIRFLOW-6489] Separate integrations
> <https://github.com/apache/airflow/pull/7091> [WIP] - this is the PR
> introducing the different integrations. It already works for Breeze and
> has support for deciding which integrations should be started - I just
> need to split the "integration tests" out into separate jobs.
>
> I have a kind request to the other committers - can you please take a
> look at those and help merge them quickly?
>
> I also have follow-up PRs for the production image, but the PRs above
> should help us solve the CI crisis in a good way and let me continue on
> the prod-image ones.
>
> J.
>
> --
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129
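For reference, a minimal sketch of what the marker-based split described above could look like in conftest.py - the marker name, the INTEGRATIONS environment variable, and the skip logic are illustrative assumptions, not the actual implementation in #7091:

    import os

    import pytest


    def pytest_configure(config):
        # Register the custom marker so pytest doesn't warn about it.
        config.addinivalue_line(
            "markers",
            "integration(name): test requires the named external integration",
        )


    def pytest_runtest_setup(item):
        # Skip integration-marked tests unless the matching service has been
        # enabled, e.g. via a variable that Breeze/CI would export (assumed name).
        enabled = os.environ.get("INTEGRATIONS", "").split(",")
        for marker in item.iter_markers(name="integration"):
            if marker.args[0] not in enabled:
                pytest.skip("integration {} not enabled".format(marker.args[0]))

A test module would then combine the marker with an import guard, so that missing client libs cause a clean skip rather than a collection error:

    import pytest

    # Skip cleanly when the cassandra-driver client library isn't installed.
    cassandra = pytest.importorskip("cassandra")


    @pytest.mark.integration("cassandra")
    def test_insert_and_record_exists():
        ...  # the current end-to-end test would live here

Plain runs can also deselect the whole group with pytest -m "not integration", while CI jobs that start the services enable them with, for example, INTEGRATIONS=cassandra,mongo.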
