> I like what you've done with the separate integrations, and that coupled
> with pytest markers and better "import error" handling in the tests would
> make it easier to run a sub-set of the tests without having to install
> everything (for instance not having to install mysql client libs.

Cool. That's exactly what I am working on in
https://github.com/apache/airflow/pull/7091 -> I want to get all the tests
run in integration-less CI, select all those that failed and treat them
appropriately.
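The "import error" handling part can be pretty lightweight - something
along these lines (a sketch only; the `mysql` marker name and the test
name are placeholders, not necessarily what #7091 will end up with):

    import pytest

    # Skip the whole module at collection time when the MySQL client
    # library is not installed, instead of failing with an ImportError.
    MySQLdb = pytest.importorskip("MySQLdb")

    @pytest.mark.mysql  # placeholder marker name
    def test_mysql_hook_record_exists():
        ...

with the marker registered in pytest.ini:

    [pytest]
    markers =
        mysql: test requires the MySQL client libraries

That way `pytest -m "not mysql"` also deselects these tests explicitly,
even on machines where the client libs happen to be installed.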
> Admittedly less of a worry with breeze/docker, but still would be nice to
> skip/deselect tests when deps aren't there)

Yeah. For me it's the same. We recently had a few discussions with
first-time users who find it difficult to contribute because they do not
know how to reproduce failing CI reliably locally. I think the resource
requirements of the Breeze environment for simple tests were a big
blocker/difficulty for some users, so slimming it down and making it
integration-less by default will be really helpful. I will also make it
the "default" way of reproducing tests - I will remove the separate bash
scripts, which were an intermediate step. This is the same work, especially
since I use the same mechanism, and ... well - it will be far easier for me
to have integration-specific cases working in CI if I also have Breeze
supporting it (eating my own dog food).

> Most of these PRs are merged now, I've glanced over #7091 and like the
> look of it, good work! You'll let us know when we should take a deeper
> look?

Yep, I will. I hope today/tomorrow - most of it is ready. I also managed to
VASTLY simplify running kubernetes kind (one less docker image; everything
runs in the same docker engine as airflow-testing itself) in
https://github.com/apache/airflow/pull/6516, which is a prerequisite for
#7091 - so both will need to be reviewed. I marke

> For cassandra tests specifically I'm not sure there is a huge amount of
> value in actually running the tests against cassandra -- we are using the
> official python module for it, and the test is basically running these
> queries - DROP TABLE IF EXISTS, CREATE TABLE, INSERT INTO TABLE, and then
> running hook.record_exists -- that seems like it's testing cassandra
> itself, when I think all we should do is test that hook.record_exists
> calls the execute method on the connection with the right string. I'll
> knock up a PR for this.
>
> Do we think it's worth keeping the non-mocked/integration tests too?

I would not remove them just yet. Let's see how it works when I separate
them out (a mocked-out version could look like the sketch in the PS below).
I have a feeling that we have very few of those integration tests overall,
so maybe they will be stable and fast enough when we only run them in a
separate job. I think it's good to have different levels of tests
(unit/integration/system), as they find different types of problems. As
long as we can keep integration/system tests clearly separated, stable,
and easy to disable/enable - I am all for having different types of tests.
There is this old and well-established concept of the Test Pyramid
https://martinfowler.com/bliki/TestPyramid.html which applies very
accurately to our case. By adding markers, categorising the tests, and
seeing how many of those tests we have, how stable they are, how long they
take, and (eventually) how much they cost us - we can make better
decisions.

J.
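PS: The mocked-out cassandra test discussed above could look roughly like
this (a sketch only - the contrib import path, the `keyspace` attribute,
and the exact CQL that record_exists builds are assumptions to verify
against the real hook):

    from unittest import mock

    from airflow.contrib.hooks.cassandra_hook import CassandraHook

    def test_record_exists_calls_execute_with_expected_cql():
        # Bypass __init__, which would otherwise look up a real Airflow
        # connection and build a Cluster object.
        with mock.patch.object(CassandraHook, "__init__", return_value=None):
            hook = CassandraHook()
        hook.keyspace = "mykeyspace"  # normally taken from the connection

        session = mock.Mock()
        session.execute.return_value.one.return_value = object()  # row found
        hook.get_conn = mock.Mock(return_value=session)

        assert hook.record_exists("mytable", {"pk": "foo"})

        # The actual point: the right CQL reached the driver, and no
        # running cassandra was needed for it.
        cql = session.execute.call_args[0][0]
        assert "SELECT" in cql
        assert "mytable" in cql

The non-mocked variant could then carry e.g. @pytest.mark.integration, so
the separate job selects it with `pytest -m integration` while the default
run deselects it.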
