Hi Jarek,

I like what you've done with the separate integrations. That, coupled with
pytest markers and better "import error" handling in the tests, would make it
easier to run a subset of the tests without having to install everything (for
instance, not having to install the mysql client libs; admittedly less of a
worry with breeze/docker, but it would still be nice to skip/deselect tests
when the deps aren't there).
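
To illustrate what I mean (a sketch only; the marker name and module layout
here are hypothetical, not what's actually in the tree): a test module could
carry an integration marker and skip itself cleanly when the client libs are
missing, instead of blowing up with an ImportError at collection time:

```python
import importlib.util

import pytest

# True only when the optional mysql driver is importable on this machine.
HAS_MYSQL = importlib.util.find_spec("MySQLdb") is not None

# Module-level marks: tag every test here as a mysql integration test
# (hypothetical marker name) and skip them all when the driver is absent.
pytestmark = [
    pytest.mark.integration("mysql"),
    pytest.mark.skipif(not HAS_MYSQL, reason="mysql client libs not installed"),
]


def test_mysql_hook_runs_query():
    import MySQLdb  # safe: the whole module is skipped when this is missing
    ...
```

Deselecting would then just be `pytest -m "not integration"`.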
Most of these PRs are merged now. I've glanced over #7091 and like the look of
it; good work! You'll let us know when we should take a deeper look?
For the cassandra tests specifically, I'm not sure there is a huge amount of
value in actually running them against cassandra. We are using the official
python driver, and the test basically runs DROP TABLE IF EXISTS, CREATE TABLE
and INSERT INTO TABLE, and then calls hook.record_exists. That seems like it's
testing cassandra itself, when I think all we should test is that
hook.record_exists calls the execute method on the connection with the right
string. I'll knock up a PR for this.
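
Roughly the shape I have in mind (a toy sketch: FakeCassandraHook and its
query string are made up for illustration, not the real CassandraHook; the
point is asserting on what gets passed to execute, with no cassandra running):

```python
from unittest import mock


# Toy stand-in for the hook under test, made up for this sketch.
class FakeCassandraHook:
    def __init__(self, session):
        self.session = session

    def record_exists(self, table, keys):
        # "keyspace.table" convention, query built from the key names
        keyspace, _, table = table.partition(".")
        query = "SELECT * FROM {}.{} WHERE {}".format(
            keyspace, table, " AND ".join("{k}=%({k})s".format(k=k) for k in keys)
        )
        return self.session.execute(query, keys).one() is not None


# The test mocks the session and asserts on the exact query string.
def test_record_exists_builds_expected_query():
    session = mock.Mock()
    session.execute.return_value.one.return_value = object()
    hook = FakeCassandraHook(session)

    assert hook.record_exists("ks.t", {"pk": 1})
    session.execute.assert_called_once_with(
        "SELECT * FROM ks.t WHERE pk=%(pk)s", {"pk": 1}
    )
```

That keeps the coverage of the query-building logic without needing a
cassandra image at all.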
Do we think it's worth keeping the non-mocked/integration tests too?
-ash

On Jan 7 2020, at 9:50 am, Jarek Potiuk <[email protected]> wrote:
> *TL;DR; I have a proposal for how we can remedy the often-failing CI tests,
> and a kind request to the other committers to help me fix it in a good way.*
>
> As we have all noticed, we have recently had some often (far too often)
> failing tests on Travis. The situation is not very good and we have to
> remedy it fairly quickly.
>
> I think we can do it without compromising quality and without temporarily
> disabling some of the tests.
>
> *Root cause of the problem*
> The root cause of the problem seems to be the memory used during tests.
> After adding instafail we know that the tests often fail because there is
> not enough memory on the Travis machines. This is a combination of the way
> our virtual machines are allocated in the Travis infrastructure, the ever
> growing number of tests, the fact that our tests require a lot of
> "integrations" (running as separate images - cassandra, rabbitmq,
> postgres/mysql databases etc.) and the fact that we run them with pytest
> (which apparently uses more memory).
>
> *Proposal*
> One of the proposals on slack was to get rid of the cassandra tests and
> disable cassandra temporarily - but I think we can do better: I can get it
> merged in a day or two and get it sorted out for now (and in good shape for
> the future).
>
> I recently wrote an integration test proposal
> <https://lists.apache.org/thread.html/120af497f4adf482162be9583a93651fd206b71db213255f52ad8b7a%40%3Cdev.airflow.apache.org%3E>
> (I will resurrect that thread now) describing how we can split our
> integration tests into separate integrations using pytest markers, with
> support from Breeze and our CI testing framework. I already have working
> code for that (a result of my resumed work on the Production Image); most
> of it is already green in Travis and just needs a review from the other
> committers. At the end of this message I copy the excerpt from the docs on
> how this will work.
>
> Once we have that in, we will have a very easy (and maintainable for the
> future) way that both saves CI resources and makes Breeze far more usable
> (and less resource-hungry):
>
> - add pytest markers so that we know which tests are "integration" ones.
> - start Breeze locally without any external integrations (right now only
> Kerberos is needed) - most of the tests work there, with far less resource
> usage
> - start Breeze easily with *--integration mongo --integration cassandra*
> etc. whenever we need to run tests for that integration
> - run all the "non-integration" tests in CI without the integrations
> started
> - run only the "integration-related" tests in CI with the integrations
> started
> - we will have more jobs in CI but they should run much more reliably
> and faster in general
> - also, one of the changes improves the way we build the kind/kubernetes
> tests in order to unblock the migration to GithubActions that Tomek works
> on - that might be our ultimate speedup/stabilisation
> - for those curious, the updated documentation is in my PR: "Launching
> Breeze integrations"
> <https://github.com/PolideaInternal/airflow/blob/separate-integrations/BREEZE.rst#launching-breeze-integrations>
> and "Running tests with Kubernetes in Breeze"
> <https://github.com/PolideaInternal/airflow/blob/separate-integrations/BREEZE.rst#running-tests-with-kubernetes-in-breeze>
>
> *PRs - kind request to other committers*
> I have a series of PRs that already implement almost all of this (I needed
> that in order to implement Production Image support). They depend on each
> other - I added unit test support for Bash scripts, several improvements,
> and some simplifications:
>
> - [AIRFLOW-6489] Add BATS support for Bash unit testing
> <https://github.com/apache/airflow/pull/7081> [ready for review] -
> needed to get more control over other changes.
> - [AIRFLOW-6475] Remove duplication of volume mount specs in Breeze.
> <https://github.com/apache/airflow/pull/7065> [ready for review] -
> improves the consistency of how we run Breeze/CI
> - [AIRFLOW-6491] Improve parameter handling in breeze
> <https://github.com/apache/airflow/pull/7084> [ready for review] - a
> tested and improved way of handling --options in Breeze (needed for the
> Kubernetes improvements)
> - [AIRFLOW-5704] Improve Kind Kubernetes scripts for local testing
> <https://github.com/apache/airflow/pull/6516> [testing] - improves
> handling of Kubernetes Kind testing (fixes issues with loading images,
> upgrades kind to the latest version)
> - [AIRFLOW-6489] Separate integrations [WIP]
> <https://github.com/apache/airflow/pull/7091> - [WIP] this is the PR
> introducing the different integrations - it already works for Breeze and
> has support for deciding which integrations should be started; I just need
> to split out the "Integration tests" into separate jobs
>
> I have a kind request to the other committers - can you please take a look
> at those and help merge them quickly?
>
> I also have follow-up PRs for the production image, but the above PRs
> should help us solve the CI crisis in a good way and let me continue on
> the prod image ones.
>
> J.
>
> --
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
>
