We have other Kerberos-related failures. I have temporarily disabled the "kerberos-specific" build until I add some more diagnostics and test it.
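[Editor's note] One way to diagnose and work around the suspected race between Kerberos initialising and the tests starting (discussed in the thread below) is to gate the test run on a readiness probe that retries until authentication succeeds. This is only a sketch, not the actual CI code: `wait_for_kerberos` and the example `kinit` command line are hypothetical.

```python
import subprocess
import time


def wait_for_kerberos(check_cmd, timeout=60.0, interval=2.0):
    """Poll until check_cmd (e.g. a kinit against a test keytab) exits 0.

    Returns True once the command succeeds within the timeout, False if
    the deadline passes first. Command-not-found and hangs are treated
    as "not ready yet" so the probe keeps retrying.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            result = subprocess.run(
                check_cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                timeout=interval * 5,
            )
        except (OSError, subprocess.TimeoutExpired):
            result = None
        if result is not None and result.returncode == 0:
            return True
        time.sleep(interval)
    return False


# Hypothetical usage before starting the test suite:
#   wait_for_kerberos(["kinit", "-kt", "/path/to/test.keytab", "airflow"])
```

Logging each failed attempt's output would double as the extra diagnostics mentioned above.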
Please rebase to latest master.

J.

On Tue, Jan 14, 2020 at 7:24 AM Jarek Potiuk <[email protected]> wrote:

> Tests seem to be stable - but the kerberos problem is happening often
> enough to take a look. I will see what I can do to make it stable. It
> seems it might be a race between kerberos initialising and the tests
> starting to run.
>
> On Mon, Jan 13, 2020 at 8:58 PM Jarek Potiuk <[email protected]>
> wrote:
>
>> I just merged the change with integration separation/slimming down
>> the tests on CI: https://github.com/apache/airflow/pull/7091
>>
>> It looks far more stable - I had just one failure with kerberos not
>> starting (which also happened sometimes with the old tests). In the
>> future we will look at some of the "xfailed/xpassed" tests - those
>> that we know are problematic. We have 8 of them now.
>>
>> Also, Breeze is now much more enjoyable to use. Pls. take a look at
>> the docs.
>>
>> J.
>>
>> On Wed, Jan 8, 2020 at 2:23 PM Jarek Potiuk <[email protected]>
>> wrote:
>>
>>>> I like what you've done with the separate integrations, and that
>>>> coupled with pytest markers and better "import error" handling in
>>>> the tests would make it easier to run a sub-set of the tests
>>>> without having to install everything (for instance not having to
>>>> install mysql client libs.
>>>
>>> Cool. That's exactly what I am working on in
>>> https://github.com/apache/airflow/pull/7091 -> I want to get all
>>> the tests to run in integration-less CI, select all those that
>>> failed, and treat them appropriately.
>>>
>>>> Admittedly less of a worry with breeze/docker, but it still would
>>>> be nice to skip/deselect tests when deps aren't there)
>>>
>>> Yeah. For me it's the same. We recently had a few discussions with
>>> first-time users who have difficulty contributing because they do
>>> not know how to reliably reproduce a failing CI locally.
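[Editor's note] The "skip/deselect tests when deps aren't there" idea quoted above can be done with a pytest `skipif` marker guarded by an import probe, so missing optional client libraries deselect tests cleanly instead of erroring at import time. A minimal sketch - `has_module`, the marker name, and the test name are illustrative, not airflow's actual helpers:

```python
import importlib.util

import pytest


def has_module(name):
    """Return True if the given top-level package can be imported."""
    return importlib.util.find_spec(name) is not None


# Deselect, rather than fail, when the optional MySQL client is absent:
requires_mysql = pytest.mark.skipif(
    not has_module("MySQLdb"), reason="mysqlclient not installed"
)


@requires_mysql
def test_mysql_hook_connects():
    import MySQLdb  # safe: only runs when the dependency exists

    ...
```

`pytest.importorskip("MySQLdb")` at module top is an alternative when an entire test module depends on one library.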
>>> I think the resource footprint of the Breeze environment for simple
>>> tests was a big blocker/difficulty for some users, so slimming it
>>> down and making it integration-less by default will be really
>>> helpful. I will also make it the "default" way of reproducing tests
>>> - I will remove the separate bash scripts, which were an
>>> intermediate step. It is the same work, especially since I use the
>>> same mechanism, and ... well - it will be far easier for me to have
>>> integration-specific cases working in CI if I also have Breeze
>>> supporting them (eating my own dog food).
>>>
>>>> Most of these PRs are merged now. I've glanced over #7091 and like
>>>> the look of it, good work! You'll let us know when we should take
>>>> a deeper look?
>>>
>>> Yep, I will. I hope today/tomorrow - most of it is ready. I also
>>> managed to VASTLY simplify running kubernetes kind (one less docker
>>> image, and everything runs in the same docker engine as
>>> airflow-testing itself) in
>>> https://github.com/apache/airflow/pull/6516, which is a
>>> prerequisite for #7091 - so both will need to be reviewed. I marke
>>>
>>>> For cassandra tests specifically, I'm not sure there is a huge
>>>> amount of value in actually running the tests against cassandra --
>>>> we are using the official python module for it, and the test is
>>>> basically running these queries - DROP TABLE IF EXISTS, CREATE
>>>> TABLE, INSERT INTO TABLE - and then running hook.record_exists.
>>>> That seems like it's testing cassandra itself, when I think all we
>>>> should do is test that hook.record_exists calls the execute method
>>>> on the connection with the right string. I'll knock up a PR for
>>>> this.
>>>> Do we think it's worth keeping the non-mocked/integration tests
>>>> too?
>>>
>>> I would not remove them just yet. Let's see how it works when I
>>> separate it out.
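[Editor's note] The mocked test proposed above - asserting that `record_exists` calls `execute` with the right string instead of exercising Cassandra - might look like the sketch below. `RecordHook` is a self-contained stand-in, not the real airflow `CassandraHook`, and the CQL shape is illustrative:

```python
from unittest import mock


class RecordHook:
    """Minimal stand-in for the real Cassandra hook (illustrative only)."""

    def __init__(self, session):
        self.session = session

    def record_exists(self, table, keys):
        # Build a parameterised existence query from the primary-key dict.
        clause = " AND ".join(f"{k}=%({k})s" for k in keys)
        cql = f"SELECT * FROM {table} WHERE {clause}"
        rows = self.session.execute(cql, keys)
        return bool(rows.one())


def test_record_exists_builds_expected_cql():
    # Mock the session: no Cassandra needed, we only test our own logic.
    session = mock.Mock()
    session.execute.return_value.one.return_value = {"pk": 1}

    hook = RecordHook(session)
    assert hook.record_exists("t", {"pk": 1}) is True

    # The unit under test is the CQL string, not Cassandra itself.
    session.execute.assert_called_once_with(
        "SELECT * FROM t WHERE pk=%(pk)s", {"pk": 1}
    )
```

This is the "test our code, not the database" trade-off discussed above; the DROP/CREATE/INSERT round-trip stays in the separate integration job.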
>>> I have a feeling that we have a very small number of those
>>> integration tests overall, so maybe they will be stable and fast
>>> enough when we run only them in a separate job. I think it's good
>>> to have different levels of tests (unit/integration/system), as
>>> they find different types of problems. As long as we can keep
>>> integration/system tests clearly separated, stable, and easy to
>>> disable/enable - I am all for having different types of tests.
>>> There is the old and well-established concept of the Test Pyramid -
>>> https://martinfowler.com/bliki/TestPyramid.html - which applies
>>> very accurately to our case. By adding markers, categorising the
>>> tests, and seeing how many of those tests we have, how stable they
>>> are, how long they take, and (eventually) how much they cost us, we
>>> can make better decisions.
>>>
>>> J.

--

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
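[Editor's note] The marker-based categorisation described above maps directly onto pytest marks: tag each test with its Test Pyramid level, then select or deselect whole layers per CI job. A hedged sketch - the marker and test names are illustrative, not necessarily the ones used in airflow's CI:

```python
import pytest


@pytest.mark.integration
def test_hook_against_real_cassandra():
    """Slow test that needs a running Cassandra (separate CI job)."""
    ...


def test_hook_builds_correct_cql():
    """Fast unit test with mocks - runs everywhere by default."""
    ...


# Whole layers can then be chosen cheaply on the command line:
#   pytest -m "not integration"   # default, integration-less run
#   pytest -m integration         # the separate integration job
```

Registering the marker (e.g. in `pytest.ini` under `markers =`) keeps `--strict-markers` runs clean; counting tests per marker then gives the per-layer cost figures mentioned above.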
