Re: [DISCUSS] AIP-4 System tests

Jarek Potiuk Fri, 21 Feb 2020 05:51:23 -0800

Any more comments on system tests? I would love to vote on AIP-4, and
my current proposal would be:


1) Let's try to automate system test execution (starting with GCP, as it is
close to being ready). That would most likely be done with GitHub Actions -
details to be worked out.
2) We can use it to automate testing of the Backport operators (which
complete AIP-21).
3) We can build it in such a way that other providers' tests can be executed
automatically as well, provided that there is a contribution with system
tests.
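
To make point 3 a bit more concrete, here is a rough sketch in plain Python
(not the actual implementation). The names RUN_SYSTEM_TESTS,
enabled_providers and select_system_tests are made up for illustration. The
assumption is that each system test is tagged with its provider and that the
automation only runs tests for providers that were explicitly enabled, for
example because credentials are configured for them.

```python
import os


def enabled_providers(env=None):
    """Return the set of providers whose system tests should run.

    Reads a comma-separated list from the hypothetical RUN_SYSTEM_TESTS
    variable, e.g. RUN_SYSTEM_TESTS="google,amazon".
    """
    env = os.environ if env is None else env
    raw = env.get("RUN_SYSTEM_TESTS", "")
    return {name.strip() for name in raw.split(",") if name.strip()}


def select_system_tests(tests, env=None):
    """Keep only the tests whose provider is enabled.

    `tests` is a list of (test_name, provider) pairs. A provider without
    contributed system tests simply never shows up in this list.
    """
    enabled = enabled_providers(env)
    return [name for name, provider in tests if provider in enabled]
```

With only "google" enabled, a list containing one google test and one amazon
test would be reduced to the google test; with nothing enabled, no system
tests would run at all.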

WDYT ?


J.


On Sat, Feb 15, 2020 at 8:59 PM Jarek Potiuk <[email protected]>
wrote:

>
>
> On Sat, Feb 15, 2020, 20:54 Ash Berlin-Taylor <[email protected]> wrote:
>
>> Yeah, I'm for this.
>>
>> In fact I'm about to mark some of the Hive ones as system tests as they
>> require a running hive cluster.
>> I would be careful about automatically marking unit tests that run DAGs
>> as system/integration tests, though: a number of our unit tests rely on
>> this to test tasks in various states in parts of the scheduler. Ideally
>> they wouldn't, but right now they do, and the tasks they run are of the
>> DummyOperator or "bash_command=date" flavour.
>>
>
> Agree with Ash here. I think for this we should have yet another category
> ('dag tests'? 'core tests'?) - those are indeed run using DAGs, with the
> whole of Airflow underneath, but their purpose is to test Airflow core, not
> external systems.
>
>
> j.
>
>
> -ash
>> On Feb 15 2020, at 7:28 pm, Tomasz Urbaszek <[email protected]> wrote:
>> > +1 for introducing system tests. Lack of them is a big pain.
>> >
>> > I would also like to suggest marking some existing tests (those running
>> > DAGs, etc.) as system tests. Then we can simplify our unit tests and
>> > probably speed up CI builds (not to mention the reduction of side
>> > effects). The approach used for GCP system tests, which runs an example
>> > DAG, makes creating such tests really easy (or we can generate them
>> > automatically...).
>> >
>> > Regarding the frequency of such tests, I think we should run all of
>> > them daily on master. Or run them when there is a change in specific
>> > files (operators / hooks etc).
>> >
>> > Tomek
>> >
>> > On Sat, Feb 15, 2020 at 1:15 PM Jarek Potiuk <[email protected]>
>> wrote:
>> > >
>> > > TL;DR: I would like to revive a discussion (hopefully short :) and
>> > > possibly cast a vote on "AIP-4 - Support for System Tests for external
>> > > systems".
>> > >
>> > >
>> > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+System+Tests+for+external+systems
>> > >
>> > > This is the very first AIP I created, almost 1.5 years ago, and it took a
>> > > very long time to get to the point where I think we are very, very close
>> > > to being able to implement it, after the many baby steps (and some bigger
>> > > leaps) that we've made in the meantime.
>> > >
>> > > *Let me quickly summarise the context:*
>> > > - One of Airflow's biggest advantages is its integrations with external
>> > > systems. We have, I think, several hundred hooks and operators working
>> > > with those external systems.
>> > > - We have an extensive set of tests, both unit and integration, that are
>> > > sometimes really good at catching a lot of problems, but they can only do
>> > > as much as mocking out access to the external systems allows.
>> > > Unit/integration tests are great for testing the core of Airflow and its
>> > > functionality, but the external services cannot be effectively tested
>> > > this way.
>> > > - The external services sometimes change: new versions of tools, services
>> > > etc. are released every day, and sometimes even if we mock them perfectly
>> > > in unit tests, the hooks simply stop working at some point in time.
>> > > - I think there is a need to run some tests at the system level regularly,
>> > > communicating with "real" external systems and testing our operators.
>> > > Let's call them System Tests. They do not necessarily need to be run with
>> > > every PR, but I think running them regularly makes perfect sense.
>> > >
>> > >
>> > > *Why now? Why does this seem to be a good time to do it?*
>> > > - We switched to pytest and we already have a separation into
>> > > unit/integration tests in place - we can add support for system tests
>> > > using the same mechanisms.
>> > > - With AIP-21 we grouped the tests into the "providers" package, and that
>> > > makes it easy to define the boundaries of "systems" - every provider is a
>> > > "system" to test.
>> > > - We have plenty of system tests implemented for GCP, which we are going
>> > > to use to run tests for the backported packages from AIP-21 - we have
>> > > followed system test automation for more than a year in the GCP operators
>> > > and we have it fully automated already.
>> > > - In the latest PR - https://github.com/apache/airflow/pull/7389 - we even
>> > > extracted the GCP-specific way we run system tests, in order to a) make
>> > > it easy for everyone to write automated system tests and b) make it
>> > > possible to automate them.
>> > > - We have credits provided by Google to run our tests, and we can use
>> > > them for regular runs of the system tests.
>> > > - We are close to switching over to GitHub Actions, which will make it
>> > > easy to write manually triggered or regularly scheduled actions that have
>> > > securely stored credentials to run the system tests - in a way that is
>> > > controlled by committers and cannot be abused by contributors who prepare
>> > > PRs.
>> > > - I would like to start and lead a community-driven effort in which we
>> > > split the writing of missing tests amongst community members, so that our
>> > > new backport packages can be tested against the latest released version
>> > > of 1.10.*. We will provide the GCP tests as examples and we will also set
>> > > up the automation needed to run the tests regularly - the only thing we
>> > > will ask of the members of the community is to write the missing tests.
>> > > This way I hope we can get very high coverage of the backported packages.
>> > >
>> > > There are of course still a number of open questions - like how to store
>> > > credentials, how often to run the tests etc. - but I think those are
>> > > implementation details that we can work out while we are implementing it.
>> > >
>> > > What do you think about it? If I get a lot of "yes" answers quickly, I
>> > > would love to start voting on AIP-4.
>> > >
>> > > J.
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > Jarek Potiuk
>> > > Polidea <https://www.polidea.com/> | Principal Software Engineer
>> > >
>> > > M: +48 660 796 129 <+48660796129>
>> > > [image: Polidea] <https://www.polidea.com/>
>> >
>> >
>>
>>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
