On Sat, Feb 15, 2020, 20:54 Ash Berlin-Taylor <a...@apache.org> wrote:

> Yeah, I'm for this.
>
> In fact I'm about to mark some of the Hive ones as system tests as they
> require a running Hive cluster.
> I would be careful about automatically marking unit tests that run DAGs
> as system/integration though; a number of our unit tests rely on this to
> test the tasks in various states in parts of the scheduler. Ideally they
> wouldn't, but right now they do, and the tasks they run are of the
> DummyOperator or "bash_command=date" flavour.
>

Agree with Ash here. I think for this we should have yet another category
('dag tests'? 'core tests'?): those are indeed run using DAGs, with the
whole of Airflow underneath, but their purpose is to test Airflow core, not
external systems.
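
To make the category idea concrete, here is a minimal sketch in plain
Python. It is not pytest itself and not how Airflow's test suite is actually
wired. All names in it (Category, RUN_POLICY, select_tests, the "core" label,
the trigger names and the test names) are hypothetical. It only illustrates
the split discussed here: DAG-running tests that exercise Airflow core stay
in the per-PR run, while system tests that need real external systems run
only on a schedule.

```python
from enum import Enum


class Category(Enum):
    """Hypothetical test categories; the names are illustrative only."""
    UNIT = "unit"
    CORE = "core"      # runs DAGs on a full Airflow, but no external systems
    SYSTEM = "system"  # talks to real external systems (e.g. a Hive cluster)


# Assumed policy: system tests are skipped on pull requests and only run
# on a regular schedule; everything else runs in both cases.
RUN_POLICY = {
    "pull_request": {Category.UNIT, Category.CORE},
    "scheduled": set(Category),
}


def select_tests(tests, trigger):
    """Return the names of tests whose category is enabled for the trigger."""
    allowed = RUN_POLICY[trigger]
    return [name for name, category in tests if category in allowed]


# Made-up test names, one per category.
tests = [
    ("test_scheduler_task_states", Category.CORE),
    ("test_hive_operator_on_cluster", Category.SYSTEM),
    ("test_bash_operator_render", Category.UNIT),
]

pr_run = select_tests(tests, "pull_request")
scheduled_run = select_tests(tests, "scheduled")
```

With pytest the same split would more likely be expressed with markers and
a selection expression; the sketch above only shows the policy, not the
mechanism.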


j.


> -ash
> On Feb 15 2020, at 7:28 pm, Tomasz Urbaszek <turbas...@apache.org> wrote:
> > +1 for introducing system tests. Lack of them is a big pain.
> >
> > I would also like to suggest marking some existing tests (those running
> > DAGs, etc.) as system tests. Then we can simplify our unit tests and
> > probably speed up CI builds (not to mention the reduction of side
> > effects). The approach used for GCP system tests, which runs an example
> > DAG, makes creating such tests really easy (or we can generate them
> > automatically...).
> >
> > Regarding the frequency of such tests, I think we should run all of
> > them daily on master. Or run them when there is a change in specific
> > files (operators / hooks etc).
> >
> > Tomek
> >
> > On Sat, Feb 15, 2020 at 1:15 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> > >
> > > TL;DR: I would like to revive a discussion (hopefully short :) and
> > > possibly cast a vote on "AIP-4 - Support for System Tests for external
> > > systems".
> > >
> > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+System+Tests+for+external+systems
> > >
> > > This is the very first AIP I created, almost 1.5 years ago, and it took
> > > a very long time to get to the point where I think we are very, very
> > > close to being able to implement it, after the many, many baby steps (and
> > > some bigger leaps) that we've made in the meantime.
> > >
> > > *Let me quickly summarise the context:*
> > > - One of Airflow's biggest advantages is its integrations with external
> > > systems. We have, I think, several hundred hooks and operators working
> > > with those external systems.
> > > - We have an extensive set of tests, both unit and integration, that
> > > are sometimes really good at catching a lot of problems, but they can
> > > only do as much as mocking out access to the external systems allows.
> > > Unit/integration tests are great for testing the core of Airflow and
> > > its functionality, but the external services cannot be effectively
> > > tested this way.
> > > - The external services sometimes change: new versions of tools,
> > > services etc. are released every day, and sometimes, even if we mock
> > > them perfectly in unit tests, the hooks simply stop working at some
> > > point in time.
> > > - I think there is a need to regularly run some tests at the system
> > > level, communicating with "real" external systems and testing our
> > > operators. Let's call them System Tests. They do not necessarily need to
> > > be run with every PR, but I think running them regularly makes perfect
> > > sense.
> > >
> > >
> > > *Why now? Why does this seem to be a good time to do it?*
> > > - We switched to pytest and we already have a separation of
> > > unit/integration tests in place; we can add support for system tests
> > > using the same mechanisms.
> > > - With AIP-21 we grouped the tests into "providers" packages, and that
> > > makes it easy to define the boundaries of "systems": every provider is a
> > > "system" to test.
> > > - We have plenty of system tests implemented for GCP, which we are going
> > > to use to run tests for the backported packages from AIP-21. We have
> > > followed system test automation for more than a year in the GCP
> > > operators and we have it fully automated already.
> > > - In the latest PR, https://github.com/apache/airflow/pull/7389, we even
> > > extracted the GCP-specific way we run system tests so as to a) make it
> > > easy for everyone to write automated system tests and b) make it
> > > possible to automate them.
> > > - We have credits provided by Google to run our tests and we can use
> > > them for regular runs of the system tests
> > > - We are close to switching over to GitHub Actions, which will make it
> > > easy to write manually triggered or regularly scheduled actions that
> > > have securely stored credentials to run the system tests, in a way that
> > > is controlled by committers and not abusable by contributors who
> > > prepare PRs.
> > > - I would like to start and lead a community-driven effort in which we
> > > split the writing of missing tests amongst community members, so that
> > > our new backport packages can be tested against the latest released
> > > version of 1.10.*. We will provide the GCP tests as examples and we will
> > > also set up the automation needed to run the tests regularly; the only
> > > thing we will ask of the members of the community is to write the
> > > missing tests. This way I hope we can get very high coverage of the
> > > backported packages.
> > >
> > > There are of course still a number of open questions, like how to store
> > > credentials, how often to run the tests, etc., but I think those are
> > > implementation details that we can work out while we are implementing
> > > it.
> > >
> > > What do you think about it? If I have a lot of "yes's" quickly, I would
> > > love to start voting on AIP-4.
> > >
> > > J.
> > >
> > >
> > >
> > >
> > >
> > >
> > > --
> > > Jarek Potiuk
> > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > >
> > > M: +48 660 796 129 <+48660796129>
> > > [image: Polidea] <https://www.polidea.com/>
> >
> >
>
>
