> When would these system tests be run? With every PR?
>

I think we'll have to decide once we see how long the tests run. First I
want to see whether this is feasible and whether we can automate it. After
that, we can decide how often to run them, based on how long they take and
how stable they are. We have a range of options (after PR, daily,
weekly...), and we need more actual observations to see what the cost of
running the tests is.

I don't think we know all the answers now - we are still in the
"POC"/"Discussion" phase and I want to gather some more data while doing
it. It's a bit of an experiment, so that we can make better-informed,
far-reaching decisions.

Best regards

> Michał
>
>
>
>
> On Wed, Mar 4, 2020 at 2:25 PM Jarek Potiuk <jarek.pot...@polidea.com>
> wrote:
>
> > Hello, one small update.
> >
> > Bjorn Olsen and I are now trying to see how well the System Tests
> > approach we used for Google Cloud Platform can be applied to AWS. This
> > might be a good exercise to see if we can apply it to other services and
> > make it part of releasing backport operators, fully automating it with
> > AIP-4; later it can be a good start for AIP-8 (separate providers).
> >
> > I created a #system-tests channel in Slack, so anyone interested in the
> > subject is welcome. Also, if anyone would like to implement and test
> > system tests for any of the providers, you are welcome to join! This is
> > the current list of providers we have. Some of them are super simple,
> > some more complex. With "google" we are going to address by far the
> > biggest one :).
> >
> >
> > {
> >         "amazon": [setup.aws],
> >         "apache.cassandra": [setup.cassandra],
> >         "apache.druid": [setup.druid],
> >         "apache.hdfs": [setup.hdfs],
> >         "apache.hive": [setup.hive],
> >         "apache.pig": [],
> >         "apache.pinot": [setup.pinot],
> >         "apache.spark": [],
> >         "apache.sqoop": [],
> >         "celery": [setup.celery],
> >         "cloudant": [setup.cloudant],
> >         "cncf.kubernetes": [setup.kubernetes],
> >         "databricks": [setup.databricks],
> >         "datadog": [setup.datadog],
> >         "dingding": [],
> >         "discord": [],
> >         "docker": [setup.docker],
> >         "email": [],
> >         "ftp": [],
> >         "google.cloud": [setup.gcp],
> >         "google.marketing_platform": [setup.gcp],
> >         "google.suite": [setup.gcp],
> >         "grpc": [setup.grpc],
> >         "http": [],
> >         "imap": [],
> >         "jdbc": [setup.jdbc],
> >         "jenkins": [setup.jenkins],
> >         "jira": [setup.jira],
> >         "microsoft.azure": [setup.azure],
> >         "microsoft.mssql": [setup.mssql],
> >         "microsoft.winrm": [setup.winrm],
> >         "mongo": [setup.mongo],
> >         "mysql": [setup.mysql],
> >         "odbc": [setup.odbc],
> >         "openfaas": [],
> >         "opsgenie": [],
> >         "oracle": [setup.oracle],
> >         "pagerduty": [setup.pagerduty],
> >         "papermill": [setup.papermill],
> >         "postgres": [setup.postgres],
> >         "presto": [setup.presto],
> >         "qubole": [setup.qds],
> >         "redis": [setup.redis],
> >         "salesforce": [setup.salesforce],
> >         "samba": [setup.samba],
> >         "segment": [setup.segment],
> >         "sftp": [setup.ssh],
> >         "slack": [setup.slack],
> >         "snowflake": [setup.snowflake],
> >         "sqlite": [],
> >         "ssh": [setup.ssh],
> >         "vertica": [setup.vertica],
> >         "zendesk": [setup.zendesk],
> > }
> >
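To make the "some are super simple, some more complex" point concrete, here is a minimal sketch that splits a provider list like the one above into providers that need extra setup and providers that need none, and groups providers sharing the same extra. It is only an illustration under stated assumptions: the names `PROVIDER_REQUIREMENTS`, `split_by_setup` and `group_by_extra` are hypothetical, only a handful of providers are copied in, and plain strings stand in for the `setup.*` requirement lists.

```python
# Hypothetical stand-ins: in the list above, names such as setup.aws refer to
# requirement lists; plain strings are used here so the sketch runs by itself.
PROVIDER_REQUIREMENTS = {
    "amazon": ["aws"],
    "apache.pig": [],
    "ftp": [],
    "google.cloud": ["gcp"],
    "google.suite": ["gcp"],
    "sftp": ["ssh"],
    "ssh": ["ssh"],
}


def split_by_setup(requirements):
    """Return (providers needing extra setup, providers needing none), sorted."""
    with_setup = sorted(name for name, extras in requirements.items() if extras)
    without_setup = sorted(name for name, extras in requirements.items() if not extras)
    return with_setup, without_setup


def group_by_extra(requirements):
    """Map each extra to the sorted list of providers that share it."""
    groups = {}
    for name, extras in requirements.items():
        for extra in extras:
            groups.setdefault(extra, []).append(name)
    return {extra: sorted(names) for extra, names in groups.items()}
```

Providers with an empty list are probably the cheapest place to start writing system tests, while providers that share an extra (for example the three "google" entries) could likely share one test environment.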
> >
> >
> >
> > J.
> >
> >
> > On Fri, Feb 21, 2020 at 2:50 PM Jarek Potiuk <jarek.pot...@polidea.com>
> > wrote:
> >
> > > Any more comments on system tests? I would love to vote on AIP-4, and
> > > my current proposal would be:
> > >
> > > 1) Let's try to automate system test execution (starting with GCP, as
> > > it is close to being ready). That would most likely be with GitHub
> > > Actions - details to be worked out.
> > > 2) We can use it to automate testing of backport operators (which
> > > complete AIP-21).
> > > 3) We can build it in a way that other providers' tests can be executed
> > > automatically as well, provided that there is a contribution with
> > > system tests.
> > >
> > > WDYT ?
> > >
> > >
> > > J.
> > >
> > >
> > > On Sat, Feb 15, 2020 at 8:59 PM Jarek Potiuk <jarek.pot...@polidea.com>
> > > wrote:
> > >
> > >>
> > >>
> > >> On Sat, Feb 15, 2020, 20:54 Ash Berlin-Taylor <a...@apache.org> wrote:
> > >>
> > >>> Yeah, I'm for this.
> > >>>
> > >>> In fact I'm about to mark some of the Hive ones as system tests as
> > >>> they require a running Hive cluster.
> > >>> I would be careful about automatically marking unit tests that run
> > >>> DAGs as system/integration though: a number of our unit tests rely on
> > >>> this to test the tasks in various states in parts of the scheduler.
> > >>> Ideally they wouldn't, but right now they do, and the tasks they run
> > >>> are of the DummyOperator or "bash_command=date" flavour.
> > >>>
> > >>
> > >> Agree with Ash here. I think for this we should have yet another
> > >> category ('dag tests'? 'core tests'?) - those are indeed run using
> > >> DAGs with the whole Airflow underneath, but their purpose is to test
> > >> Airflow core, not external systems.
> > >>
> > >>
> > >> j.
> > >>
> > >>
> > >> -ash
> > >>> On Feb 15 2020, at 7:28 pm, Tomasz Urbaszek <turbas...@apache.org>
> > >>> wrote:
> > >>> > +1 for introducing system tests. Lack of them is a big pain.
> > >>> >
> > >>> > I would also like to suggest marking some existing tests (those
> > >>> > running DAGs, etc.) as system tests. Then we can simplify our unit
> > >>> > tests and probably speed up CI builds (not to mention the reduction
> > >>> > of side effects). The approach used for GCP system tests, which runs
> > >>> > an example DAG, makes creating such tests really easy (or we can
> > >>> > generate them automatically...).
> > >>> >
> > >>> > Regarding the frequency of such tests, I think we should run all of
> > >>> > them daily on master, or run them when there is a change in specific
> > >>> > files (operators / hooks etc.).
> > >>> >
> > >>> > Tomek
> > >>> >
> > >>> > On Sat, Feb 15, 2020 at 1:15 PM Jarek Potiuk <jarek.pot...@polidea.com>
> > >>> > wrote:
> > >>> > >
> > >>> > > TL;DR; I would like to revive a discussion (hopefully short :) and
> > >>> > > possibly cast a vote on "AIP-4 - Support for System Tests for
> > >>> > > external systems".
> > >>> > >
> > >>> > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+System+Tests+for+external+systems
> > >>> > >
> > >>> > > This is the very first AIP I created, almost 1.5 years ago, and it
> > >>> > > took a very long time to get to the point where I think we are very,
> > >>> > > very close to being able to implement it, after many, many baby steps
> > >>> > > (and some bigger leaps) that we've made in the meantime.
> > >>> > >
> > >>> > > *Let me just quickly summarise the context:*
> > >>> > > - One of Airflow's biggest advantages is its integrations with
> > >>> > > external systems. We have, I think, several hundred hooks and
> > >>> > > operators working with those external systems.
> > >>> > > - We have an extensive set of tests - both unit and integration -
> > >>> > > that are sometimes really good at catching a lot of problems, but
> > >>> > > they can only do as much as mocking out access to the external
> > >>> > > systems allows. Unit/integration tests are great for testing the
> > >>> > > core of Airflow and its functionality, but the external services
> > >>> > > cannot be effectively tested this way.
> > >>> > > - The external services sometimes change - new versions of tools,
> > >>> > > services etc. are released every day, and sometimes even if we mock
> > >>> > > them perfectly in unit tests, the hooks simply stop working at some
> > >>> > > point in time.
> > >>> > > - I think there is a need to regularly run some tests at the system
> > >>> > > level - communicating with "real" external systems and testing our
> > >>> > > operators. Let's call them System Tests. They do not necessarily
> > >>> > > need to be run with every PR, but I think running them regularly
> > >>> > > makes perfect sense.
> > >>> > >
> > >>> > >
> > >>> > > *Why now? Why does this seem to be a good time to do it?*
> > >>> > > - We switched to pytest and we already have the separation of
> > >>> > > unit/integration tests in place - we can add support for system
> > >>> > > tests using the same mechanisms.
> > >>> > > - With AIP-21 we grouped the tests into the "providers" package,
> > >>> > > which makes it easy to define the boundaries of "systems" - every
> > >>> > > provider is a "system" to test.
> > >>> > > - We have plenty of system tests implemented for GCP, which we are
> > >>> > > going to use to run tests for the backported packages from AIP-21 -
> > >>> > > we have followed system test automation for more than a year in the
> > >>> > > GCP operators and we have it fully automated already.
> > >>> > > - In the latest PR - https://github.com/apache/airflow/pull/7389 -
> > >>> > > we even extracted the GCP-specific way we run system tests so as to
> > >>> > > a) make it easy for everyone to write automated system tests and
> > >>> > > b) make it possible to automate them.
> > >>> > > - We have credits provided by Google to run our tests and we can
> > >>> > > use them for regular runs of the system tests.
> > >>> > > - We are close to the switch-over to GitHub Actions, which will
> > >>> > > make it easy to write manually triggered or regularly scheduled
> > >>> > > actions that have securely stored credentials to run the system
> > >>> > > tests - in a way that is controlled by committers and cannot be
> > >>> > > abused by contributors who prepare PRs.
> > >>> > > - I would like to start and lead a community-driven effort in which
> > >>> > > we split the writing of missing tests amongst community members - so
> > >>> > > that our new backport packages can be tested against the latest
> > >>> > > released version of 1.10.*. We will provide the GCP tests as
> > >>> > > examples and we will also set up the automation needed to run the
> > >>> > > tests regularly - the only thing we will ask of the members of the
> > >>> > > community is to write the missing tests. This way I hope we can get
> > >>> > > very high coverage of the backported packages.
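As an illustration of the "same mechanisms" idea from the first bullet above (system tests that are skipped unless explicitly enabled), here is a minimal sketch. It deliberately uses the standard library's `unittest.skipUnless` rather than pytest markers, so that it runs without extra dependencies; a pytest version would express the same opt-in with a marker. The `SYSTEM_TESTS` environment variable, `enabled_providers` and `system_test` are hypothetical names, not existing Airflow settings or helpers.

```python
import os
import unittest

# Hypothetical convention: SYSTEM_TESTS lists the providers whose system tests
# should run, e.g. SYSTEM_TESTS="google.cloud,amazon". Not an Airflow setting.
ENV_VAR = "SYSTEM_TESTS"


def enabled_providers(environ=None):
    """Parse the comma-separated provider list from the environment."""
    environ = os.environ if environ is None else environ
    raw = environ.get(ENV_VAR, "")
    return {item.strip() for item in raw.split(",") if item.strip()}


def system_test(provider):
    """Skip the decorated test unless system tests for `provider` are enabled.

    The check happens when the decorator is applied (at import time),
    so the environment variable has to be set before the tests are loaded.
    """
    return unittest.skipUnless(
        provider in enabled_providers(),
        f"system tests for {provider!r} are not enabled",
    )


class ExampleSystemTest(unittest.TestCase):
    @system_test("google.cloud")
    def test_talks_to_real_service(self):
        # A real system test would call the external service here.
        self.assertTrue(True)
```

With this shape, a regular PR build would leave the variable unset and skip every system test, while a scheduled run with credentials could enable one provider at a time.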
> > >>> > >
> > >>> > > There are of course still a number of open questions - like how to
> > >>> > > store credentials, how often to run the tests etc. - but I think
> > >>> > > those are implementation details that we can work out while we are
> > >>> > > implementing it.
> > >>> > >
> > >>> > > What do you think about it? If I get a lot of "yes" votes quickly,
> > >>> > > I would love to start voting on AIP-4.
> > >>> > >
> > >>> > > J.
> > >>> > >
> > >>> > >
> > >>> > >
> > >>> > >
> > >>> > >
> > >>> > >
> > >>> > > --
> > >>> > > Jarek Potiuk
> > >>> > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > >>> > >
> > >>> > > M: +48 660 796 129 <+48660796129>
> > >>> > > [image: Polidea] <https://www.polidea.com/>
> > >>> >
> > >>> >
> > >>>
> > >>>
> > >
> > >
> > >
> >
> >
>
>
> --
>
> Michał Słowikowski
> Polidea <https://www.polidea.com/> | Junior Software Engineer
>
> E: michal.slowikow...@polidea.com
>
> Unique Tech
> Check out our projects! <https://www.polidea.com/our-work>
>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
