+1 from me for adding more system tests. I do not have much experience with
such tests, but I am going to fix that right away. :)

When would these system tests be run? With every PR?

I don't know whether this tool will be helpful, but I'll leave the link here.
Its description says: "This is a pytest plug-in which automatically selects
and re-executes only tests affected by recent changes."
https://pypi.org/project/pytest-testmon/

Best regards,
Michał

On Wed, Mar 4, 2020 at 2:25 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:

> Hello,
>
> One small update.
>
> We are now trying, together with Bjorn Olsen, to see how well the System
> Tests approach we took for Google Cloud Platform can be applied to AWS.
> This might be a good exercise to see if we can apply it to other services
> and make it part of releasing backport operators, fully automating it with
> AIP-4; later it can be a good start for AIP-8 (separate providers).
>
> I created a #system-tests channel in Slack, so anyone interested in the
> subject is welcome. Also, if anyone would like to implement and test
> system tests for any of the providers, you are welcome to join! This is
> the current list of providers we have. Some of them are super simple, some
> more complex. With "google" we are going to address by far the biggest
> one :).
>
> {
>     "amazon": [setup.aws],
>     "apache.cassandra": [setup.cassandra],
>     "apache.druid": [setup.druid],
>     "apache.hdfs": [setup.hdfs],
>     "apache.hive": [setup.hive],
>     "apache.pig": [],
>     "apache.pinot": [setup.pinot],
>     "apache.spark": [],
>     "apache.sqoop": [],
>     "celery": [setup.celery],
>     "cloudant": [setup.cloudant],
>     "cncf.kubernetes": [setup.kubernetes],
>     "databricks": [setup.databricks],
>     "datadog": [setup.datadog],
>     "dingding": [],
>     "discord": [],
>     "docker": [setup.docker],
>     "email": [],
>     "ftp": [],
>     "google.cloud": [setup.gcp],
>     "google.marketing_platform": [setup.gcp],
>     "google.suite": [setup.gcp],
>     "grpc": [setup.grpc],
>     "http": [],
>     "imap": [],
>     "jdbc": [setup.jdbc],
>     "jenkins": [setup.jenkins],
>     "jira": [setup.jira],
>     "microsoft.azure": [setup.azure],
>     "microsoft.mssql": [setup.mssql],
>     "microsoft.winrm": [setup.winrm],
>     "mongo": [setup.mongo],
>     "mysql": [setup.mysql],
>     "odbc": [setup.odbc],
>     "openfass": [],
>     "opsgenie": [],
>     "oracle": [setup.oracle],
>     "pagerduty": [setup.pagerduty],
>     "papermill": [setup.papermill],
>     "postgres": [setup.postgres],
>     "presto": [setup.presto],
>     "qubole": [setup.qds],
>     "redis": [setup.redis],
>     "salesforce": [setup.salesforce],
>     "samba": [setup.samba],
>     "segment": [setup.segment],
>     "sftp": [setup.ssh],
>     "slack": [setup.slack],
>     "snowflake": [setup.snowflake],
>     "sqlite": [],
>     "ssh": [setup.ssh],
>     "vertica": [setup.vertica],
>     "zendesk": [setup.zendesk],
> }
>
> J.
>
> On Fri, Feb 21, 2020 at 2:50 PM Jarek Potiuk <jarek.pot...@polidea.com>
> wrote:
>
> > Any more comments on system tests? I would love to vote on AIP-4, and
> > my current proposal would be:
> >
> > 1) Let's try to automate system test execution (starting with GCP, as it
> >    is close to being ready). That would most likely be with GitHub
> >    Actions - details to be worked out.
> > 2) We can use it to automate testing of backport operators (which
> >    complete AIP-21).
> > 3) We can build it in such a way that other providers' tests can be
> >    executed automatically as well, provided that there is a contribution
> >    with system tests.
> >
> > WDYT?
> >
> > J.
> >
> > On Sat, Feb 15, 2020 at 8:59 PM Jarek Potiuk <jarek.pot...@polidea.com>
> > wrote:
> >
> >> On Sat, Feb 15, 2020, 20:54 Ash Berlin-Taylor <a...@apache.org> wrote:
> >>
> >>> Yeah, I'm for this.
> >>>
> >>> In fact I'm about to mark some of the Hive ones as system tests, as
> >>> they require a running Hive cluster.
> >>> I would be careful about automatically marking unit tests that run
> >>> DAGs as system/integration though; a number of our unit tests rely on
> >>> this to test the tasks in various states in the parts of the
> >>> scheduler. Ideally they wouldn't, but right now they do, and the tasks
> >>> they run are of the DummyOperator or "bash_command=date" flavour.
> >>>
> >>
> >> Agree with Ash here. I think for this we should have yet another
> >> category ('dag tests'? 'core tests'?) - those are indeed run using DAGs,
> >> with the whole of Airflow underneath, but their purpose is to test
> >> Airflow Core, not external systems.
> >>
> >> j.
> >>
> >>> -ash
> >>>
> >>> On Feb 15 2020, at 7:28 pm, Tomasz Urbaszek <turbas...@apache.org>
> >>> wrote:
> >>> > +1 for introducing system tests. Lack of them is a big pain.
> >>> >
> >>> > I would also like to suggest marking some of the current tests
> >>> > (those running DAGs, etc.) as system tests. Then we can simplify our
> >>> > unit tests and probably speed up CI builds (not to mention the
> >>> > reduction of side effects). The approach used for GCP system tests,
> >>> > which runs an example DAG, makes creating such tests really easy (or
> >>> > we can generate them automatically...).
> >>> >
> >>> > Regarding the frequency of such tests, I think we should run all of
> >>> > them daily on master. Or run them when there is a change in specific
> >>> > files (operators / hooks etc).
> >>> >
> >>> > Tomek
> >>> >
> >>> > On Sat, Feb 15, 2020 at 1:15 PM Jarek Potiuk
> >>> > <jarek.pot...@polidea.com> wrote:
> >>> > >
> >>> > > TL;DR: I would like to revive a discussion (hopefully short :) and
> >>> > > possibly cast a vote on "AIP-4 - Support for System Tests for
> >>> > > external systems".
> >>> > >
> >>> > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+System+Tests+for+external+systems
> >>> > >
> >>> > > This is the very first AIP I created, almost 1.5 years ago, and it
> >>> > > took very long to get to the point where I think we are very, very
> >>> > > close to being able to implement it, after many, many baby steps
> >>> > > (and some bigger leaps) that we've done in the meantime.
> >>> > >
> >>> > > *Let me just quickly summarise the context:*
> >>> > > - One of the biggest Airflow advantages is its integrations with
> >>> > >   external systems.
> >>> > >   We have, I think, several hundred hooks and operators working
> >>> > >   with those external systems.
> >>> > > - We have an extensive set of tests, both unit and integration,
> >>> > >   that are sometimes really good and catch a lot of problems, but
> >>> > >   they can only do as much as mocking out access to the external
> >>> > >   systems allows. Unit/integration tests are great for testing the
> >>> > >   core of Airflow and its functionality, but the external services
> >>> > >   cannot be effectively tested this way.
> >>> > > - The external services sometimes change: new versions of tools,
> >>> > >   services etc. are released every day, and sometimes, even if we
> >>> > >   mock them perfectly in unit tests, the hooks simply stop working
> >>> > >   at some point in time.
> >>> > > - I think there is a need to regularly run some tests on a system
> >>> > >   level, communicating with "real" external systems and testing
> >>> > >   our operators. Let's call them System Tests. They do not
> >>> > >   necessarily need to be run with every PR, but I think running
> >>> > >   them regularly makes perfect sense.
> >>> > >
> >>> > > *Why now? Why does this seem to be a good time to do it?*
> >>> > > - We switched to pytest and we already have the separation of
> >>> > >   unit/integration tests in place, so we can add support for
> >>> > >   system tests using the same mechanisms.
> >>> > > - With AIP-21 we grouped the tests into the "providers" package,
> >>> > >   and that makes it easy to define the boundaries of "systems":
> >>> > >   every provider is a "system" to test.
> >>> > > - We have plenty of system tests implemented for GCP, which we are
> >>> > >   going to use to run tests for the backported packages from
> >>> > >   AIP-21. We followed system test automation for more than a year
> >>> > >   in GCP operators and we have it fully automated already.
> >>> > > - In the latest PR (https://github.com/apache/airflow/pull/7389)
> >>> > >   we even extracted the GCP-specific way we run system tests, in
> >>> > >   order to a) make it easy for everyone to write automated system
> >>> > >   tests and b) make it possible to automate them.
> >>> > > - We have credits provided by Google to run our tests and we can
> >>> > >   use them for regular runs of the system tests.
> >>> > > - We are close to the switch-over to GitHub Actions, which will
> >>> > >   make it easy to write manually triggered or regularly scheduled
> >>> > >   actions that have securely stored credentials to run the system
> >>> > >   tests, in a way that is controlled by committers and not
> >>> > >   abusable by contributors who prepare PRs.
> >>> > > - I would like to start and lead a community-driven effort where
> >>> > >   we split the writing of missing tests amongst community members,
> >>> > >   so that our new backport packages can be tested against the
> >>> > >   latest released version of 1.10.*. We will provide GCP tests as
> >>> > >   examples, and we will also set up the automation needed to run
> >>> > >   the tests regularly. The only thing we will ask of the members
> >>> > >   of the community is to write the missing tests. This way I hope
> >>> > >   we can get very high coverage of backported packages.
> >>> > >
> >>> > > There are of course still a number of open questions, like how to
> >>> > > store credentials, how often to run the tests etc., but I think
> >>> > > those are implementation details that we can work out while we are
> >>> > > implementing it.
> >>> > >
> >>> > > What do you think about it? If I get a lot of "yes"es quickly, I
> >>> > > would love to start voting on AIP-4.
> >>> > >
> >>> > > J.
> >>> > >
> >>> > > --
> >>> > > Jarek Potiuk
> >>> > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >>> > >
> >>> > > M: +48 660 796 129 <+48660796129>

--
Michał Słowikowski
Polidea <https://www.polidea.com/> | Junior Software Engineer

E: michal.slowikow...@polidea.com