Any more comments on system tests? I would love to vote on AIP-4, and my current proposal would be:
1) Let's try to automate system test execution (starting with GCP, as it is close to being ready). That would most likely be with GitHub Actions - details to be worked out.
2) We can use it to automate testing of the Backport operators (which complete AIP-21).
3) We can build it in such a way that other providers' tests can be executed automatically as well, provided that there is a contribution with system tests.

WDYT?

J.

On Sat, Feb 15, 2020 at 8:59 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:

> On Sat, Feb 15, 2020, 20:54 Ash Berlin-Taylor <a...@apache.org> wrote:
>
>> Yeah, I'm for this.
>>
>> In fact I'm about to mark some of the Hive ones as system tests as they
>> require a running Hive cluster.
>> I would be careful about automatically marking unit tests that run
>> DAGs as system/integration though: a number of our unit tests rely on this
>> to test the tasks in various states in the parts of the scheduler. Ideally
>> they wouldn't, but right now they do, and the tasks they run are of the
>> DummyOperator or "bash_command=date" flavour.
>>
>
> Agree with Ash here. I think for this we should have yet another category,
> 'dag tests' or 'core tests'? Those are indeed run using DAGs, with the
> whole of Airflow underneath, but their purpose is to test Airflow Core, not
> external systems.
>
> J.
>
>> -ash
>> On Feb 15 2020, at 7:28 pm, Tomasz Urbaszek <turbas...@apache.org> wrote:
>> > +1 for introducing system tests. Lack of them is a big pain.
>> >
>> > I would also like to suggest marking some of the existing tests (those
>> > running DAGs, etc.) as system tests. Then we can simplify our unit tests and
>> > probably speed up CI builds (not to mention the reduction of side
>> > effects). The approach used for GCP system tests, which runs an example
>> > DAG, makes creating such tests really easy (or we can generate them
>> > automatically...).
>> >
>> > Regarding the frequency of such tests, I think we should run all of
>> > them daily on master.
>> > Or run them when there is a change in specific
>> > files (operators / hooks etc.).
>> >
>> > Tomek
>> >
>> > On Sat, Feb 15, 2020 at 1:15 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> > >
>> > > TL;DR: I would like to revive a discussion (hopefully short :) and possibly
>> > > cast a vote on "AIP-4 - Support for System Tests for external systems".
>> > >
>> > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+System+Tests+for+external+systems
>> > >
>> > > This is the very first AIP I created, almost 1.5 years ago, and it took a very
>> > > long time to get to the point where I think we are very, very close to being
>> > > able to implement it, after many, many baby steps (and some bigger leaps)
>> > > that we've made in the meantime.
>> > >
>> > > *Let me quickly summarise the context:*
>> > > - One of the biggest advantages of Airflow is its integrations with external
>> > >   systems. We have, I think, several hundred hooks and operators working with
>> > >   those external systems.
>> > > - We have an extensive set of tests, both unit and integration, that
>> > >   are sometimes really good at catching a lot of problems, but they can only
>> > >   do as much as mocking out access to the external systems allows.
>> > >   Unit/integration tests are great for testing the core of Airflow and its
>> > >   functionality, but the external services cannot be effectively tested this way.
>> > > - The external services sometimes change. New versions of tools,
>> > >   services etc. are released every day, and sometimes, even if we mock them
>> > >   perfectly in unit tests, the hooks simply stop working at some point in time.
>> > > - I think there is a need to run some tests at the system level regularly,
>> > >   communicating with "real" external systems and testing our operators.
>> > >   Let's call them System Tests. They do not necessarily need to be run with
>> > >   every PR, but I think running them regularly makes perfect sense.
>> > >
>> > > *Why now? Why does this seem to be a good time to do it?*
>> > > - We switched to pytest and we already have the separation into
>> > >   unit/integration tests in place - we can add support for system tests using
>> > >   the same mechanisms.
>> > > - With AIP-21 we grouped the tests into the "providers" package, and that
>> > >   makes it easy to define the boundaries of "systems" - every provider is a
>> > >   "system" to test.
>> > > - We have plenty of system tests implemented for GCP, which we are going
>> > >   to use to run tests for backported packages from AIP-21. We have been
>> > >   working on system test automation for GCP operators for more than a year
>> > >   and we have it fully automated already.
>> > > - In the latest PR - https://github.com/apache/airflow/pull/7389 - we even
>> > >   extracted the GCP-specific parts of the way we run system tests, in order to
>> > >   a) make it easy for everyone to write automated system tests and
>> > >   b) make it possible for them to be automated.
>> > > - We have credits provided by Google to run our tests, and we can use
>> > >   them for regular runs of the system tests.
>> > > - We are close to the switch-over to GitHub Actions, which will make it easy
>> > >   to write manually triggered or regularly scheduled actions that have securely
>> > >   stored credentials to run the system tests - in a way that is
>> > >   controlled by committers and cannot be abused by contributors who prepare PRs.
>> > > - I would like to start and lead a community-driven effort in which we
>> > >   split the writing of missing tests amongst community members, so that our new
>> > >   backport packages can be tested against the latest released version of 1.10.*.
>> > >   We will provide the GCP tests as examples, and we will also set up the automation
>> > >   needed to run the tests regularly - the only thing we will ask of the members
>> > >   of the community is to write the missing tests. This way I hope we can get very
>> > >   high coverage of the backported packages.
>> > >
>> > > There are of course still a number of open questions - like how to store
>> > > credentials, how often to run the tests, etc. - but I think those are
>> > > implementation details that we can work out while we are implementing it.
>> > >
>> > > What do you think about it? If I get a lot of "yes"es quickly, I would
>> > > love to start voting on AIP-4.
>> > >
>> > > J.
>> > >
>> > > --
>> > > Jarek Potiuk
>> > > Polidea <https://www.polidea.com/> | Principal Software Engineer
>> > >
>> > > M: +48 660 796 129

--
Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer
M: +48 660 796 129
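
As a rough illustration of two ideas from this thread (gating system tests per "system", and running them only when a provider's files change), here is a minimal Python sketch. It is not Airflow's actual implementation: the names `CollectedTest`, `skip_reason` and `systems_for_changed_files` are hypothetical, and the `.../providers/<name>/...` path layout is an assumption based on the AIP-21 grouping mentioned above.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class CollectedTest:
    """Hypothetical description of one collected test."""
    name: str
    kind: str = "unit"            # "unit", "integration" or "system"
    system: Optional[str] = None  # provider the test talks to, e.g. "google"


def skip_reason(test: CollectedTest, enabled: FrozenSet[str]) -> Optional[str]:
    """Return why a test should be skipped, or None if it should run."""
    if test.kind != "system":
        return None  # unit/integration tests are not gated by this check
    if test.system in enabled:
        return None  # credentials for this system are assumed to be available
    return f"system test for {test.system!r} not enabled"


def systems_for_changed_files(paths: Iterable[str]) -> FrozenSet[str]:
    """Map changed file paths to provider names (assumed layout)."""
    systems = set()
    for path in paths:
        parts = PurePosixPath(path).parts
        if "providers" in parts:
            idx = parts.index("providers")
            # Require at least one path component after the provider name,
            # so that providers/__init__.py itself is not counted.
            if idx + 1 < len(parts) - 1:
                systems.add(parts[idx + 1])
    return frozenset(systems)
```

A scheduled job could enable every system it has credentials for, while a change-triggered job could pass the result of `systems_for_changed_files` as the `enabled` set, so only the affected providers' system tests run.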