+1 to generalizing IT. I think the tests you mentioned were developed before the general idea of what the IOIT should look like emerged. AFAIK the same goes for the tests in the io/google-cloud-platform module. I recently created some issues that address this [1], [2], [3]. If anyone is willing to take them, feel free (I can help with this).
[1] https://issues.apache.org/jira/browse/BEAM-4416
[2] https://issues.apache.org/jira/browse/BEAM-4399
[3] https://issues.apache.org/jira/browse/BEAM-4398

2018-05-30 14:00 GMT+02:00 Etienne Chauchot <echauc...@apache.org>:

> Hi Łukasz
>
> Thanks for the details.
>
> I was more thinking about generalizing IT test integration. For example,
> some IOs like Cassandra and Elasticsearch have IT but no groovy scripts.
> Also, I agree with your list.
> And thanks for the details about automatic provisioning of backend
> services, I did not know that.
>
> Etienne
>
> On Wednesday, 30 May 2018 at 11:21 +0200, Łukasz Gajowy wrote:
>
> Hi Etienne,
>
> it is already possible, provided that an appropriate Jenkins job is
> defined (see examples here: [1], [2]). Either the reviewer or the author
> can run the seed job to load job definitions (by typing "Run seed job" in
> a comment) and then run the test he/she is interested in (by specifying
> the correct phrase in the GitHub comment, e.g. "Run Java JdbcIO
> Performance Test"). The results are then available on Jenkins, so those
> are public too.
>
> Regarding the infrastructure: currently, if a test requires any
> Kubernetes infrastructure, it is set up by the PerfKitBenchmarker tool
> before the test is actually run. After the test execution, all the
> infrastructure is torn down. This is also done automatically, provided
> that all necessary Kubernetes scripts are there.
>
> Although it is possible, I must say that the whole "Performance Testing
> Framework" needs improvement in the following areas (so it should be
> considered ongoing work in progress):
> - documentation and instructions for the community (this is getting more
>   urgent!)
> - support for other runners (currently only direct and Dataflow are
>   supported, as there were some issues when we tried to integrate it with
>   Spark and Flink)
> - support for other filesystems (currently only local and HDFS are
>   supported)
> - rename and reorganize IT jobs in Jenkins (see: [3])
>
> Also, I think it's worth looking at improvements to job definitions
> (seed jobs overwrite all jobs, so this can collide with other developers'
> work). See the thread I started a while ago in [4] for further info.
>
> Best regards,
> Łukasz Gajowy
>
> [1] https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_PerformanceTests_JDBC.groovy
> [2] https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_PerformanceTests_FileBasedIO_IT.groovy
> [3] https://issues.apache.org/jira/browse/BEAM-4298
> [4] https://lists.apache.org/thread.html/b1aaea2c7eadc7ca1d1326b94a8c4c3a67befc0753897fd7fa4a3a4e@%3Cdev.beam.apache.org%3E
>
> 2018-05-30 10:14 GMT+02:00 Etienne Chauchot <echauc...@apache.org>:
>
> Hi guys
>
> As part of the CI improvement work, I would suggest enabling the
> integration tests of the IOs to be run from the GitHub PR.
>
> Indeed, when doing a review, either the reviewer or the author needs to
> run the IT. The problem is that the results are private. It would be good
> to be able to run IT using a phrase in GitHub (like the validates runner
> tests) to have the results public like any other test in the PR.
> But it would require the backend IT infrastructures (Kubernetes/Docker
> ...) to be always up, and also their credentials/location to be set in
> the related Jenkins groovy script.
>
> I opened:
> https://issues.apache.org/jira/browse/BEAM-4427
>
> Thoughts?
>
> Best
> Etienne