Indeed we discussed this but I don't see it in your mail Vincent.

On Thu, Aug 30, 2018 at 10:33 AM, Adel Atallah <adel.atal...@xwiki.com> wrote:
> Hello,
>
> Maybe we should agree on having a whole day dedicated to using these
> tools with a maximum number of developers.
> That way we will be able to help each other and maybe it will make the
> process easier to carry out in the future.
>
> WDYT?
>
> Thanks,
> Adel
>
>
> On Wed, Aug 29, 2018 at 11:20 AM, Vincent Massol <vinc...@massol.net> wrote:
>> Hi devs (and anyone else interested in improving the tests of XWiki),
>>
>> History
>> ======
>>
>> It all started when I analyzed our global TPC and found that it was going
>> down globally even though we have the fail-build-on-jacoco-threshold
>> strategy.
>>
>> I sent several email threads:
>>
>> - Loss of TPC: http://markmail.org/message/hqumkdiz7jm76ya6
>> - TPC evolution: http://markmail.org/message/up2gc2zzbbe4uqgn
>> - Improve our TPC strategy: http://markmail.org/message/grphwta63pp5p4l7
>>
>> Note: As a consequence of this last thread, I implemented a Jenkins Pipeline
>> to send us a mail when the global TPC of an XWiki module goes down so that
>> we fix it ASAP. This is still a work in progress. A first version is
>> done and running at https://ci.xwiki.org/view/Tools/job/Clover/ but I need
>> to debug it and fix it (it’s not working ATM).
>>
>> As a result of the global TPC going down/stagnating, I have proposed to have
>> 10.7 focused on Tests + BFD.
>> - Initially I proposed to focus on increasing the global TPC by looking at
>> the reports from 1) above (http://markmail.org/message/qjemnip7hjva2rjd).
>> See the last report at https://up1.xwikisas.com/#mJ0loeB6nBrAgYeKA7MGGw (we
>> need to fix the red parts).
>> - Then with the STAMP mid-term review, a bigger urgency surfaced and I asked
>> if we could instead focus on fixing tests as reported by Descartes to
>> increase both coverage and mutation score (i.e. test quality), since those are
>> 2 metrics/KPIs measured by STAMP and since XWiki participates in STAMP we
>> need to work on them and increase them substantially. See
>> http://markmail.org/message/ejmdkf3hx7drkj52
>>
>> The results of XWiki 10.7 have been quite poor on test improvements (more
>> focus on BFD than tests, lots of devs on holidays, etc). This forces us to
>> have a different strategy.
>>
>> Full Strategy proposal
>> =================
>>
>> 1) As many XWiki SAS devs as possible (and anyone else from the community
>> who’s interested ofc! :)) should spend 1 day per week working on improving
>> STAMP metrics
>> * Currently the agreement is that Thomas and myself will do this for the
>> foreseeable future till we get some good-enough metric progress
>> * Some other devs from XWiki SAS will help out for XWiki 10.8 only FTM
>> (Marius, Adel if he can, Simon in the future). The idea is to see where that
>> could get us by using substantial manpower.
>>
>> 2) All committers: More generally the global TPC failure is also already
>> active and devs need to modify modules that see their global TPC go down.
>>
>> 3) All committers: Of course, the jacoco strategy is also active at each
>> module level.
>>
>> STAMP tools
>> ==========
>>
>> There are 4 tools developed by STAMP:
>> * Descartes: Improves quality of tests by increasing their mutation scores.
>> See http://markmail.org/message/bonb5f7f37omnnog and also
>> https://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
>> * DSpot: Automatically generates new tests, based on existing tests.
>> See
>> https://massol.myxwiki.org/xwiki/bin/view/Blog/TestGenerationDspot
>> * CAMP: Takes a Dockerfile and generates mutations of it, then deploys and
>> executes tests on the software to see if the mutation works or not. Note this
>> currently does not fit the needs of XWiki and thus I’ve been developing
>> another tool as an experiment (which may go back in CAMP one day), based on
>> TestContainers, see
>> https://massol.myxwiki.org/xwiki/bin/view/Blog/EnvironmentTestingExperimentations
>> * EvoCrash: Takes a stack trace from production logs and generates a test
>> that, when executed, reproduces the crash. See
>> https://markmail.org/message/v74g3tsmflquqwra. See also
>> https://github.com/SERG-Delft/EvoCrash
>>
>> Since XWiki is part of the STAMP research project, we need to use those 4
>> tools to increase the KPIs associated with the tools. See below.
>>
>> Objectives/KPIs/Metrics for STAMP
>> ===========================
>>
>> The STAMP project has defined 9 KPIs that all partners (and thus XWiki) need
>> to work on:
>>
>> 1) K01: Increase test coverage
>> * Global increase by reducing the non-covered code by 40%. For XWiki since
>> we’re at about 70%, this means reaching about 80% before the end of STAMP
>> (i.e. before the end of 2019)
>> * Increase the coverage contributions of each tool developed by STAMP.
>>
>> Strategy:
>> * Primary goal:
>> ** Increase coverage by executing Descartes and improving our tests. This is
>> http://markmail.org/message/ejmdkf3hx7drkj52
>> ** Don’t do anything with DSpot. I’ll do that part. Note that the goal is to
>> write a Jenkins pipeline to automatically execute DSpot from time to time
>> and commit the generated tests in a separate test source and have our build
>> execute both src/test/java and this new test source.
>> ** Don’t do anything with TestContainers FTM since I need to finish a first
>> working version.
>> I may need help in the future to implement docker images
>> for more configurations (on Oracle, in a cluster, with LibreOffice, with an
>> external SOLR server, etc).
>> ** For EvoCrash: We’ll count contributions of EvoCrash to coverage in K08.
>> * Secondary goal:
>> ** Increase our global TPC as mentioned above by fixing the modules in red.
>>
>> 2) K02: Reduce flaky tests.
>> * Objective: reduce the number of flaky tests by 20%
>>
>> Strategy:
>> * Record flaky tests in jira
>> * Fix the max number of them
>>
>> 3) K03: Better test quality
>> * Objective: increase mutation score by 20%
>>
>> Strategy:
>> * Same strategy as K01.
>>
>> 4) K04: More configuration-related paths tested
>> * Objective: increase the code coverage of configuration-related paths in
>> our code by 20% (e.g. DB schema creation, cluster-related code, SOLR-related
>> code, LibreOffice-related code, etc).
>>
>> Strategy:
>> * Leave it to me FTM. The idea is to measure Clover TPC with the base
>> configuration, then execute all other configurations (with TestContainers)
>> and regenerate the Clover report to see how much the TPC has increased.
>>
>> 5) K05: Reduce system-specific bugs
>> * Objective: 30% improvement
>>
>> Strategy:
>> * Run TestContainers, execute existing tests and find new bugs related to
>> configurations. Record them.
>>
>> 6) K06: More configurations/Faster tests
>> * Objective: increase the number of automatically tested configurations by
>> 50%
>>
>> Strategy:
>> * Increase the # of configurations we test with TestContainers. I’ll do that
>> part initially.
>> * Reduce the time it takes to deploy the software under a given configuration vs
>> the time it used to take when done manually before STAMP. I’ll do this one. I’ve
>> already worked on it in the past year with the dockerization of XWiki.
>>
>> 7) K07: Pending, nothing to do FTM
>>
>> 8) K08: More crash replicating test cases
>> * Objective: increase the number of crash replicating test cases by at least
>> 70%
>>
>> Strategy:
>> * For all issues that are still open and that have stack traces and for all
>> issues closed but without tests, run EvoCrash on them to try to generate a
>> test.
>> * Record and count the number of successful EvoCrash-generated test cases.
>> * Derive a regression test (which can be very different from the negative of
>> the test generated by EvoCrash!).
>> * Measure the new coverage increase
>> * Note that I haven’t experimented much with this yet myself.
>>
>> 9) K09: Pending, nothing to do FTM.
>>
>> Conclusion
>> =========
>>
>> Right now, I need your help for the following KPIs: K01, K02, K03, K08.
>>
>> Since there’s a lot to understand in this email, I’m open to:
>> * Organizing a meeting on youtube live to discuss all this
>> * Answering any questions on this thread ofc
>> * Also feel free to ask on IRC/Matrix.
>>
>> Here’s an extract from STAMP which has more details about the KPIs/metrics:
>> https://up1.xwikisas.com/#QJyxqspKXSzuWNOHUuAaEA
>>
>> Thanks
>> -Vincent
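
For reference, the K01 arithmetic in the quoted mail (reduce the non-covered code by 40%, starting from roughly 70% coverage) can be checked with a few lines of Python. This is only a rough sketch: the function name target_coverage and its parameters are made up for illustration, and 70% is the approximate figure from the mail, not a measured value.

```python
def target_coverage(current_pct: float, uncovered_reduction: float) -> float:
    """Coverage (in %) reached once the uncovered share shrinks by the given fraction.

    Hypothetical helper for illustration only; not part of any XWiki or STAMP tooling.
    """
    uncovered = 100.0 - current_pct                      # e.g. 30% of the code is not covered
    remaining = uncovered * (1.0 - uncovered_reduction)  # share still uncovered afterwards
    return 100.0 - remaining


if __name__ == "__main__":
    # Approximate figures from the mail: ~70% coverage today, 40% less uncovered code.
    print(round(target_coverage(70.0, 0.40), 1))
```

With those inputs the result comes out at 82, which is consistent with the "about 80%" mentioned for K01.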
--
Thomas Mortagne