Indeed, we discussed this, but I don't see it in your mail, Vincent.

On Thu, Aug 30, 2018 at 10:33 AM, Adel Atallah <adel.atal...@xwiki.com> wrote:
> Hello,
>
> Maybe we should agree on dedicating a whole day to using these
> tools with as many developers as possible.
> That way we will be able to help each other and maybe it will make the
> process easier to carry out in the future.
>
> WDYT?
>
> Thanks,
> Adel
>
>
> On Wed, Aug 29, 2018 at 11:20 AM, Vincent Massol <vinc...@massol.net> wrote:
>> Hi devs (and anyone else interested in improving the tests of XWiki),
>>
>> History
>> ======
>>
>> It all started when I analyzed our global TPC and found that it was going 
>> down globally even though we have the fail-build-on-jacoco-threshold 
>> strategy.
>>
>> I sent several email threads:
>>
>> - Loss of TPC: http://markmail.org/message/hqumkdiz7jm76ya6
>> - TPC evolution: http://markmail.org/message/up2gc2zzbbe4uqgn
>> - Improve our TPC strategy: http://markmail.org/message/grphwta63pp5p4l7
>>
>> Note: As a consequence of this last thread, I implemented a Jenkins Pipeline 
>> to send us a mail when the global TPC of an XWiki module goes down so that 
>> we fix it ASAP. This is still a work in progress. A first version is 
>> done and running at https://ci.xwiki.org/view/Tools/job/Clover/ but I need 
>> to debug and fix it (it’s not working ATM).
>>
>> As a result of the global TPC going down/stagnating, I have proposed to have 
>> 10.7 focused on Tests + BFD.
>> - Initially I proposed to focus on increasing the global TPC by looking at 
>> the reports from 1) above (http://markmail.org/message/qjemnip7hjva2rjd). 
>> See the last report at https://up1.xwikisas.com/#mJ0loeB6nBrAgYeKA7MGGw (we 
>> need to fix the red parts).
>> - Then, with the STAMP mid-term review, a bigger urgency surfaced and I asked 
>> if we could instead focus on fixing tests as reported by Descartes to 
>> increase both coverage and mutation score (i.e. test quality), since those 
>> are 2 metrics/KPIs measured by STAMP, and since XWiki participates in STAMP 
>> we need to work on them and increase them substantially. See 
>> http://markmail.org/message/ejmdkf3hx7drkj52
>>
>> The results of XWiki 10.7 have been quite poor on test improvements (more 
>> focus on BFD than tests, lots of devs on holidays, etc). This forces us to 
>> have a different strategy.
>>
>> Full Strategy proposal
>> =================
>>
>> 1) As many XWiki SAS devs as possible (and anyone else from the community 
>> who’s interested ofc! :)) should spend 1 day per week working on improving 
>> STAMP metrics
>> * Currently the agreement is that Thomas and myself will do this for the 
>> foreseeable future till we get some good-enough metric progress
>> * Some other devs from XWiki SAS will help out for XWiki 10.8 only FTM 
>> (Marius, Adel if he can, Simon in the future). The idea is to see where that 
>> could get us by using substantial manpower.
>>
>> 2) All committers: More generally, the global TPC failure strategy is also 
>> already active, and devs need to fix modules whose global TPC goes down.
>>
>> 3) All committers: Of course, the jacoco strategy is also active at each 
>> module level.
>>
>> STAMP tools
>> ==========
>>
>> There are 4 tools developed by STAMP:
>> * Descartes: Improves quality of tests by increasing their mutation scores. 
>> See http://markmail.org/message/bonb5f7f37omnnog and also 
>> https://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
>> * DSpot: Automatically generates new tests, based on existing tests. See 
>> https://massol.myxwiki.org/xwiki/bin/view/Blog/TestGenerationDspot
>> * CAMP: Takes a Dockerfile and generates mutations of it, then deploys and 
>> executes tests on the software to see if the mutation works or not. Note 
>> that this currently doesn’t fit XWiki’s needs and thus I’ve been developing 
>> another tool as an experiment (which may go back in CAMP one day), based on 
>> TestContainers, see 
>> https://massol.myxwiki.org/xwiki/bin/view/Blog/EnvironmentTestingExperimentations
>> * EvoCrash: Takes a stack trace from production logs and generates a test 
>> that, when executed, reproduces the crash. See 
>> https://markmail.org/message/v74g3tsmflquqwra. See also 
>> https://github.com/SERG-Delft/EvoCrash
>>
>> Since XWiki is part of the STAMP research project, we need to use those 4 
>> tools to increase the KPIs associated with the tools. See below.
>>
>> Objectives/KPIs/Metrics for STAMP
>> ===========================
>>
>> The STAMP project has defined 9 KPIs that all partners (and thus XWiki) need 
>> to work on:
>>
>> 1) K01: Increase test coverage
>> * Global increase, by reducing the non-covered code by 40%. For XWiki, since 
>> we’re at about 70%, this means reaching about 80% before the end of STAMP 
>> (i.e. before the end of 2019)
>> * Increase the coverage contributions of each tool developed by STAMP.
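
To make the K01 arithmetic explicit: reducing the non-covered code by 40% is not the same as adding 40 points of coverage. Here is a minimal sketch, assuming the ~70% starting coverage quoted above. The function name target_coverage is only illustrative and is not part of any STAMP or XWiki tooling.

```python
def target_coverage(current_pct, uncovered_reduction_pct):
    """Coverage (in %) reached after shrinking the uncovered share by a given %."""
    uncovered = 100 - current_pct
    remaining_uncovered = uncovered * (100 - uncovered_reduction_pct) / 100
    return 100 - remaining_uncovered


if __name__ == "__main__":
    # 70% covered -> 30% uncovered -> 18% uncovered after a 40% reduction.
    print(target_coverage(70, 40))
```

With 70 as input this gives 82, which is consistent with the "about 80%" target in the mail.
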
>>
>> Strategy:
>> * Primary goal:
>> ** Increase coverage by executing Descartes and improving our tests. This is 
>> http://markmail.org/message/ejmdkf3hx7drkj52
>> ** Don’t do anything with DSpot. I’ll do that part. Note that the goal is to 
>> write a Jenkins pipeline to automatically execute DSpot from time to time 
>> and commit the generated tests in a separate test source and have our build 
>> execute both src/test/java and this new test source.
>> ** Don’t do anything with TestContainers FTM since I need to finish a first 
>> working version. I may need help in the future to implement docker images 
>> for more configurations (on Oracle, in a cluster, with LibreOffice, with an 
>> external SOLR server, etc).
>> ** For EvoCrash: We’ll count contributions of EvoCrash to coverage in K08.
>> * Secondary goal:
>> ** Increase our global TPC as mentioned above by fixing the modules in red.
>>
>> 2) K02: Reduce flaky tests.
>> * Objective: reduce the number of flaky tests by 20%
>>
>> Strategy:
>> * Record flaky tests in jira
>> * Fix as many of them as possible
>>
>> 3) K03: Better test quality
>> * Objective: increase mutation score by 20%
>>
>> Strategy:
>> * Same strategy as K01.
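
K01 and K03 can share a strategy because a test may cover code without verifying anything about it. Below is a toy sketch of that idea, written in Python for brevity. It is not Descartes itself, and every name in it is hypothetical. The mutant imitates the kind of "extreme" mutation Descartes applies, where a whole method body is replaced by a default return value.

```python
def add_tax(price_cents):
    """Code under test: adds 20% tax to a price given in cents."""
    return price_cents + price_cents // 5


def mutant_add_tax(price_cents):
    """Extreme mutant: the whole body is replaced by a constant return."""
    return 0


def weak_test(fn):
    # Only calls the code: it contributes to coverage but verifies nothing.
    fn(1000)
    return True


def strong_test(fn):
    # Checks the returned value, so it fails on the mutant.
    return fn(1000) == 1200


def mutation_score(tests, mutants):
    """Share (in %) of mutants detected ("killed") by at least one test."""
    killed = sum(1 for m in mutants if any(not t(m) for t in tests))
    return 100 * killed / len(mutants)
```

In this sketch weak_test alone executes every line of add_tax yet kills no mutant, while adding strong_test kills it. That is the gap a mutation report is meant to point at.
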
>>
>> 4) K04: More configuration-related paths tested
>> * Objective: increase the code coverage of configuration-related paths in 
>> our code by 20% (e.g. DB schema creation, cluster-related code, 
>> SOLR-related code, LibreOffice-related code, etc.).
>>
>> Strategy:
>> * Leave it to me FTM. The idea is to measure Clover TPC with the base 
>> configuration, then execute all other configurations (with TestContainers) 
>> and regenerate the Clover report to see how much the TPC has increased.
>>
>> 5) K05: Reduce system-specific bugs
>> * Objective: 30% improvement
>>
>> Strategy:
>> * Run TestContainers, execute existing tests and find new bugs related to 
>> configurations. Record them
>>
>> 6) K06: More configurations/Faster tests
>> * Objective: increase the number of automatically tested configurations by 
>> 50%
>>
>> Strategy:
>> * Increase the # of configurations we test with TestContainers. I’ll do that 
>> part initially.
>> * Reduce the time it takes to deploy the software under a given 
>> configuration vs the time it used to take when done manually before STAMP. 
>> I’ll do this one. I’ve already worked on it in the past year with the 
>> dockerization of XWiki.
>>
>> 7) K07: Pending, nothing to do FTM
>>
>> 8) K08: More crash replicating test cases
>> * Objective: increase the number of crash replicating test cases by at least 
>> 70%
>>
>> Strategy:
>> * For all issues that are still open and that have stack traces and for all 
>> issues closed but without tests, run EvoCrash on them to try to generate a 
>> test.
>> * Record and count the number of successful EvoCrash-generated test cases.
>> * Derive a regression test (which can be very different from the negative of 
>> the test generated by EvoCrash!).
>> * Measure the new coverage increase
>> * Note that I haven’t experimented much with this yet myself.
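
To illustrate why the regression test can differ from the negative of the crash-reproducing test, here is a toy sketch. It does not use EvoCrash or its output format, and all function names and the "Untitled" fallback are hypothetical.

```python
def page_title(doc):
    """Buggy version: crashes when the document has no title."""
    return doc["title"].strip()


def page_title_fixed(doc):
    """Fixed version: falls back to a default title."""
    title = doc.get("title")
    return title.strip() if title else "Untitled"


def crash_reproducing_test(fn):
    # What a crash-replication tool aims for: trigger the same exception
    # as the one found in the stack trace.
    try:
        fn({"title": None})
    except AttributeError:
        return True
    return False


def regression_test(fn):
    # Written by a developer after the fix: asserts the intended behavior,
    # not merely the absence of an exception.
    return (fn({"title": None}) == "Untitled"
            and fn({"title": " Home "}) == "Home")
```

The crash-reproducing test only says "this input raises". Its negative would only say "this input no longer raises", whereas the regression test pins down what the code should return instead.
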
>>
>> 9) K09: Pending, nothing to do FTM.
>>
>> Conclusion
>> =========
>>
>> Right now, I need your help for the following KPIs: K01, K02, K03, K08.
>>
>> Since there’s a lot to understand in this email, I’m open to:
>> * Organizing a meeting on YouTube Live to discuss all this
>> * Answering any questions on this thread ofc
>> * Also feel free to ask on IRC/Matrix.
>>
>> Here’s an extract from STAMP which has more details about the KPIs/metrics:
>> https://up1.xwikisas.com/#QJyxqspKXSzuWNOHUuAaEA
>>
>> Thanks
>> -Vincent
>>



-- 
Thomas Mortagne
