> On 27 Mar 2018, at 19:32, Vincent Massol <vinc...@massol.net> wrote:
> 
> FYI I’ve implemented it locally for all modules of xwiki-commons and did some 
> build time measurements:
> 
> * With pitest/descartes: 37:16 minutes
> * Without pitest/descartes: 5:10 minutes

Actually I was able to reduce the time to 15:12 minutes by configuring pitest 
to use 4 threads.
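
For reference, here is a minimal sketch of what that could look like in the 
pitest-maven plugin configuration. It is not the exact xwiki-commons POM: the 
plugin version is omitted, and it assumes the Descartes engine has been added 
to the plugin's dependencies as described in the Descartes README.

```xml
<plugin>
  <groupId>org.pitest</groupId>
  <artifactId>pitest-maven</artifactId>
  <configuration>
    <!-- Use the Descartes engine instead of pitest's default one.
         Assumes Descartes is declared as a dependency of this plugin. -->
    <mutationEngine>descartes</mutationEngine>
    <!-- Number of threads pitest uses to run the mutation analysis
         (pitest defaults to 1) -->
    <threads>4</threads>
  </configuration>
</plugin>
```

The best thread count probably depends on the machine running the build, so 
4 is only the value I measured with, not a recommendation.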

Thanks
-Vincent

> 
> So that’s a pretty significant hit...
> 
> So I think one strategy could be to not run pitest/descartes by default in 
> the quality profile (i.e. have it off by default with 
> <xwiki.pitest.skip>true</xwiki.pitest.skip>) and run it on the CI, from time 
> to time, like once per day for example, or once per week.
> 
> Small issue: I need to find/test a way to run a crontab type of job in a 
> Jenkins pipeline script. I know how to do it in theory but I need to test it 
> and verify it works. I still have some doubts ATM...
> 
> WDYT?
> 
> Thanks
> -Vincent
> 
>> On 15 Mar 2018, at 09:30, Vincent Massol <vinc...@massol.net> wrote:
>> 
>> Hi devs,
>> 
>> As part of the STAMP research project, we’ve developed a new tool 
>> (Descartes, based on Pitest) to measure the quality of tests. It generates a 
>> mutation score for your tests, indicating how good the tests are. 
>> Technically, Descartes performs some extreme mutations on the code under 
>> test (e.g. 
>> remove content of void methods, return true for methods returning a boolean, 
>> etc - See https://github.com/STAMP-project/pitest-descartes). If the test 
>> continues to pass then it means it’s not killing the mutant and thus its 
>> mutation score decreases.
>> 
>> So in short:
>> * Jacoco/Clover: measure how much of the code is tested
>> * Pitest/Descartes: measure how good the tests are
>> 
>> Both provide a percentage value.
>> 
>> I’m proposing to compute the current mutation scores for xwiki-commons and 
>> xwiki-rendering, use them as thresholds, and fail the build when new code 
>> is added that reduces the mutation score below its threshold (exactly the 
>> same as our jacoco threshold and strategy).
>> 
>> I consider this an experiment to push the limits of software engineering a 
>> bit further. I don’t know how well it’ll work. I propose to do the work and 
>> test this for 2-3 months. At that time we can then decide whether it works 
>> or not (i.e. whether the gains it brings are more important than the 
>> problems it causes).
>> 
>> Here’s my +1 to try this out.
>> 
>> Some links:
>> * pitest: http://pitest.org/
>> * descartes: https://github.com/STAMP-project/pitest-descartes
>> * http://massol.myxwiki.org/xwiki/bin/view/Blog/ControllingTestQuality
>> * http://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
>> 
>> If you’re curious, you can see a screenshot of a mutation score report at 
>> http://massol.myxwiki.org/xwiki/bin/download/Blog/MutationTestingDescartes/report.png
>> 
>> Please cast your votes.
>> 
>> Thanks
>> -Vincent
> 
