Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-10-17 Thread Vincent Massol



> On 17 Oct 2018, at 15:54, Vincent Massol  wrote:
> 
> Hi,
> 
>> On 17 Oct 2018, at 11:20, Vincent Massol  wrote:
>> 
>> Hi,
>> 
>> [snip]
>> 
>>> Process to run DSpot:
>>> 1) Pick a module. Measure coverage and mutation score (or take the value 
>>> there already if they’re in the pom.xml). Same as for Descartes testing.
>>> 2) Run DSpot on the module, see 
>>> https://massol.myxwiki.org/xwiki/bin/view/Blog/TestGenerationDspot for 
>>> explanations
>> 
>> One important detail that I had missed: we need to run DSpot with 
>> “--descartes” on the command line so that it uses Descartes to compute the 
>> mutation score and only keeps tests that increase the mutation score as 
>> reported by Descartes.
> 
> So actually, after speaking with Benjamin, I’ve realized a few things:
> 
> * By default DSpot runs with the PIT selector (PitMutantScoreSelector), which 
> is configured to use the default PIT mutations. This is why we need to run 
> with the PIT selector but configured to use the Descartes mutations, and this 
> is done by specifying --descartes.
> * This will optimize the generation of new tests for increased mutation 
> score. Right now we got 0% all the time on our tests (see 
> https://docs.google.com/spreadsheets/d/1LULpGpsJirmFyvHNstLGv-Gv5DVBdpLTM2hm0jgCKUw/edit#gid=2061481816)
> and it’s because we didn’t use --descartes. We need to try again or run on 
> new modules with --descartes and see what it gives us. It’s possible it’ll 
> generate even fewer tests…
> * For the coverage part, there are 2 other selectors that can be used with 
> DSpot to generate tests that increase the coverage:
> ** "--test-criterion JacocoCoverageSelector": uses jacoco and keeps tests that 
> increase the instruction coverage
> ** "--test-criterion CloverCoverageSelector": uses openclover and keeps tests 
> that increase the branch coverage
> 
> So we need to test with the various selectors and see what we get. 

I’ve retested on xwiki-commons-component-default:
1) With --descartes: failure, see 
https://github.com/STAMP-project/dspot/issues/584
2) With jacoco selector: failure, see 
https://github.com/STAMP-project/dspot/issues/586. I manually fixed the 
tests and removed those that didn’t pass. I got only a +0.18% jacoco coverage 
increase and a -2% descartes mutation score… That’s the problem: we would need a 
selector that optimizes for both. I’ve created 
https://github.com/STAMP-project/dspot/issues/587
3) With clover selector: no tests generated! Opened 
https://github.com/STAMP-project/dspot/issues/588

So my recommendation is to wait for 
https://github.com/STAMP-project/dspot/issues/584 to be fixed and then to use 
--descartes for our measures FTM.
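
To make the "optimizes for both" idea concrete, here is a minimal sketch of such a selection rule. It is written in Python purely for illustration and is not DSpot code: the names Metrics and keeps_both are hypothetical, and the baseline values are made up (only the +0.18 and -2 deltas come from the jacoco run above).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Coverage and mutation score of a module, both in percent."""
    coverage: float
    mutation_score: float


def keeps_both(before: Metrics, after: Metrics) -> bool:
    """Accept generated tests only if no metric regresses and at least one improves."""
    no_regression = (after.coverage >= before.coverage
                     and after.mutation_score >= before.mutation_score)
    some_gain = (after.coverage > before.coverage
                 or after.mutation_score > before.mutation_score)
    return no_regression and some_gain


# Hypothetical baseline; only the deltas (+0.18 coverage, -2 mutation score)
# are taken from the jacoco-selector run described above.
baseline = Metrics(coverage=70.0, mutation_score=60.0)
jacoco_run = Metrics(coverage=70.18, mutation_score=58.0)
print(keeps_both(baseline, jacoco_run))  # False
```

With such a rule the jacoco run would be rejected, since the coverage gain comes with a lower mutation score.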

Thanks
-Vincent

PS: Command lines used for reference:

- java -jar /Users/vmassol/dev/dspot/dspot/target/dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar \
    --path-to-properties dspot.properties --descartes --verbose \
    --generate-new-test-class --with-comment
- java -jar /Users/vmassol/dev/dspot/dspot/target/dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar \
    --path-to-properties dspot.properties --test-criterion JacocoCoverageSelector \
    --verbose --generate-new-test-class --with-comment
- java -jar /Users/vmassol/dev/dspot/dspot/target/dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar \
    --path-to-properties dspot.properties --test-criterion CloverCoverageSelector \
    --verbose --generate-new-test-class --with-comment
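
The three invocations differ only in the selector option. As a sketch, the argument lists could be built with a small helper like the one below. This is a hypothetical Python wrapper, not part of DSpot; dspot_command and the jar placeholder are assumptions, and only the options shown above are used.

```python
from typing import List, Optional

# Options shared by the three runs above.
COMMON_OPTIONS = ["--verbose", "--generate-new-test-class", "--with-comment"]


def dspot_command(jar: str, selector: Optional[str] = None,
                  descartes: bool = False) -> List[str]:
    """Build the argument list for one DSpot run.

    `selector` is passed to --test-criterion; `descartes` adds --descartes.
    """
    cmd = ["java", "-jar", jar, "--path-to-properties", "dspot.properties"]
    if descartes:
        cmd.append("--descartes")
    if selector is not None:
        cmd += ["--test-criterion", selector]
    return cmd + COMMON_OPTIONS


jar = "dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar"  # placeholder path
for cmd in (dspot_command(jar, descartes=True),
            dspot_command(jar, selector="JacocoCoverageSelector"),
            dspot_command(jar, selector="CloverCoverageSelector")):
    print(" ".join(cmd))
```

Each list can then be handed to subprocess.run(cmd, check=True) from the module directory that contains dspot.properties.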


> 
> If we want to get the best values, we should use --descartes for K03 and 
> either jacoco or clover selector for K01. Now we need to see what tests we 
> get.
> 
> Thanks
> -Vincent
> 
>> 
>>> 3) If DSpot has generated tests, add them to XWiki’s source code in 
>>> src/test/dspot and add the following to the pom of that module:
>>> 
>>> <build>
>>>   <plugins>
>>>     <plugin>
>>>       <groupId>org.codehaus.mojo</groupId>
>>>       <artifactId>build-helper-maven-plugin</artifactId>
>>>     </plugin>
>>>   </plugins>
>>> </build>
>>> 
>>> Example: 
>>> https://github.com/xwiki/xwiki-commons/tree/244ee07976c691c335b7f54c48e6308004ba3d82/xwiki-commons-core/xwiki-commons-crypto/xwiki-commons-crypto-cipher
>>> 
>>> Note: The generated tests sometimes need to be modified a bit to pass. 
>>> Personally I’ve only committed tests that were passing and I reported 
>>> issues for those that were not passing.
>>> 
>>> 4) File the various reports:
>>> a) https://github.com/STAMP-project/dspot-usecases-output/tree/master/xwiki 
>>> both for success and failures
>>> b) 
>>> https://docs.google.com/spreadsheets/d/1LULpGpsJirmFyvHNstLGv-Gv5DVBdpLTM2hm0jgCKUw/edit#gid=2061481816
>>> c) for failures, file a github issue at 
>>> https://github.com/STAMP-project/dspot/issues and link to the place on 
>>> https://github.com/STAMP-project/dspot-usecases-output/tree/master/xwiki 
>>> where we put the failing result.
>>> 
>>> Note: The reason we need to report failures too is because DSpot fails a 
>>> lot so we need to show what we have tested
>>> 
>>> Thanks
>>> -Vincent
>>> 
>> 
>> [snip]
>> 
>> Thanks
>> -Vincent



Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-10-17 Thread Vincent Massol
Hi,

> On 17 Oct 2018, at 11:20, Vincent Massol  wrote:
> 
> Hi,
> 
> [snip]
> 
>> Process to run DSpot:
>> 1) Pick a module. Measure coverage and mutation score (or take the value 
>> there already if they’re in the pom.xml). Same as for Descartes testing.
>> 2) Run DSpot on the module, see 
>> https://massol.myxwiki.org/xwiki/bin/view/Blog/TestGenerationDspot for 
>> explanations
> 
> One important detail that I had missed: we need to run DSpot with 
> “--descartes” on the command line so that it uses Descartes to compute the 
> mutation score and only keeps tests that increase the mutation score as 
> reported by Descartes.

So actually, after speaking with Benjamin, I’ve realized a few things:

* By default DSpot runs with the PIT selector (PitMutantScoreSelector), which is 
configured to use the default PIT mutations. This is why we need to run with 
the PIT selector but configured to use the Descartes mutations, and this is done 
by specifying --descartes.
* This will optimize the generation of new tests for increased mutation 
score. Right now we got 0% all the time on our tests (see 
https://docs.google.com/spreadsheets/d/1LULpGpsJirmFyvHNstLGv-Gv5DVBdpLTM2hm0jgCKUw/edit#gid=2061481816)
and it’s because we didn’t use --descartes. We need to try again or run on new 
modules with --descartes and see what it gives us. It’s possible it’ll generate 
even fewer tests…
* For the coverage part, there are 2 other selectors that can be used with 
DSpot to generate tests that increase the coverage:
** "--test-criterion JacocoCoverageSelector": uses jacoco and keeps tests that 
increase the instruction coverage
** "--test-criterion CloverCoverageSelector": uses openclover and keeps tests 
that increase the branch coverage

So we need to test with the various selectors and see what we get. 

If we want to get the best values, we should use --descartes for K03 and either 
jacoco or clover selector for K01. Now we need to see what tests we get.

Thanks
-Vincent

> 
>> 3) If DSpot has generated tests, add them to XWiki’s source code in 
>> src/test/dspot and add the following to the pom of that module:
>> 
>> <build>
>>   <plugins>
>>     <plugin>
>>       <groupId>org.codehaus.mojo</groupId>
>>       <artifactId>build-helper-maven-plugin</artifactId>
>>     </plugin>
>>   </plugins>
>> </build>
>> 
>> Example: 
>> https://github.com/xwiki/xwiki-commons/tree/244ee07976c691c335b7f54c48e6308004ba3d82/xwiki-commons-core/xwiki-commons-crypto/xwiki-commons-crypto-cipher
>> 
>> Note: The generated tests sometimes need to be modified a bit to pass. 
>> Personally I’ve only committed tests that were passing and I reported issues 
>> for those that were not passing.
>> 
>> 4) File the various reports:
>> a) https://github.com/STAMP-project/dspot-usecases-output/tree/master/xwiki 
>> both for success and failures
>> b) 
>> https://docs.google.com/spreadsheets/d/1LULpGpsJirmFyvHNstLGv-Gv5DVBdpLTM2hm0jgCKUw/edit#gid=2061481816
>> c) for failures, file a github issue at 
>> https://github.com/STAMP-project/dspot/issues and link to the place on 
>> https://github.com/STAMP-project/dspot-usecases-output/tree/master/xwiki 
>> where we put the failing result.
>> 
>> Note: The reason we need to report failures too is because DSpot fails a lot 
>> so we need to show what we have tested
>> 
>> Thanks
>> -Vincent
>> 
> 
> [snip]
> 
> Thanks
> -Vincent
> 
> 



Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-10-17 Thread Vincent Massol
Hi,

[snip]

> Process to run DSpot:
> 1) Pick a module. Measure coverage and mutation score (or take the value 
> there already if they’re in the pom.xml). Same as for Descartes testing.
> 2) Run DSpot on the module, see 
> https://massol.myxwiki.org/xwiki/bin/view/Blog/TestGenerationDspot for 
> explanations

One important detail that I had missed: we need to run DSpot with “--descartes” 
on the command line so that it uses Descartes to compute the mutation score 
and only keeps tests that increase the mutation score as reported by 
Descartes.

> 3) If DSpot has generated tests, add them to XWiki’s source code in 
> src/test/dspot and add the following to the pom of that module:
> 
> <build>
>   <plugins>
>     <plugin>
>       <groupId>org.codehaus.mojo</groupId>
>       <artifactId>build-helper-maven-plugin</artifactId>
>     </plugin>
>   </plugins>
> </build>
> 
> Example: 
> https://github.com/xwiki/xwiki-commons/tree/244ee07976c691c335b7f54c48e6308004ba3d82/xwiki-commons-core/xwiki-commons-crypto/xwiki-commons-crypto-cipher
> 
> Note: The generated tests sometimes need to be modified a bit to pass. 
> Personally I’ve only committed tests that were passing and I reported issues 
> for those that were not passing.
> 
> 4) File the various reports:
> a) https://github.com/STAMP-project/dspot-usecases-output/tree/master/xwiki 
> both for success and failures
> b) 
> https://docs.google.com/spreadsheets/d/1LULpGpsJirmFyvHNstLGv-Gv5DVBdpLTM2hm0jgCKUw/edit#gid=2061481816
> c) for failures, file a github issue at 
> https://github.com/STAMP-project/dspot/issues and link to the place on 
> https://github.com/STAMP-project/dspot-usecases-output/tree/master/xwiki 
> where we put the failing result.
> 
> Note: The reason we need to report failures too is because DSpot fails a lot 
> so we need to show what we have tested
> 
> Thanks
> -Vincent
> 

[snip]

Thanks
-Vincent




Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-10-16 Thread Vincent Massol
Hi there,

We need some more DSpot results. Would be great if you could help out.

See below for instructions.

> On 29 Aug 2018, at 11:20, Vincent Massol  wrote:
> 
> Hi devs (and anyone else interested to improve the tests of XWiki),
> 
> History
> =======
> 
> It all started when I analyzed our global TPC and found that it was going 
> down globally even though we have the fail-build-on-jacoco-threshold strategy.
> 
> I sent several email threads:
> 
> - Loss of TPC: http://markmail.org/message/hqumkdiz7jm76ya6
> - TPC evolution: http://markmail.org/message/up2gc2zzbbe4uqgn
> - Improve our TPC strategy: http://markmail.org/message/grphwta63pp5p4l7
> 
> Note: As a consequence of this last thread, I implemented a Jenkins Pipeline 
> to send us a mail when the global TPC of an XWiki module goes down so that we 
> fix it ASAP. This is still a development in progress. A first version is done 
> and running at https://ci.xwiki.org/view/Tools/job/Clover/ but I need to 
> debug it and fix it (it’s not working ATM).
> 
> As a result of the global TPC going down/stagnating, I have proposed to have 
> 10.7 focused on Tests + BFD.
> - Initially I proposed to focus on increasing the global TPC by looking at 
> the reports from 1) above (http://markmail.org/message/qjemnip7hjva2rjd). See 
> the last report at https://up1.xwikisas.com/#mJ0loeB6nBrAgYeKA7MGGw (we need 
> to fix the red parts).
> - Then with the STAMP mid-term review, a bigger urgency surfaced and I asked 
> if we could instead focus on fixing tests as reported by Descartes to 
> increase both coverage and mutation score (i.e. test quality), since those are 
> 2 metrics/KPIs measured by STAMP and since XWiki participates in STAMP we 
> need to work on them and increase them substantially. See 
> http://markmail.org/message/ejmdkf3hx7drkj52
> 
> The results of XWiki 10.7 have been quite poor on test improvements (more 
> focus on BFD than tests, lots of devs on holidays, etc). This forces us to 
> have a different strategy.
> 
> Full Strategy proposal
> ======================
> 
> 1) As many XWiki SAS devs as possible (and anyone else from the community 
> who’s interested ofc! :)) should spend 1 day per week working on improving 
> STAMP metrics
> * Currently the agreement is that Thomas and myself will do this for the 
> foreseeable future till we get some good-enough metric progress
> * Some other devs from XWiki SAS will help out for XWiki 10.8 only FTM 
> (Marius, Adel if he can, Simon in the future). The idea is to see where that 
> could get us by using substantial manpower.
> 
> 2) All committers: More generally the global TPC failure is also already 
> active and devs need to modify modules that see their global TPC go down.
> 
> 3) All committers: Of course, the jacoco strategy is also active at each 
> module level.
> 
> STAMP tools
> ===========
> 
> There are 4 tools developed by STAMP:
> * Descartes: Improves quality of tests by increasing their mutation scores. 
> See http://markmail.org/message/bonb5f7f37omnnog and also 
> https://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
> * DSpot: Automatically generate new tests, based on existing tests. See 
> https://massol.myxwiki.org/xwiki/bin/view/Blog/TestGenerationDspot

Process to run DSpot:
1) Pick a module. Measure coverage and mutation score (or take the value there 
already if they’re in the pom.xml). Same as for Descartes testing.
2) Run DSpot on the module, see 
https://massol.myxwiki.org/xwiki/bin/view/Blog/TestGenerationDspot for 
explanations
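3) If DSpot has generated tests, add them to XWiki’s source code in 
src/test/dspot and add the following to the pom of that module:

<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>build-helper-maven-plugin</artifactId>
    </plugin>
  </plugins>
</build>
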
Example: 
https://github.com/xwiki/xwiki-commons/tree/244ee07976c691c335b7f54c48e6308004ba3d82/xwiki-commons-core/xwiki-commons-crypto/xwiki-commons-crypto-cipher

Note: The generated tests sometimes need to be modified a bit to pass. 
Personally I’ve only committed tests that were passing and I reported issues 
for those that were not passing.

4) File the various reports:
a) https://github.com/STAMP-project/dspot-usecases-output/tree/master/xwiki 
both for success and failures
b) 
https://docs.google.com/spreadsheets/d/1LULpGpsJirmFyvHNstLGv-Gv5DVBdpLTM2hm0jgCKUw/edit#gid=2061481816
c) for failures, file a github issue at 
https://github.com/STAMP-project/dspot/issues and link to the place on 
https://github.com/STAMP-project/dspot-usecases-output/tree/master/xwiki where 
we put the failing result.

Note: The reason we need to report failures too is because DSpot fails a lot so 
we need to show what we have tested

Thanks
-Vincent

> * CAMP: Takes a Dockerfile and generates mutations of it, then deploys and 
> execute tests on the software to see if the mutation works or not. Note this 
> is currently not fitting the need of XWiki and thus I’ve been developing 
> another tool as an experiment (which may go back in CAMP one 

Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-09-04 Thread Vincent Massol



> On 29 Aug 2018, at 11:20, Vincent Massol  wrote:

[snip]

> Objectives/KPIs/Metrics for STAMP
> =================================
> 
> The STAMP project has defined 9 KPIs that all partners (and thus XWiki) need 
> to work on:
> 
> 1) K01: Increase test coverage
> * Global increase by reducing by 40% the non-covered code. For XWiki since 
> we’re at about 70%, this means reaching about 80% before the end of STAMP 
> (ie. before end of 2019)
> * Increase the coverage contributions of each tool developed by STAMP.
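
As a quick check of the arithmetic behind the first bullet, here is a small Python sketch (the function name target_coverage is made up for illustration). At 70% coverage, 30% of the code is not covered; cutting that by 40% leaves 18% not covered, i.e. 82% coverage, which is the "about 80%" mentioned above.

```python
def target_coverage(current: float, reduction: float) -> float:
    """Coverage (in percent) after shrinking the non-covered share by `reduction`."""
    uncovered = 100.0 - current
    return 100.0 - uncovered * (1.0 - reduction)


print(round(target_coverage(70.0, 0.40), 2))  # 82.0
```
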
> 
> Strategy:
> * Primary goal: 
> ** Increase coverage by executing Descartes and improving our tests. This is 
> http://markmail.org/message/ejmdkf3hx7drkj52
> ** Don’t do anything with DSpot. I’ll do that part. Note that the goal is to 
> write a Jenkins pipeline to automatically execute DSpot from time to time and 
> commit the generated tests in a separate test source and have our build 
> execute both src/test/java and this new test source.

Contrary to what was proposed initially, it would be nice to run DSpot too. 

FTR a good command line to use for DSpot is:
java -jar /dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar 
--path-to-properties dspot.properties --verbose --generate-new-test-class 
--with-comment

The --generate-new-test-class option tells DSpot to write only the newly added 
tests to its output dir, without the existing tests.
The --with-comment option tells DSpot to keep the comments, and thus the license 
header too.

I did a session today and committed the results in 
https://github.com/STAMP-project/dspot-usecases-output/commit/113726c0aac3af3df30334d14115d89227eaebdc

What I did:
* For each module tested with DSpot, created a folder in 
https://github.com/STAMP-project/dspot-usecases-output/tree/master/xwiki
* For cases where DSpot could generate some tests, committed them and modified 
the pom.xml so that they are executed
* Note: tests need to have their license headers adjusted so that they don’t 
fail the build
* Computed coverage + mutation scores before and after and reported them in the 
README.md in each folder

Thanks
-Vincent

> ** Don’t do anything with TestContainers FTM since I need to finish a first 
> working version. I may need help in the future to implement docker images for 
> more configurations (on Oracle, in a cluster, with LibreOffice, with an 
> external SOLR server, etc).
> ** For EvoCrash: We’ll count contributions of EvoCrash to coverage in K08.
> * Secondary goal:
> ** Increase our global TPC as mentioned above by fixing the modules in red.
> 
> 2) K02: Reduce flaky tests.
> * Objective: reduce the number of flaky tests by 20%
> 
> Strategy:
> * Record flaky tests in jira
> * Fix the max number of them
> 
> 3) K03: Better test quality
> * Objective: increase mutation score by 20%
> 
> Strategy:
> * Same strategy as K01.

[snip]

Thanks
-Vincent



Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-09-04 Thread Vincent Massol
So we had a conf call this morning and we agreed to have TFD (Test Fixing Day) 
on Tuesdays for the XWiki 10.8 timeframe. Those who cannot attend on Tuesday 
will work on the tests during the other days to catch up.

This means starting today! :)

Thanks
-Vincent

> On 30 Aug 2018, at 12:27, Adel Atallah  wrote:
> 
> Just to be clear, when I proposed "having a whole day dedicated on
> using these tools", I didn't mean having it every week but
> only once, so we can properly start improving the tests. It would be
> some kind of training.
> On my side I don't think I'll be able to have one day a week
> dedicated to tests and one for bug fixing; I won't have time left for
> the roadmap as I will only work on the product 50% of the time.
> 
> 
> On Thu, Aug 30, 2018 at 12:18 PM, Vincent Massol  wrote:
>> Hi,
>> 
>> I don’t remember discussing this with you Thomas. Actually I’m not convinced 
>> to have a fixed day:
>> * we already have a fixed BFD and having a second one doesn’t leave much 
>> flexibility for working on roadmap items when it’s the best
>> * test sessions can be short (0.5-1 hours) and it’s easy to do them between 
>> other tasks
>> * it can be boring to spend a full day on them
>> 
>> Now, I agree that not having a fixed day will make it hard to make sure that 
>> we work 20% on that topic.
>> 
>> So if you prefer we can define a day, knowing that some won’t be able to 
>> always attend during that day and in this case they should do it on another 
>> day. What’s important is to have 20% done each week (i.e. enough work done 
>> on it).
>> 
>> In terms of the day, if we have to choose one, I’d say Tuesday. That’s the most 
>> logical to me.
>> 
>> WDYT? What do you prefer?
>> 
>> Thanks
>> -Vincent
>> 
>>> On 30 Aug 2018, at 10:38, Thomas Mortagne  wrote:
>>> 
>>> Indeed we discussed this but I don't see it in your mail Vincent.
>>> 
>>> On Thu, Aug 30, 2018 at 10:33 AM, Adel Atallah  
>>> wrote:
>>>> Hello,
>>>> 
>>>> Maybe we should agree on having a whole day dedicated on using these
>>>> tools with a maximum number of developers.
>>>> That way we will be able to help each other and maybe it will make the
>>>> process easier to carry out in the future.
>>>> 
>>>> WDYT?
>>>> 
>>>> Thanks,
>>>> Adel
>>>> 
>>>> 

Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-09-03 Thread Simon Urli

OK for me too.

Simon

On 9/3/18 10:31 AM, Thomas Mortagne wrote:
> Sounds good.
>
> On Mon, Sep 3, 2018 at 9:55 AM, Vincent Massol  wrote:
>>
>>> On 3 Sep 2018, at 09:55, Vincent Massol  wrote:
>>>
>>> I propose to do this tomorrow Tuesday, starting with an intro from me, using 
>>> youtube live.
>>
>> Say, 10AM Paris time.
>>
>> Thanks
>> -Vincent
>>
>>> WDYT?
>>>
>>> Thanks
>>> -Vincent



Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-09-03 Thread Thomas Mortagne
Sounds good.

On Mon, Sep 3, 2018 at 9:55 AM, Vincent Massol  wrote:
>
>> On 3 Sep 2018, at 09:55, Vincent Massol  wrote:
>>
>> I propose to do this tomorrow Tuesday, starting with an intro from me, using 
>> youtube live.
>
> Say, 10AM Paris time.
>
> Thanks
> -Vincent
>
>> WDYT?
>>
>> Thanks
>> -Vincent
>>
>>> On 30 Aug 2018, at 12:27, Adel Atallah  wrote:
>>>
>>> Just to be clear, when I proposed "having a whole day dedicated on
>>> using these tools", I didn't mean having it every week but
>>> only once, so we can properly start improving the tests. It would be
>>> some kind of training.
>>> On my side I don't think I'll be able to have one day a week
>>> dedicated to tests and one for bug fixing; I won't have time left for
>>> the roadmap as I will only work on the product 50% of the time.
>>>
>>>
>>> On Thu, Aug 30, 2018 at 12:18 PM, Vincent Massol  wrote:
>>>> Hi,
>>>>
>>>> I don’t remember discussing this with you Thomas. Actually I’m not 
>>>> convinced to have a fixed day:
>>>> * we already have a fixed BFD and having a second one doesn’t leave much 
>>>> flexibility for working on roadmap items when it’s the best
>>>> * test sessions can be short (0.5-1 hours) and it’s easy to do them 
>>>> between other tasks
>>>> * it can be boring to spend a full day on them
>>>>
>>>> Now, I agree that not having a fixed day will make it hard to make sure 
>>>> that we work 20% on that topic.
>>>>
>>>> So if you prefer we can define a day, knowing that some won’t be able to 
>>>> always attend during that day and in this case they should do it on 
>>>> another day. What’s important is to have 20% done each week (i.e. enough 
>>>> work done on it).
>>>>
>>>> In terms of the day, if we have to choose one, I’d say Tuesday. That’s the most 
>>>> logical to me.
>>>>
>>>> WDYT? What do you prefer?
>>>>
>>>> Thanks
>>>> -Vincent


Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-09-03 Thread Adel Atallah
+1


On Mon, Sep 3, 2018 at 9:55 AM, Vincent Massol  wrote:
>
>> On 3 Sep 2018, at 09:55, Vincent Massol  wrote:
>>
>> I propose to do this tomorrow (Tuesday), starting with an intro from me, using 
>> YouTube Live.
>
> Say, 10AM Paris time.
>
> Thanks
> -Vincent
>
>> WDYT?
>>
>> Thanks
>> -Vincent

Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-09-03 Thread Vincent Massol


> On 3 Sep 2018, at 09:55, Vincent Massol  wrote:
> 
> I propose to do this tomorrow (Tuesday), starting with an intro from me, using 
> YouTube Live.

Say, 10AM Paris time.

Thanks
-Vincent

> WDYT?
> 
> Thanks
> -Vincent

Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-09-03 Thread Vincent Massol
I propose to do this tomorrow (Tuesday), starting with an intro from me, using 
YouTube Live.

WDYT?

Thanks
-Vincent

> On 30 Aug 2018, at 12:27, Adel Atallah  wrote:
> 
> Just to be clear, when I proposed "having a whole day dedicated to
> using these tools", I didn't mean having it every week but
> only once, so we can properly start improving the tests. It would be
> some kind of training.
> On my side, I don't think I'll be able to dedicate one day a week
> to tests and another to bug fixing: I won't have time left for
> the roadmap, as I will only work on the product 50% of the time.

Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-08-30 Thread Adel Atallah
Just to be clear, when I proposed "having a whole day dedicated to
using these tools", I didn't mean having it every week but
only once, so we can properly start improving the tests. It would be
some kind of training.
On my side, I don't think I'll be able to dedicate one day a week
to tests and another to bug fixing: I won't have time left for
the roadmap, as I will only work on the product 50% of the time.


On Thu, Aug 30, 2018 at 12:18 PM, Vincent Massol  wrote:
> Hi,
>
> I don’t remember discussing this with you, Thomas. Actually I’m not convinced 
> we should have a fixed day:
> * we already have a fixed BFD and having a second one doesn’t leave much 
> flexibility for working on roadmap items when that’s best
> * test sessions can be short (0.5-1 hours) and it’s easy to do them between 
> other tasks
> * it can be boring to spend a full day on them
>
> Now, I agree that not having a fixed day will make it hard to make sure that 
> we work 20% on that topic.
>
> So if you prefer we can define a day, knowing that some won’t be able to 
> always attend during that day and in this case they should do it on another 
> day. What’s important is to have 20% done each week (i.e. enough work done on 
> it).
>
> In terms of the day, if we have to choose one, I’d say Tuesday. That’s the most 
> logical to me.
>
> WDYT? What do you prefer?
>
> Thanks
> -Vincent

Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-08-30 Thread Vincent Massol
Hi,

I don’t remember discussing this with you, Thomas. Actually I’m not convinced we 
should have a fixed day:
* we already have a fixed BFD and having a second one doesn’t leave much 
flexibility for working on roadmap items when that’s best
* test sessions can be short (0.5-1 hours) and it’s easy to do them between 
other tasks
* it can be boring to spend a full day on them

Now, I agree that not having a fixed day will make it hard to make sure that we 
work 20% on that topic.

So if you prefer we can define a day, knowing that some won’t be able to always 
attend during that day and in this case they should do it on another day. 
What’s important is to have 20% done each week (i.e. enough work done on it).

In terms of the day, if we have to choose one, I’d say Tuesday. That’s the most 
logical to me.

WDYT? What do you prefer?

Thanks
-Vincent

> On 30 Aug 2018, at 10:38, Thomas Mortagne  wrote:
> 
> Indeed we discussed this but I don't see it in your mail Vincent.
> 
> On Thu, Aug 30, 2018 at 10:33 AM, Adel Atallah  wrote:
>> Hello,
>> 
>> Maybe we should agree on having a whole day dedicated to using these
>> tools with as many developers as possible.
>> That way we will be able to help each other and maybe it will make the
>> process easier to carry out in the future.
>> 
>> WDYT?
>> 
>> Thanks,
>> Adel

Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-08-30 Thread Thomas Mortagne
Indeed we discussed this but I don't see it in your mail Vincent.

On Thu, Aug 30, 2018 at 10:33 AM, Adel Atallah  wrote:
> Hello,
>
> Maybe we should agree on having a whole day dedicated to using these
> tools with as many developers as possible.
> That way we will be able to help each other and maybe it will make the
> process easier to carry out in the future.
>
> WDYT?
>
> Thanks,
> Adel
>
>
> On Wed, Aug 29, 2018 at 11:20 AM, Vincent Massol  wrote:
>> Hi devs (and anyone else interested in improving the tests of XWiki),
>>
>> History
>> =======
>>
>> It all started when I analyzed our global TPC and found that it was going 
>> down globally even though we have the fail-build-on-jacoco-threshold 
>> strategy.
>>
>> I sent several email threads:
>>
>> - Loss of TPC: http://markmail.org/message/hqumkdiz7jm76ya6
>> - TPC evolution: http://markmail.org/message/up2gc2zzbbe4uqgn
>> - Improve our TPC strategy: http://markmail.org/message/grphwta63pp5p4l7
>>
>> Note: As a consequence of this last thread, I implemented a Jenkins Pipeline 
>> to send us a mail when the global TPC of an XWiki module goes down so that 
>> we fix it ASAP. This is still a work in progress. A first version is 
>> done and running at https://ci.xwiki.org/view/Tools/job/Clover/ but I need 
>> to debug it and fix it (it’s not working ATM).
>>
>> As a result of the global TPC going down/stagnating, I have proposed to have 
>> 10.7 focused on Tests + BFD.
>> - Initially I proposed to focus on increasing the global TPC by looking at 
>> the reports from 1) above (http://markmail.org/message/qjemnip7hjva2rjd). 
>> See the last report at https://up1.xwikisas.com/#mJ0loeB6nBrAgYeKA7MGGw (we 
>> need to fix the red parts).
>> - Then with the STAMP mid-term review, a more urgent need surfaced and I asked 
>> if we could instead focus on fixing tests as reported by Descartes to 
>> increase both coverage and mutation score (i.e. test quality), since those are 
>> 2 metrics/KPIs measured by STAMP and since XWiki participates in STAMP we 
>> need to work on them and increase them substantially. See 
>> http://markmail.org/message/ejmdkf3hx7drkj52
>>
>> The results of XWiki 10.7 have been quite poor on test improvements (more 
>> focus on BFD than tests, lots of devs on holidays, etc). This forces us to 
>> adopt a different strategy.
>>
>> Full Strategy proposal
>> ======================
>>
>> 1) As many XWiki SAS devs as possible (and anyone else from the community 
>> who’s interested ofc! :)) should spend 1 day per week working on improving 
>> STAMP metrics
>> * Currently the agreement is that Thomas and myself will do this for the 
>> foreseeable future till we get some good-enough metric progress
>> * Some other devs from XWiki SAS will help out for XWiki 10.8 only FTM 
>> (Marius, Adel if he can, Simon in the future). The idea is to see where that 
>> could get us by using substantial manpower.
>>
>> 2) All committers: More generally the global TPC failure is also already 
>> active and devs need to modify modules that see their global TPC go down.
>>
>> 3) All committers: Of course, the jacoco strategy is also active at each 
>> module level.
>>
>> STAMP tools
>> ===========
>>
>> There are 4 tools developed by STAMP:
>> * Descartes: Improves quality of tests by increasing their mutation scores. 
>> See http://markmail.org/message/bonb5f7f37omnnog and also 
>> https://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
>> * DSpot: Automatically generates new tests, based on existing tests. See 
>> https://massol.myxwiki.org/xwiki/bin/view/Blog/TestGenerationDspot
>> * CAMP: Takes a Dockerfile and generates mutations of it, then deploys and 
>> executes tests on the software to see if the mutation works or not. Note this 
>> currently doesn’t fit the needs of XWiki and thus I’ve been developing 
>> another tool as an experiment (which may go back in CAMP one day), based on 
>> TestContainers, see 
>> https://massol.myxwiki.org/xwiki/bin/view/Blog/EnvironmentTestingExperimentations
>> * EvoCrash: Takes a stack trace from production logs and generates a test 
>> that, when executed, reproduces the crash. See 
>> https://markmail.org/message/v74g3tsmflquqwra. See also 
>> https://github.com/SERG-Delft/EvoCrash
>>
>> Since XWiki is part of the STAMP research project, we need to use those 4 
>> tools to increase the KPIs associated with the tools. See below.
>>
>> Objectives/KPIs/Metrics for STAMP
>> =================================
>>
>> The STAMP project has defined 9 KPIs that all partners (and thus XWiki) need 
>> to work on:
>>
>> 1) K01: Increase test coverage
>> * Global increase by reducing the non-covered code by 40%. For XWiki, since 
>> we’re at about 70%, this means reaching about 80% before the end of STAMP 
>> (i.e. before the end of 2019)
>> * Increase the coverage contributions of each tool developed by STAMP.
>>
>> Strategy:
>> * Primary goal:
>> ** Increase coverage by executing Descartes and improving our tests. This is 
>> 
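
The K01 target quoted above (cut the non-covered code by 40%, starting from roughly 70% coverage) can be checked with a quick calculation. This is only a sketch: the function name `target_coverage` is made up for illustration, and the 70% and 40% inputs are the approximate figures from the mail, not measured values.

```python
def target_coverage(current_pct: float, reduction_of_uncovered: float) -> float:
    """Return the coverage reached after shrinking the non-covered share.

    current_pct: current coverage in percent (about 70 in the mail).
    reduction_of_uncovered: fraction of the non-covered code to eliminate
        (0.40 for the K01 objective).
    """
    uncovered = 100.0 - current_pct                       # share of code not covered today
    remaining = uncovered * (1.0 - reduction_of_uncovered)  # what stays uncovered
    return 100.0 - remaining


if __name__ == "__main__":
    # Figures taken from the mail; both are approximations.
    print(round(target_coverage(70.0, 0.40), 1))
```

With the mail's figures, 30% of the code is uncovered, removing 40% of that leaves 18% uncovered, so the target comes out at roughly 82%, consistent with the "about 80%" stated in the mail.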

Re: [xwiki-devs] [STAMP/Test] Metrics we need to improve + strategy

2018-08-30 Thread Adel Atallah
Hello,

Maybe we should agree on having a whole day dedicated to using these
tools with as many developers as possible.
That way we will be able to help each other and maybe it will make the
process easier to carry out in the future.

WDYT?

Thanks,
Adel


On Wed, Aug 29, 2018 at 11:20 AM, Vincent Massol  wrote:
> Hi devs (and anyone else interested in improving the tests of XWiki),
>
> History
> =======
>
> It all started when I analyzed our global TPC and found that it was going 
> down globally even though we have the fail-build-on-jacoco-threshold strategy.
>
> I sent several email threads:
>
> - Loss of TPC: http://markmail.org/message/hqumkdiz7jm76ya6
> - TPC evolution: http://markmail.org/message/up2gc2zzbbe4uqgn
> - Improve our TPC strategy: http://markmail.org/message/grphwta63pp5p4l7
>
> Note: As a consequence of this last thread, I implemented a Jenkins Pipeline 
> to send us a mail when the global TPC of an XWiki module goes down so that we 
> fix it ASAP. This is still a work in progress. A first version is done 
> and running at https://ci.xwiki.org/view/Tools/job/Clover/ but I need to 
> debug it and fix it (it’s not working ATM).
>
> As a result of the global TPC going down/stagnating, I have proposed to have 
> 10.7 focused on Tests + BFD.
> - Initially I proposed to focus on increasing the global TPC by looking at 
> the reports from 1) above (http://markmail.org/message/qjemnip7hjva2rjd). See 
> the last report at https://up1.xwikisas.com/#mJ0loeB6nBrAgYeKA7MGGw (we need 
> to fix the red parts).
> - Then with the STAMP mid-term review, a more urgent need surfaced and I asked 
> if we could instead focus on fixing tests as reported by Descartes to 
> increase both coverage and mutation score (i.e. test quality), since those are 
> 2 metrics/KPIs measured by STAMP and since XWiki participates in STAMP we 
> need to work on them and increase them substantially. See 
> http://markmail.org/message/ejmdkf3hx7drkj52
>
> The results of XWiki 10.7 have been quite poor on test improvements (more 
> focus on BFD than tests, lots of devs on holidays, etc). This forces us to 
> have a different strategy.
>
> Full Strategy proposal
> ======================
>
> 1) As many XWiki SAS devs as possible (and anyone else from the community 
> who’s interested ofc! :)) should spend 1 day per week working on improving 
> STAMP metrics
> * Currently the agreement is that Thomas and myself will do this for the 
> foreseeable future till we get some good-enough metric progress
> * Some other devs from XWiki SAS will help out for XWiki 10.8 only FTM 
> (Marius, Adel if he can, Simon in the future). The idea is to see where that 
> could get us by using substantial manpower.
>
> 2) All committers: More generally, the global TPC failure is also already 
> active and devs need to fix modules that see their global TPC go down.
>
> 3) All committers: Of course, the jacoco strategy is also active at each 
> module level.
>
> STAMP tools
> ===========
>
> There are 4 tools developed by STAMP:
> * Descartes: Improves quality of tests by increasing their mutation scores. 
> See http://markmail.org/message/bonb5f7f37omnnog and also 
> https://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
> * DSpot: Automatically generates new tests, based on existing tests. See 
> https://massol.myxwiki.org/xwiki/bin/view/Blog/TestGenerationDspot
> * CAMP: Takes a Dockerfile and generates mutations of it, then deploys and 
> executes tests on the software to see if the mutation works or not. Note this 
> is currently not fitting the need of XWiki and thus I’ve been developing 
> another tool as an experiment (which may go back in CAMP one day), based on 
> TestContainers, see 
> https://massol.myxwiki.org/xwiki/bin/view/Blog/EnvironmentTestingExperimentations
> * EvoCrash: Takes a stack trace from production logs and generates a test 
> that, when executed, reproduces the crash. See 
> https://markmail.org/message/v74g3tsmflquqwra. See also 
> https://github.com/SERG-Delft/EvoCrash
>
> Since XWiki is part of the STAMP research project, we need to use those 4 
> tools to increase the KPIs associated with the tools. See below.
>
> Objectives/KPIs/Metrics for STAMP
> =================================
>
> The STAMP project has defined 9 KPIs that all partners (and thus XWiki) need 
> to work on:
>
> 1) K01: Increase test coverage
> * Global increase by reducing the non-covered code by 40%. For XWiki, since 
> we’re at about 70%, this means reaching about 80% before the end of STAMP 
> (ie. before end of 2019)
> * Increase the coverage contributions of each tool developed by STAMP.
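
For reference, the K01 arithmetic can be sketched as follows. This is only an illustration: the class and method names are made up, and the inputs are the approximate figures quoted above (about 70% coverage today, 40% fewer non-covered lines). With those assumptions the target works out to 82%, which matches the "about 80%" in the mail.

```java
public class CoverageTarget {
    /**
     * Returns the coverage (in percent) reached after shrinking the
     * non-covered share of the code by the given fraction.
     * Hypothetical helper, not part of any STAMP or XWiki tool.
     */
    static double targetCoverage(double currentCoveragePercent, double uncoveredReduction) {
        // Share of the code that is not covered today (30% if coverage is 70%).
        double uncovered = 100.0 - currentCoveragePercent;
        // Share still not covered after the reduction (18% for a 40% reduction).
        double remaining = uncovered * (1.0 - uncoveredReduction);
        return 100.0 - remaining;
    }

    public static void main(String[] args) {
        // Assumed inputs: the "about 70%" and "40%" figures from the mail.
        System.out.println(targetCoverage(70.0, 0.40));
    }
}
```

The same helper can be applied per module, using the module's own measured coverage instead of the global 70%, to get a rough per-module target.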
>
> Strategy:
> * Primary goal:
> ** Increase coverage by executing Descartes and improving our tests. This is 
> http://markmail.org/message/ejmdkf3hx7drkj52
> ** Don’t do anything with DSpot. I’ll do that part. Note that the goal is to 
> write a Jenkins pipeline to automatically execute DSpot from time to time and 
> commit the generated tests in a separate