[jira] [Updated] (CASSANDRA-8954) risk analysis of patches based on past defects

Russ Hatch (JIRA) Fri, 08 May 2015 12:39:00 -0700

     [ https://issues.apache.org/jira/browse/CASSANDRA-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Russ Hatch updated CASSANDRA-8954:
----------------------------------
    Description: 
Some changes to source are much riskier than others, and we can analyze data 
from JIRA + git to make educated guesses about risk level. This is a 
backward-looking technique with limitations, but it may still be useful (yes, 
the past does not equal the future!).

(disclaimer: I did not come up with this technique).

The executive summary: 1) correlate changes with defects, by code unit such as 
filename; 2) quantify the risk of new patches by combining that correlation 
with a measure of change "size", as (correlation * change_size).

The basic idea is to build a tool which correlates past Defect tickets to the 
files which were changed to fix them. If a Defect required changes to specific 
files to fix, then in some sense past changes to those files (or their original 
implementations) were problematic. Therefore, future changes to those files 
carry some potential risk as well.

This requires getting an occasional dump of Defect-type issues, and an 
occasional dump of commit messages. Defects would have to be associated with 
commits based on a text search of commit messages. From there we build a 
weighted model of which source files get touched the most to fix defects (say, 
giving each file name a ranking of 1 to 10, where 10 carries the most risk).
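
As a rough illustration of the association and ranking steps, here is a 
minimal Python sketch. It is not an existing tool: the function names, the 
input shapes (a set of defect keys, and commits as (message, files) pairs such 
as could be parsed from `git log --name-only`), the linear 1 to 10 scaling, 
and the sample issue keys and file paths are all assumptions made up for the 
example.

```python
import re
from collections import Counter

# Assumption: commit messages mention the ticket as "CASSANDRA-<number>".
ISSUE_KEY = re.compile(r"\bCASSANDRA-\d+\b")


def defect_fix_counts(defect_keys, commits):
    """Count, per file, how many defect-fixing commits touched it.

    defect_keys: set of issue keys whose type is Defect (hypothetical dump).
    commits: iterable of (commit_message, [changed_file_paths]) pairs.
    """
    counts = Counter()
    for message, files in commits:
        mentioned = set(ISSUE_KEY.findall(message))
        if mentioned & defect_keys:
            # A set ensures a file is counted once per commit.
            counts.update(set(files))
    return counts


def rank_1_to_10(counts):
    """Scale raw counts linearly onto 1..10 (10 = touched most by fixes)."""
    if not counts:
        return {}
    top = max(counts.values())
    return {path: max(1, round(10 * n / top)) for path, n in counts.items()}


# Made-up sample data, for illustration only.
defects = {"CASSANDRA-1001", "CASSANDRA-1003"}
commits = [
    ("Fix race in flush for CASSANDRA-1001", ["src/A.java", "src/B.java"]),
    ("Add new option for CASSANDRA-1002", ["src/C.java"]),
    ("Fix NPE on startup for CASSANDRA-1003", ["src/A.java"]),
]
ranks = rank_1_to_10(defect_fix_counts(defects, commits))
print(sorted(ranks.items()))
```

Files never touched by a defect fix (src/C.java here) simply get no entry; 
how to weight them is left open.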

To analyze specific patches going forward, we look at the defect weight for 
each source file the patch touches, and factor in a metric for the patch's 
changes in that file (maybe (lines changed/total lines), OR (change in 
cyclomatic complexity/total complexity)). Out of this we get a number 
representing a theoretical risk.
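
A minimal Python sketch of the scoring step, using the (lines changed/total 
lines) option as the change-size metric. The function names, the default 
weight for files with no defect history, the clamping to 1.0, and the sample 
numbers are assumptions for illustration, not part of the proposal.

```python
def change_size(lines_changed, total_lines):
    """Fraction of a file touched by the patch, clamped to [0, 1]."""
    if total_lines <= 0:
        # Assumption: treat a new or empty file as fully changed.
        return 1.0
    return min(1.0, lines_changed / total_lines)


def patch_risk(patch, defect_weight, default_weight=1):
    """Return per-file risk as defect weight * change size.

    patch: {path: (lines_changed, total_lines)} for the files in the patch.
    defect_weight: {path: weight}, e.g. a 1..10 ranking built from history.
    default_weight: assumed weight for files with no recorded defect fixes.
    """
    return {
        path: defect_weight.get(path, default_weight) * change_size(changed, total)
        for path, (changed, total) in patch.items()
    }


# Made-up sample data, for illustration only.
weights = {"src/A.java": 10, "src/B.java": 5}
patch = {"src/A.java": (50, 200), "src/C.java": (10, 100)}
for path, risk in sorted(patch_risk(patch, weights).items()):
    print(f"{path}: {risk:.2f}")
```

How the per-file numbers roll up into one figure for the whole patch (sum, 
maximum, or something else) is not specified above and would be a separate 
design choice.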

  was:
Some changes to source are much more risky than others, and we can analyze data 
from JIRA + git to make educated guesses about risk level. This is a backwards 
looking technique with limitations but still may be useful (yes, the past does 
not equal the future!).

(disclaimer: I did not come up with this technique).

The basic idea is to build a tool which correlates past Defect tickets to the 
files which were changed to fix them. If a Defect required changes to specific 
files to fix, then in some sense past changes to those files (or their original 
implementations) were problematic. Therefore, future changes to those files 
carry some potential risk as well.

This requires getting an occasional dump of Defect type issues, and an 
occasional dump of commit messages. Defects would have to be associated to 
commits based on a text search of commit messages. From there we build a 
weighted model of which source files get touched the most to fix defects (say 
giving each file name a ranking of 1 to 10 where 10 carries the most risk).

To analyze specific patches going forward we look at the defect weight for that 
source file, and factor in a metric for a patch's changes in that file (maybe 
(lines changed/total lines), OR (change in cyclomatic complexity/total 
complexity)). Out of this we get a number representing a theoretical risk.


> risk analysis of patches based on past defects
> ----------------------------------------------
>
>                 Key: CASSANDRA-8954
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8954
>             Project: Cassandra
>          Issue Type: Test
>            Reporter: Russ Hatch
>            Assignee: Russ Hatch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
