[HACKYSTAT-DEV-L:184] Telemetry analysis proposal: why does the daily build fail, and what should we do about it?

Philip Johnson Wed, 13 Oct 2004 11:55:37 -0700

Hi Cedric,

Here's a telemetry project that will provide some interesting insights into our software development process and push the technology forward.

Last night, the build failed. As I noted in my email, it made me wonder whether we should change our process (i.e., add the checkstyle target to quickStart) to prevent these kinds of problems from occurring in the future. This raises all sorts of interesting empirical questions:

1. Would adding checkstyle to quickStart actually reduce daily build failures?
2. Are checkstyle failures a significant source of daily build failures?
3. Is there some other class of failure (junit? compile?) that is a more significant source of daily build failures?
4. Are there some modules that are more prone to daily build failures than others?
5. Can we use Active Time/Commit data to see if there are certain developers more prone to creating daily build failures?
6. Is there some other kind of process change that would be more effective at reducing daily build failures? For example, are people running junit tests appropriately before doing commits?
7. What are the trends in daily build failures? Are there certain classes of failures that are becoming more prevalent? Less prevalent? Are there certain modules that are becoming more prone to failing the daily build? Less prone? Are there certain developers that are becoming more prone? Less prone?

I propose that you do an investigation of these questions for next Wednesday's meeting. There are a number of issues I can already see that you'll need to address:

(1) Missing build failure cause data. I am not sure that we have good records of why the build failed (e.g., checkstyle, junit, compile, or javadoc). If not, you may need to improve the data manually by checking out the configuration for each day on which a failure occurred, reproducing the problem locally, and then updating the raw sensor data by hand.

(2) Determining the failure category from the build data. I think the raw data simply reports the string passed by Ant. You'll need to add some processing to the DailyProjectData subclass for Builds to analyze that string and report the 'category' that it belongs to.

(3) Determining the module(s) involved in the failure from the build data. Similarly, the DailyProjectData subclass for Builds should be able, at least in some cases, to determine which module(s) failed the build.

(4) Determining the developer(s) (potentially) involved in the failure. The conservative approach is to identify the developer(s) who committed anything since the last successful daily build. This approach risks false positives: developers are flagged who weren't actually involved in the failure. Another approach is to identify only the developer(s) who committed to the module that failed since the last successful daily build. This will produce more false negatives, that is, developers who were involved but are not flagged (for example, when an API change in one module causes a failure in another).

(5) Determining whether there is a relationship between local testing and daily build failures. Perhaps people are not testing the code before committing it? To see this, we need to ensure that people have their local junit sensor enabled and that data is being sent. To ensure that this is so, we need a way to identify which developers appear to be sending data of a given type and which aren't.

(6) All of these analyses are based upon the concept of telemetry applied to 'top-level modules'. The reduction functions need to be extended to support (via templates?) the specification of one, a subset, or all top-level modules.
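
To make issues (2) and (4) a bit more concrete, here is a rough sketch in Python rather than real Hackystat code. Everything in it is an assumption for illustration: the keyword table, the Build and Commit records, the module and developer names, and the functions classify_failure and suspects are all made up, and the actual strings that Ant reports may look quite different. It maps a failure string to a category, then lists the developers who committed since the last successful build, either across all modules or only in the module that failed.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical keyword table; the real Ant failure strings may differ.
CATEGORY_KEYWORDS = {
    "checkstyle": ("checkstyle",),
    "junit": ("junit", "tests failed"),
    "compile": ("compile failed", "javac"),
    "javadoc": ("javadoc",),
}


def classify_failure(message):
    """Map a raw build failure string to a coarse category name."""
    lowered = message.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "unknown"


@dataclass(frozen=True)
class Build:
    time: datetime
    success: bool
    message: str = ""
    failed_module: Optional[str] = None  # None when the module is not known


@dataclass(frozen=True)
class Commit:
    time: datetime
    developer: str
    module: str


def suspects(builds, commits, failed, same_module_only=False):
    """Developers who committed after the last successful build and up to
    the failed build. With same_module_only=True, only commits to the
    failed module count (fewer false positives, more false negatives)."""
    last_good = max(
        (b.time for b in builds if b.success and b.time < failed.time),
        default=datetime.min,
    )
    found = set()
    for commit in commits:
        if not (last_good < commit.time <= failed.time):
            continue
        if same_module_only and commit.module != failed.failed_module:
            continue
        found.add(commit.developer)
    return found


# Made-up sample data: one good build, then one checkstyle failure.
builds = [
    Build(datetime(2004, 10, 11, 23, 0), True),
    Build(datetime(2004, 10, 12, 23, 0), False,
          "BUILD FAILED: checkstyle reported 3 violations", "moduleA"),
]
commits = [
    Commit(datetime(2004, 10, 11, 15, 0), "dev1", "moduleA"),  # before last good build
    Commit(datetime(2004, 10, 12, 10, 0), "dev2", "moduleA"),
    Commit(datetime(2004, 10, 12, 14, 0), "dev3", "moduleB"),
]
failed = builds[1]

print(classify_failure(failed.message))
print(sorted(suspects(builds, commits, failed)))
print(sorted(suspects(builds, commits, failed, same_module_only=True)))
```

With this sample data, the conservative approach flags dev2 and dev3, while the module-restricted approach flags only dev2. If dev3's commit to moduleB were the real cause (say, an API change), that would be exactly the false negative described in (4).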

You may come up with other issues as you work on this, of course.

I think this will be a very fruitful inquiry for you. It involves at least three types of raw sensor data (commit, build, and junit), and it pushes forward the representations and analyses. The presentation of the data will certainly be aided by use of the telemetry wall. Finally, it will help us address a practical and important problem for our group that is also a very generic problem for the software development community as a whole.

Cheers,
Philip
