Greetings, all,

Since Cedric will be starting to use the BuildAnalysis mechanism in earnest within a few weeks, I wanted to provide some feedback on the current version of the system and some suggestions for improvement.

First off, let's think about the basic questions that the BuildAnalysis mechanism should help us answer in the event of a daily build failure. I think there are three:

(1) Who was responsible for breaking the build?
   Example answer: johnson

(2) Why did the build break?
   Example answer: junit failure in hackyCore_Kernel

(3) What might have prevented the build from breaking?
   Example answer: ant freshStart hackyCore_Kernel.junit
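
To make this concrete, here's a minimal sketch of how a BuildAnalysis instance might carry those three answers as plain fields. The class shape and names here are just my invention for illustration, not anything in the current code:

    /** Hypothetical container for the three build-failure answers. */
    public class BuildAnalysis {
      private String responsible;  // e.g. "johnson"
      private String reason;       // e.g. "junit failure in hackyCore_Kernel"
      private String prevention;   // e.g. "ant freshStart hackyCore_Kernel.junit"

      public String getResponsible() { return this.responsible; }
      public void setResponsible(String responsible) { this.responsible = responsible; }

      public String getReason() { return this.reason; }
      public void setReason(String reason) { this.reason = reason; }

      public String getPrevention() { return this.prevention; }
      public void setPrevention(String prevention) { this.prevention = prevention; }
    }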

These three answers are useful as data because they give rise naturally to three telemetry streams that we can use to understand any progress we're making. The first stream shows us "personal" trends in how often we break the build; the second shows us trends in how the build broke; and the third shows us trends in which preventative measures could have been used but weren't.

So, my first suggestion is to reorganize the output of the build analysis so that the email alert consists of answers to these three questions. The URL that you click on brings up a page containing a representation of all of the data that was used to derive the answers. Put another way, the build analysis mechanism has two components:

- a data gathering component that looks at all of the sensor data for a project for a given day (or more!) and creates a "BuildAnalysis" instance.

- a reasoning mechanism that operates on that data to determine its best guess at the answers to the three questions above. The reasoning mechanism augments the BuildAnalysis instance with information about how it reached its conclusions.
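
As a rough illustration of that split (the interface names here are hypothetical, not part of the current system), the two components might look something like:

    /** Gathers a project's sensor data for a given day into a BuildAnalysis instance. */
    public interface BuildAnalysisGatherer {
      BuildAnalysis gather(String project, java.util.Date day);
    }

    /** Fills in the instance with its best guess at the three answers,
        plus a record of how it reached its conclusions. */
    public interface BuildAnalysisReasoner {
      void reason(BuildAnalysis analysis);
    }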

When you click on the URL in the email, you get taken to a page that displays both the data that was collected and used in the reasoning process and some kind of description of what reasoning was applied and how it led to the answers to the three questions.

OK, so far, so good. But there's another issue that must be resolved to make this mechanism useful in our environment: the reasoner may not always come to the right answer for one or more of the three questions, and in some cases it may not come up with an answer at all. Therefore, there needs to be a way for a developer to "correct" the answers provided by the reasoner. Otherwise, the telemetry streams are useless because they won't reflect the true answers to the questions.

Because of this, the URL that you click on to see the build analysis details must do more than simply show you the data; it must also let you manually change the answers associated with the three questions (by way of a pull-down list of possible answers). These corrected answers are then persisted and used when generating the telemetry streams. Through this technique, one can even keep track of when the answer provided by the developer differs from the answer provided by the reasoner, and thus calculate the accuracy of the mechanism and how it changes over time.
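
For instance, if the BuildAnalysis instance kept both the reasoner's original answer and the developer's corrected answer (the getters below are assumed extensions of the sketch above), computing the reasoner's accuracy on the "who" question could be as simple as:

    import java.util.List;
    import java.util.Objects;

    /** Hypothetical accuracy check: the developer's (possibly corrected)
        answer is treated as ground truth. */
    public class ReasonerAccuracy {
      public static double responsibleAccuracy(List<BuildAnalysis> analyses) {
        if (analyses.isEmpty()) {
          return 0.0;
        }
        int correct = 0;
        for (BuildAnalysis analysis : analyses) {
          // Objects.equals is null-safe, which covers the case where the
          // reasoner produced no answer at all.
          if (Objects.equals(analysis.getReasonerResponsible(),
                             analysis.getDeveloperResponsible())) {
            correct++;
          }
        }
        return (double) correct / analyses.size();
      }
    }

Running this over, say, each month's worth of BuildAnalysis instances would give exactly the "accuracy over time" trend described above.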

How does this sound?

Cheers,
Philip
