Hey Guys,
This email contains comments on a wide variety of things. It is at least ordered by "severity": read on if you dare. By the way, if any of these ideas are good, I'm not volunteering for all of them... :)
- In a previous email, I suggested that we should be running the Daily Build every day instead of checking whether the code changed and only then running the build. I think the reasons in that email are justified. Anyway, I just figured out another reason why we should do so: the Jira sensor is attached to the Daily Build, so if there is no build, there is no issue data. Issue data is not connected to the software code the way FileMetric or Coverage data is, so we have no way to know when we should or shouldn't run the Jira sensor. Therefore, why not run it all the time?
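To make the argument concrete, here is a minimal, hypothetical sketch of the two scheduling policies; none of these class or method names correspond to our actual build scripts, it just illustrates where issue data gets lost:

    /** Hypothetical sketch of the two Daily Build scheduling policies. */
    public class DailyBuildScheduler {

      /** Current policy: build (and run the attached sensors) only when code changed. */
      public void runIfCodeChanged(boolean codeChangedSinceLastBuild) {
        if (codeChangedSinceLastBuild) {
          runBuildAndSensors();
        }
        // Problem: Jira issue activity is independent of code changes, so on a day
        // with issue updates but no commits, no issue data is ever collected.
      }

      /** Proposed policy: run every day; the Jira sensor always gets its chance. */
      public void runUnconditionally() {
        runBuildAndSensors();
      }

      private void runBuildAndSensors() {
        // build, run FileMetric/Coverage/Jira sensors, send data to the server...
      }
    }

--------------------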
- Today, Hongbing, Philip and I were discussing the problem of clobbering snapshot data in DailyProjectData representations. We concluded that there are two options: (1) only have one autonomous build-guy sending snapshot data, or (2) have a special project setting that allows the specification of the autonomous build-guy. I realize that Philip leans toward option 1, but here are some more arguments for option 2.

Say we have developer Bob, who has been working on his own Hackystat extension for years. Never having been part of the hacky2004-all project, he has been sending his own FileMetric and Coverage data to his own account for his own project. Furthermore, Bob is a late-night worker, so most of this data is sent around 11:55pm. One day, because of his fantastic new extension, he is granted membership in CSDL and in the hacky2004-all project. After Bob joins hacky2004-all, we all scratch our heads and wonder why the telemetry data for Coverage and FileMetric is completely wrong. Because the analyses never show whose data is actually being used, this debugging process takes a few hours, if not days. So we end up enforcing a new rule: if you join hacky2004-all, you must delete all the FileMetric and Coverage data generated on your local machine.
It seems to me that Data Validity is a top priority. I see too many possibilities for data corruption with the current implementation. Imagine a project with hundreds of contributors. How are we to enforce a rule that they can't have Coverage and FileMetric sensors enabled?
Option 2 is flawless, except when the autonomous build-guy changes. I assume that problem is what is scaring Philip away from this option. Therefore, I've come up with another option: (3) add an attribute that specifies that the data was measured by a DailyBuild. It specifies the context of the measurement, not just what tool did the measuring. This can be a simple flag. The DailyProjectData representations should check for this flag, use the flagged data as the official snapshot, and disregard any other data. If no flagged data exists, use the latest runtime (which is what we have been doing).
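To illustrate option 3, here is a minimal sketch of the selection rule, assuming a hypothetical fromDailyBuild flag on each snapshot entry; the class and field names are made up, not the actual DailyProjectData code:

    import java.util.List;

    /** Hypothetical sketch of option 3. "Snapshot" stands in for one day's FileMetric or Coverage entries. */
    public class SnapshotSelector {

      public static class Snapshot {
        boolean fromDailyBuild;  // the proposed "measured by the DailyBuild" flag
        long runtime;            // runtime stamp of the sensor invocation
      }

      /**
       * Picks the official snapshot for a DailyProjectData representation: prefer data
       * flagged as coming from the DailyBuild, otherwise fall back to the latest runtime.
       */
      public Snapshot selectOfficial(List<Snapshot> snapshots) {
        Snapshot latest = null;
        for (Snapshot s : snapshots) {
          if (s.fromDailyBuild) {
            return s;  // official snapshot; developer-generated data is ignored
          }
          if (latest == null || s.runtime > latest.runtime) {
            latest = s;
          }
        }
        return latest;  // no DailyBuild data today, so use the latest runtime (current behavior)
      }
    }

--------------------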
- I haven't seen a Testing Telemetry Scene lately. An interesting scene could correlate the number of Non-Test Methods, the number of Test Methods, Coverage, Non-Test Active Time, and Test Active Time. --------------------
- I've been thinking about UnitTest invocation data. In my opinion, the current data is sometimes useless (note that we don't use it on any of the Telemetry Scenes on the Telemetry Viewer). I like UnitTest data and I would like to see it become more useful or maybe just more used.
By the way, would adding a runtime stamp enable some additional analyses? Currently, we don't know how many test executions belong to a single junitAll run.
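For example, with a runtime stamp we could group individual test invocations back into their junitAll runs. A rough sketch, where UnitTestEntry and its fields are hypothetical and not the actual SDT classes:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;

    /** Hypothetical sketch: group UnitTest entries by a proposed runtime stamp. */
    public class JunitAllGrouper {

      public static class UnitTestEntry {
        long runtime;      // proposed: identical for every test in one junitAll run
        String testName;
        boolean passed;
      }

      /** Returns one bucket per junitAll invocation, keyed by runtime stamp. */
      public Map<Long, List<UnitTestEntry>> groupByRun(List<UnitTestEntry> entries) {
        Map<Long, List<UnitTestEntry>> runs = new TreeMap<Long, List<UnitTestEntry>>();
        for (UnitTestEntry entry : entries) {
          List<UnitTestEntry> bucket = runs.get(entry.runtime);
          if (bucket == null) {
            bucket = new ArrayList<UnitTestEntry>();
            runs.put(entry.runtime, bucket);
          }
          bucket.add(entry);
        }
        return runs;  // runs.size() == number of junitAll executions that day
      }
    }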
Here is the primary use of UnitTest data: determining whether developers run JUnit tests.
Here are some other interesting uses. I wonder what code has failed the most in DailyBuilds and even in local builds. I wonder whether a UnitTest's elapsedTime varies for the UnitTest data generated by the DailyBuild. I wonder how many times a DailyBuild JUnit failure was caused by a developer who didn't run JUnit on the code they committed (this would require a lot of processing; it would need Commit, Active Time, Dependency, FileMetric, and UnitTest data to make the correlation).
Another useful application of UnitTest data is stated in the docbook: "Once collected, UnitTest data can be used to track trends in the success rate of unit tests associated with a system as it is developed, and to investigate potential relationships between test success/failure and other product or process measures, such as size, defects, time, dependencies, build failures, and so forth." An interesting Telemetry Scene could show the number of executed UnitTests, Non-Test Active Time, Test Active Time, and Daily Build Failures. One would think that if I executed 1500 unit tests (most of them successfully) and worked fairly little, then no failures should occur. But then again, I suppose the only UnitTest execution that matters is the last set of UnitTest data obtained from one final junitAll before a commit (which is not obtainable without a runtime stamp). And if that is true, then all we really need is a Build entry from a junitAll execution. Wait a minute, I just crossed out a potentially useful use of UnitTest data from that telemetry scene. Could it be the case that UnitTest data just isn't suited for Telemetry Streams? --------------------
- Christoph was right. The Coverage SDT does not specify which methods are covered; it only reports how many methods were covered. Is that a future enhancement? --------------------
thanks, aaron
