Hey Guys,
This email contains comments on a wide variety of things. It is at least ordered by "severity": read on if you dare. By the way, if any of these ideas are good, I'm not volunteering for all of them... :)
- In a previous email, I suggested that we should be running the Daily Build every day instead of checking whether the code changed and only then running the build. I think the reasons in that email are justified. Anyway, I just figured out another reason why we should do so: the Jira sensor is attached to the Daily Build, so if there is no build, there is no issue data. Issue data is not connected to the software code the way FileMetric or Coverage data is, so we have no way to know when we should or shouldn't run the Jira sensor. Therefore, why not run it all the time?
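To make the argument concrete, here is a minimal, hypothetical sketch of the two scheduling policies; none of these class or method names correspond to our actual build scripts, it just illustrates where issue data gets lost:

    /** Hypothetical sketch of the two Daily Build scheduling policies. */
    public class DailyBuildScheduler {

      /** Current policy: build (and run the attached sensors) only when code changed. */
      public void runIfCodeChanged(boolean codeChangedSinceLastBuild) {
        if (codeChangedSinceLastBuild) {
          runBuildAndSensors();
        }
        // Problem: Jira issue activity is independent of code changes, so on a day
        // with issue updates but no commits, no issue data is ever collected.
      }

      /** Proposed policy: run every day; the Jira sensor always gets its chance. */
      public void runUnconditionally() {
        runBuildAndSensors();
      }

      private void runBuildAndSensors() {
        // build, run FileMetric/Coverage/Jira sensors, send data to the server...
      }
    }

--------------------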
- Today, Hongbing, Philip and I were discussing the problem of clobbering snapshot data in DailyProjectData representations. We concluded that there are two options: (1) only have one autonomous build-guy sending snapshot data, or (2) have a special project setting that allows the specification of the autonomous build-guy. I realize that Philip leans toward option 1, but here are some more arguments for option 2.

Say we have developer Bob, who has been working on his own Hackystat extension for years. Never having been part of the hacky2004-all project, he has been sending his own FileMetric and Coverage data to his own account for his own project. Furthermore, Bob is a late-night worker, so most of this data is sent around 11:55pm. One day, because of his fantastic new extension, he is granted membership in CSDL and in the hacky2004-all project. After Bob joins hacky2004-all, we all scratch our heads and wonder why the telemetry data for Coverage and FileMetric is completely wrong. Because the analyses never show whose data is actually being used, this debugging process takes a few hours, if not days. So we end up enforcing a new rule: if you join hacky2004-all, you must delete all the FileMetric and Coverage data generated on your local machine.
It seems to me that Data Validity is a top priority. I see too many possibilities for data corruption with the current implementation. Imagine a project with hundreds of contributors. How are we to enforce a rule that they can't have Coverage and FileMetric sensors enabled?
Option 2 is flawless, except when the autonomous build-guy changes. I assume that problem is what is scaring Philip away from this option. Therefore, I've come up with another option: (3) add an attribute that specifies that the data was measured by a DailyBuild. It specifies the context of the measurement, not just what tool did the measuring. This can be a simple flag. The DailyProjectData representations should check for this flag, use the flagged data as the official snapshot, and disregard any other data. If no flagged data exists, use the latest runtime (which is what we have been doing).
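To illustrate option 3, here is a minimal sketch of the selection rule, assuming a hypothetical fromDailyBuild flag on each snapshot entry; the class and field names are made up, not the actual DailyProjectData code:

    import java.util.List;

    /** Hypothetical sketch of option 3. "Snapshot" stands in for one day's FileMetric or Coverage entries. */
    public class SnapshotSelector {

      public static class Snapshot {
        boolean fromDailyBuild;  // the proposed "measured by the DailyBuild" flag
        long runtime;            // runtime stamp of the sensor invocation
      }

      /**
       * Picks the official snapshot for a DailyProjectData representation: prefer data
       * flagged as coming from the DailyBuild, otherwise fall back to the latest runtime.
       */
      public Snapshot selectOfficial(List<Snapshot> snapshots) {
        Snapshot latest = null;
        for (Snapshot s : snapshots) {
          if (s.fromDailyBuild) {
            return s;  // official snapshot; developer-generated data is ignored
          }
          if (latest == null || s.runtime > latest.runtime) {
            latest = s;
          }
        }
        return latest;  // no DailyBuild data today, so use the latest runtime (current behavior)
      }
    }

--------------------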
- I haven't seen a Testing Telemetry Scene lately. An interesting scene could correlate the number of Non-Test Methods, the number of Test Methods, Coverage, Non-Test Active Time, and Test Active Time. --------------------
- I've been thinking about UnitTest invocation data. In my opinion, the current data is sometimes useless (note that we don't use it on any of the Telemetry Scenes on the Telemetry Viewer). I like UnitTest data and I would like to see it become more useful or maybe just more used.
By the way, would adding a runtime stamp enable some additional analyses? Currently, we don't know how many test executions belong to a single junitAll run.
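For example, with a runtime stamp we could group individual test invocations back into their junitAll runs. A rough sketch, where UnitTestEntry and its fields are hypothetical and not the actual SDT classes:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;

    /** Hypothetical sketch: group UnitTest entries by a proposed runtime stamp. */
    public class JunitAllGrouper {

      public static class UnitTestEntry {
        long runtime;      // proposed: identical for every test in one junitAll run
        String testName;
        boolean passed;
      }

      /** Returns one bucket per junitAll invocation, keyed by runtime stamp. */
      public Map<Long, List<UnitTestEntry>> groupByRun(List<UnitTestEntry> entries) {
        Map<Long, List<UnitTestEntry>> runs = new TreeMap<Long, List<UnitTestEntry>>();
        for (UnitTestEntry entry : entries) {
          List<UnitTestEntry> bucket = runs.get(entry.runtime);
          if (bucket == null) {
            bucket = new ArrayList<UnitTestEntry>();
            runs.put(entry.runtime, bucket);
          }
          bucket.add(entry);
        }
        return runs;  // runs.size() == number of junitAll executions that day
      }
    }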
Here is the primary use of UnitTest data: determining whether developers run JUnit tests.
Here are some other interesting uses. I wonder what code has failed the most in DailyBuilds and even in local builds. I wonder whether a UnitTest's elapsedTime varies for the UnitTest data generated by the DailyBuild. I wonder how many times a DailyBuild JUnit failure was caused by a developer who didn't run JUnit on the code they committed (this would require a lot of processing; it would need Commit, Active Time, Dependency, FileMetric, and UnitTest data to make the correlation).
Another useful application of UnitTest data is stated in the docbook: "Once collected, UnitTest data can be used to track trends in the success rate of unit tests associated with a system as it is developed, and to investigate potential relationships between test success/failure and other product or process measures, such as size, defects, time, dependencies, build failures, and so forth." An interesting Telemetry Scene could show the number of executed UnitTests, Non-Test Active Time, Test Active Time, and Daily Build Failures. One would think that if I executed 1500 unit tests (most of them successfully) and worked fairly little, then no failures should occur. But then again, I suppose the only UnitTest execution that matters is the last set of UnitTest data obtained from one final junitAll before a commit (which is not obtainable without a runtime stamp). And if that is true, then all we really need is a Build entry from a junitAll execution. Wait a minute, I just crossed out a potentially useful use of UnitTest data from that telemetry scene. Could it be the case that UnitTest data just isn't suited for Telemetry Streams? --------------------
- Christoph was right. The Coverage SDT does not specify which methods are covered; it only reports how many methods were covered. Is that a future enhancement? --------------------
thanks, aaron
