- In a previous email, I suggested that we should run the Daily Build every day instead of checking whether the code changed and then running the build. I think the reasons in that email are justified, and I just figured out another one. The Jira sensor is attached to the Daily Build: if there is no build, there is no issue data. Issue data is not connected to the software code the way FileMetric or Coverage data is, so we don't know when to run the Jira sensor and when not to. Therefore, why not run it all the time?
True, although fortunately an occasional dropout in the Jira snapshot is not fatal. The most interesting analyses we've come up with (closure rates) are at the grain size of weeks or months anyway.
- Today, Hongbing, Philip, and I were discussing the problem of clobbering snapshot data in DailyProjectData representations. We concluded that there are two options: (1) have only one autonomous build-guy sending snapshot data, or (2) have a special project setting that allows the specification of the autonomous build-guy. I realize that Philip leans toward option 1, but here are some more arguments for option 2. Say we have a developer, Bob, who has been working on his own Hackystat extension for years. Never having been part of the hacky2004-all project, he has been sending his own FileMetric and Coverage data to his own account for his own project. Furthermore, Bob is a late-night worker, so most of this data is sent at 11:55pm. One day, because of his fantastic new extension, he is granted membership into CSDL and into the hacky2004-all project. After Bob joins hacky2004-all, we all scratch our heads and wonder why the telemetry data for Coverage and FileMetric is completely wrong. Because the analyses never show whose data is actually being used, this debugging process takes a few hours, if not days. We end up enforcing a new rule: if you join hacky2004-all, you must delete all the FileMetric and Coverage data generated on your local machine.
No, the new rule is "Never allow a developer named Bob to join the Hackystat project!" :-) :-)
OK, more seriously, yes, if the situation you describe actually arose, we would have a sudden, bizarre drop in recorded size data for Hackystat. And yes, this does argue convincingly that it should be possible to 'designate' the project member to be used for analyses.
Option 2 is flawless except when the autonomous build-guy changes, and I assume that problem is what is scaring Philip away from it. Therefore, I've come up with another option: (3) add an attribute that specifies that the data was measured from a DailyBuild. It specifies the context of the measurement, not just what tool did the measuring, and it can be a simple flag. The DailyProjectData representations should check for this flag, use the flagged data as the official snapshot, and disregard any other data. If no flagged data exists, use the latest runtime (which is what we have been doing).
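To make the selection rule concrete, here is a minimal sketch in Java of how a DailyProjectData representation could pick the official snapshot. The SnapshotEntry class and its fields are hypothetical placeholders for illustration, not actual Hackystat types:

    import java.util.List;

    public class SnapshotSelector {

      /** Hypothetical view of one snapshot candidate for a given day. */
      public static class SnapshotEntry {
        long runtime;            // runtime stamp of this batch of data
        boolean fromDailyBuild;  // the proposed "measured by the DailyBuild" flag

        public SnapshotEntry(long runtime, boolean fromDailyBuild) {
          this.runtime = runtime;
          this.fromDailyBuild = fromDailyBuild;
        }
      }

      /**
       * Returns the entry to treat as the official snapshot: prefer data flagged
       * as coming from the DailyBuild; otherwise fall back to the entry with the
       * latest runtime, which is the current behavior.
       */
      public static SnapshotEntry selectOfficial(List<SnapshotEntry> entries) {
        SnapshotEntry official = null;
        SnapshotEntry latest = null;
        for (SnapshotEntry entry : entries) {
          if (entry.fromDailyBuild && (official == null || entry.runtime > official.runtime)) {
            official = entry;
          }
          if (latest == null || entry.runtime > latest.runtime) {
            latest = entry;
          }
        }
        return (official != null) ? official : latest;
      }
    }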
That's actually a very cool approach: instead of specifying a distinguished user, we can attach a distinguished tag to the data!
This issue and your approach are now recorded as: <http://hackydev.ics.hawaii.edu:8080/browse/HACK-204>
- I haven't seen a Testing Telemetry Scene lately. An interesting scene could correlate the number of Non-Test Methods, the number of Test Methods, Coverage, Non-Test Active Time, and Test Active Time.
Yes. Actually, I do have a telemetry chart for 413/613 with that info.
- I've been thinking about UnitTest invocation data. In my opinion, the current data is sometimes useless (note that we don't use it on any of the Telemetry Scenes on the Telemetry Viewer). I like UnitTest data and I would like to see it become more useful or maybe just more used.
By the way, would adding a runtime stamp enable some additional analyses? Currently, we don't know how many executions belong to a single junitAll run.
Here is the primary use of UnitTest data: determining whether developers run Junit Tests.
Here are some other interesting uses: I wonder what code has failed the most in DailyBuilds and even in local builds.
Very interesting. Imagine a bar chart where the X axis is JUnit tests and the Y axis is the number of times they failed, ordered by frequency. Show the developer the top 20 (40? etc.) unit tests that failed the most times for the given project over the given time interval.
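A rough sketch of the aggregation behind such a chart might look like the following; the UnitTestRecord class is a hypothetical stand-in for the real UnitTest sensor data entries, not an existing Hackystat type:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class TopFailingTests {

      /** Hypothetical view of one UnitTest sensor data entry. */
      public static class UnitTestRecord {
        String testName;
        boolean passed;

        public UnitTestRecord(String testName, boolean passed) {
          this.testName = testName;
          this.passed = passed;
        }
      }

      /** Counts failures per test over the interval and returns the topN most-failed tests. */
      public static List<Map.Entry<String, Integer>> topFailures(List<UnitTestRecord> records, int topN) {
        Map<String, Integer> failureCounts = new HashMap<String, Integer>();
        for (UnitTestRecord record : records) {
          if (!record.passed) {
            Integer count = failureCounts.get(record.testName);
            failureCounts.put(record.testName, (count == null) ? 1 : count.intValue() + 1);
          }
        }
        // Sort descending by failure count and keep the top N for the bar chart.
        List<Map.Entry<String, Integer>> sorted =
            new ArrayList<Map.Entry<String, Integer>>(failureCounts.entrySet());
        Collections.sort(sorted, new Comparator<Map.Entry<String, Integer>>() {
          public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b) {
            return b.getValue().intValue() - a.getValue().intValue();
          }
        });
        return sorted.subList(0, Math.min(topN, sorted.size()));
      }
    }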
Saved as: <http://hackydev.ics.hawaii.edu:8080/browse/HACK-205>
I wonder how many times a DailyBuild JUnit failure was caused by a developer who didn't run JUnit on the code they committed (this would require a lot of processing; it would need commit, active time, dependency, FileMetric, and unit test data to make the correlation).
I think that is exactly what Cedric is going to support in his build failure analysis.
- Christoph was right. The Coverage SDT does not specify what methods are covered. It just provides how many methods were covered. Is that a future enhancement?
I just chatted with Christoph about this. It doesn't look like we necessarily need to know the specific methods; just knowing coverage at the class level might be enough.
The basic idea is to be able to assess whether (and what kind of) quality assurance people are doing as they develop new code. This leads to an analysis with the following characteristics:
(a) See what classes are committed by a developer during a day. (b) See how much active time they spent on those classes, pick out the ones with significant effort. (c) See what the coverage is associated with those classes. (d) Use dependency data to see if there are JUnit classes that invoke the 'new' classes.
From this, we should be able to generate a 'profile' of new code development.
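Here is a minimal sketch of steps (a) through (d) as a single pass over one developer's day. All of the lookup interfaces are hypothetical placeholders for the corresponding DailyProjectData queries, and the 0.5-hour "significant effort" cutoff is just an arbitrary example:

    import java.util.ArrayList;
    import java.util.List;

    public class NewCodeProfile {

      // Hypothetical stand-ins for DailyProjectData queries.
      public interface CommitLookup { List<String> classesCommittedBy(String developer); }
      public interface ActiveTimeLookup { double hoursOn(String developer, String className); }
      public interface CoverageLookup { double percentCovered(String className); }
      public interface DependencyLookup { boolean invokedByJUnitClass(String className); }

      /** One profile row: a significant-effort class plus its QA indicators. */
      public static class ProfileEntry {
        public final String className;
        public final double activeHours;
        public final double coverage;
        public final boolean testedByJUnit;

        public ProfileEntry(String className, double activeHours, double coverage, boolean testedByJUnit) {
          this.className = className;
          this.activeHours = activeHours;
          this.coverage = coverage;
          this.testedByJUnit = testedByJUnit;
        }
      }

      /** Builds the daily profile for one developer from the four data sources. */
      public static List<ProfileEntry> buildProfile(String developer, CommitLookup commits,
          ActiveTimeLookup activeTime, CoverageLookup coverage, DependencyLookup dependencies) {
        List<ProfileEntry> profile = new ArrayList<ProfileEntry>();
        for (String className : commits.classesCommittedBy(developer)) {    // (a) committed classes
          double hours = activeTime.hoursOn(developer, className);          // (b) effort on the class
          if (hours < 0.5) {
            continue;  // skip classes without significant effort
          }
          double percent = coverage.percentCovered(className);              // (c) coverage of the class
          boolean tested = dependencies.invokedByJUnitClass(className);     // (d) JUnit class invokes it
          profile.add(new ProfileEntry(className, hours, percent, tested));
        }
        return profile;
      }
    }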
Saved as: <http://hackydev.ics.hawaii.edu:8080/browse/HACK-206>
Cheers, Philip
