First off, thanks Aaron for cc'ing hackystat-dev-l on your reply.  If a
question/issue seems interesting to the larger hackystat community, please
be sure to post it to hackystat-dev-l, not csdl-local-l.  (All csdl-local-l
folks are also on hackystat-dev-l.)

I think dependency (i.e. coupling) has a great deal of potential for
telemetry (and other kinds of analyses) in Hackystat, because it can
provide a different and potentially very useful proxy for 'complexity',
which may in turn correlate with maintenance costs, comprehension costs,
defects, and so on.

We already collect raw data on coupling, and the question is how to analyze
it effectively.  A couple of ideas:

(a) The analysis shouldn't simply mirror the size metric.  For example, if
we simply count the total number of dependencies in a system, that will
probably correlate quite well to size.  If that's the case, then we don't
learn anything new about the system.
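One way to avoid simply re-measuring size might be to normalize the raw
dependency count by module size. A minimal sketch (in Python; the function
name and the per-KLOC normalization are my own invention, not anything
Hackystat currently computes):

```python
def dependency_density(total_dependencies, lines_of_code):
    """Dependencies per thousand lines of code (an invented
    normalization; a raw total would largely track size)."""
    return 1000.0 * total_dependencies / lines_of_code

# Two modules with the same raw count but very different densities:
print(dependency_density(50, 10000))  # 5.0
print(dependency_density(50, 1000))   # 50.0
```

A density measure like this at least has a chance of varying
independently of size, which a raw total does not.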

(b) Certain kinds of coupling probably aren't that interesting with respect
to assessing complexity.  For example, if a private method inside a class
calls another private method of that class, that's probably not indicative
of complexity, since it should be relatively easy for a developer to
'understand' that coupling.  On the other hand, a class with methods that
call other methods from a variety of external packages is more 'complex',
in the sense that one has to understand more of the system in order to
understand that class.  So, rather than simply count the total number of
couplings, what I think would be interesting to do is characterize the kind
of coupling and create a measure that reflects in some sense the amount of
external context that a developer needs to understand in order to
understand the class/package in question.  As a simple example, consider
two situations:

- class foo.bar.A calls 10 external methods, all of which are in class
foo.bar.B.
- class baz.qux.C calls 10 external methods, each of which is in a
different external package.

While the total number of external couplings is the same in each case
(10), I would claim that class C will require more effort to understand
than A, that changes to C are more likely to introduce errors than
changes to A, and that C is more likely than A to break when other parts
of the system change.
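To make that concrete, here is a minimal sketch (in Python, with invented
names; a real Hackystat reducer would work from the collected coupling
data) of one possible 'external context' measure: the number of distinct
packages, other than the class's own, that its calls reach.

```python
def package_of(class_name):
    """Return the package portion of a fully qualified class name."""
    return class_name.rsplit(".", 1)[0]

def external_context(source_class, called_classes):
    """One possible coupling measure: the number of distinct packages,
    other than the source class's own, that its calls reach."""
    home = package_of(source_class)
    return len({package_of(c) for c in called_classes
                if package_of(c) != home})

# foo.bar.A calls 10 methods, all in class foo.bar.B (same package):
print(external_context("foo.bar.A", ["foo.bar.B"] * 10))        # 0
# baz.qux.C calls 10 methods, each in a different external package:
print(external_context("baz.qux.C",
                       ["pkg%d.Impl" % i for i in range(10)]))  # 10
```

Under this measure the two situations above score very differently, even
though the raw coupling count is identical.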

(c) It's always easy to make claims about what a metric will do. But what
we also need to do is _validate_ a proposed metric--figure out whether and
in what sense it is measuring what we want it to measure.  So, when
designing an analysis based upon dependency/coupling, it is also important
to think about how we will evaluate that analysis.  In the case of an
analysis based upon complexity, one idea might be to perform a study in
which we compute a complexity measure for each system unit (i.e. class
or package), and then compare it to the cumulative Active Time spent on
that same unit.  If we see a
correlation, in that historically over time we spend more time on
classes/packages that rank higher in complexity, that validates the metric.
(Note that this demonstrates a relationship between a product metric
(dependency) and a process metric (Active Time)).
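As a sketch of that validation step, assuming we can obtain a complexity
score and cumulative Active Time per package (the data below is
invented), a simple Pearson correlation would look like:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented example data: complexity score and cumulative Active Time
# (hours) per package.
complexity  = {"foo.bar": 3, "baz.qux": 10, "util": 1}
active_time = {"foo.bar": 14.0, "baz.qux": 38.0, "util": 5.0}
units = sorted(complexity)
r = pearson([complexity[u] for u in units],
            [active_time[u] for u in units])
# A strongly positive r would support the metric's validity.
```

With real historical data we would of course want more units and a more
careful statistical treatment, but the shape of the study is the same.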

(d) Once we've validated the metric, telemetry starts to be very useful, by
showing us places in the system where complexity is increasing or
decreasing, which in turn predict areas of increasing/decreasing
maintenance/understanding costs.

Comments?

Philip





--On Thursday, July 21, 2005 9:39 AM -0700 Tim Shadel <[EMAIL PROTECTED]>
wrote:

[Forgot Reply-All again...]

You're probably aware of these dependency analysis tools, and the
paper by Robert Martin.  I've collected a set of dependency links I've
come across in my Furl archive.  You can see them here if you're
interested:  http://furl.net/members/shadeltd/dependency%20analysis

--Tim

On 7/21/05, Aaron Kagawa <[EMAIL PROTECTED]> wrote:
Hey Christoph,

I think a dependency reducer would be interesting. But, I'm not sure
about the exact calculations of the reducer. For example, I'm not sure
how to interpret the average dependencies.  I suppose the best thing to
do is to give it a try; besides it shouldn't take that much development
effort.

One calculation that might be interesting is something similar to the
issue reducer that calculates the delta of outgoing or incoming
dependencies.  Or maybe not, because plotting the sum of
incoming/outgoing dependencies over time should allow us to see the
changes.  Anyway, I think the best thing to do is to use a GQM-type
strategy to determine which reducers would be most useful in answering a
Question.

BTW, I've been wanting to add a Dependency entry into the
DailyProjectDetailAnalysis for a while now
<http://hackydev.ics.hawaii.edu:8080/browse/HACK-251>, but wasn't quite
sure what data it should provide.  For example, is the sum of all
dependencies actually interesting?

thanks, aaron

At 10:44 PM 7/20/2005, you wrote:
> Hi guys!
>
> What do you think about having a telemetry reduction function which can
> plot stuff like 'average dependencies per class/method/package', 'sum
> of ingoing' , 'sum of outgoing' or  'sum of all dependencies',
> filtered by workspaces?
>
> Would this be useful?
>
> Cheers,
> Christoph
