[HACKYSTAT-DEV-L] Size Metrics in Hackystat [was: CCCC sensor upgrade?]

Philip Johnson Thu, 29 Sep 2005 10:58:53 -0700

--On Wednesday, September 28, 2005 8:37 PM -1000 Mike Paulding<[EMAIL PROTECTED]> wrote:

Hi all,


The CCCC sensor has now been upgraded to be compatible with eSDTs.
However, it will only work with the version of CCCC that we distribute in
our help pages.  The version in our help pages has an XML outputter that
was written by Aaron because, at the time, CCCC did not support XML
output.

There is now a more recent version of CCCC that produces XML, but it is
not the same format as ours (actually, ours is cleaner).  Therefore, if
we want to continue to support CCCC, I suggest we rewrite the sensor to
parse the "new" XML format from the latest version of CCCC.  We don't
have to worry about frequent maintenance of our sensor, however, because
CCCC is a graduate research project that is now completed and no longer
developed.

I guess the bottom line is:  Should we continue to support the Hackystat
CCCC version (as we do now) or update to the latest (and terminal)
version of CCCC?

Best regards,
Mike


Hi Mike,

A very thought-provoking question.  Here's my reaction:

(a) Having Java size data has been quite useful to us in the Hackystat project, but even the simplest form of size metric (i.e. LOC) has been proven to be as good as any other that we've tracked. Our telemetry streams demonstrate that LOC, Methods, and Classes co-vary almost perfectly, so there's no added information from the other size representations.

(b) One thing we haven't been able to do yet is look at non-java size effectively. For example, in Hackystat, I think it would be interesting and useful to track .jsp size and .xml size as well as .java size. (Indeed, one interesting telemetry stream in future will be *.java size vs. *.docbook.xml size, which will show how documentation changes with code size).

(c) While we've done a reasonable job of supporting multiple editors 'out of the box', we've not yet provided the same ease of use for artifact size. Providing 'out of the box' support (and by support I mean all of sensor, sensor data type, daily project analyses, reduction functions, etc) for even a simple size measure such as LOC for most/all artifact types will be a significant improvement for Hackystat's built-in facilities. The Size Summit has laid the groundwork for what needs to be done initially:

<http://www.mail-archive.com/[email protected]/msg01084.html>

These thoughts indicate to me that a reasonable direction is as follows:

(a) Continue to support LOCC and CCCC, but in 'maintenance mode'. Do not add any new functionality to them for the time being.

(b) Migrate to SCLC for tracking size of all artifact types in the Hackystat project. This means doing all of the size summit work, then changing hackydev to run SCLC instead of LOCC. One thing this will require us to do is evolve the UnitTest SDT to require a fileName attribute that can be used to identify the workspace/project associated with a test case. That will complicate our JUnit sensor a bit, but simplify things immensely on the analysis side (no more need for ClassWorkspaceMaps for every conceivable artifact type).

When this is done, then the Hackystat project itself will serve as an example of how to collect and analyze multiple artifact types. Indeed, one very interesting reduction function would be one that takes a wildcard such as "*" for file type, and which produces as a result a separate stream for the total current size of every type of file (based on suffix) found in the directory tree processed. Off the top of my head, for Hackystat, that would include at least *.java, *.xml, *.jsp, *.html, *.el, *.cc, and *.cs. This would be an extremely useful 'out of the box' telemetry stream, which works best in concert with a 'generic' size counting tool such as SCLC.
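
To make the wildcard idea concrete, here is a rough sketch in Python (not Hackystat code, and not how SCLC works internally). It assumes "size" simply means non-blank lines, groups files by their last suffix only (so *.docbook.xml would land under *.xml), and uses hypothetical names (count_loc, size_by_suffix, streams_from_snapshots) that do not exist in Hackystat.

```python
import os
from collections import defaultdict


def count_loc(path):
    """Count non-blank lines in a file (a crude stand-in for a real counter)."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return sum(1 for line in f if line.strip())


def size_by_suffix(root):
    """Walk a directory tree and total the size of each file type by suffix.

    Returns a dict such as {"*.java": 1234, "*.xml": 567}.
    Files without a suffix are skipped.
    """
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            suffix = os.path.splitext(name)[1]
            if not suffix:
                continue
            totals["*" + suffix] += count_loc(os.path.join(dirpath, name))
    return dict(totals)


def streams_from_snapshots(snapshots):
    """Turn [(day, {pattern: size}), ...] into {pattern: [(day, size), ...]}.

    Each resulting list is one stream: the size of one file type over time.
    """
    streams = defaultdict(list)
    for day, totals in snapshots:
        for pattern, size in totals.items():
            streams[pattern].append((day, size))
    return dict(streams)
```

Running size_by_suffix once per day over a checkout and feeding the results to streams_from_snapshots would give one stream per suffix, which is roughly what the "*" wildcard would produce.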

In the future, we might want to return to tools like LOCC and CCCC when we decide we want to investigate more sophisticated size metrics on a language-specific basis. For the immediate future, however, my thought is that the highest priority is to provide simple, usable support for collecting and analyzing LOC-based size for a wide variety of languages.

In terms of project planning, I would see this effort as starting soon after 7.0 is released, so this might be scheduled for 7.1 or 7.2.


What do people think about this?

Cheers,
Philip
