--On Wednesday, September 28, 2005 8:37 PM -1000 Mike Paulding <[EMAIL PROTECTED]> wrote:

Hi all,

The CCCC sensor has now been upgraded to be compatible with eSDTs;
however, it will only work with the version of CCCC that we distribute in
our help pages.  The version in our help pages has an XML outputter that
was written by Aaron because, at the time, CCCC did not support XML
output.

There is now a more recent version of CCCC that produces XML, but it is
not the same format as ours (actually, ours is cleaner).  Therefore, if
we want to continue to support CCCC, I suggest we rewrite the sensor to
parse the "new" XML format from the latest version of CCCC.  We don't
have to worry about frequent maintenance of our sensor, however, because
CCCC is a graduate research project that is now completed and no longer
developed.

I guess the bottom line is:  Should we continue to support the Hackystat
CCCC version (as we do now) or update to the latest (and terminal)
version of CCCC?

Best regards,
Mike

Hi Mike,

A very thought-provoking question.  Here's my reaction:

(a) Having Java size data has been quite useful to us in the Hackystat project, but even the simplest size metric (i.e. LOC) has proven to be as good as any other that we've tracked. Our telemetry streams show that LOC, Methods, and Classes co-vary almost perfectly, so the other size representations add no information.

(b) One thing we haven't been able to do yet is look at non-Java size effectively. For example, in Hackystat, I think it would be interesting and useful to track .jsp size and .xml size as well as .java size. (Indeed, one interesting telemetry stream in the future will be *.java size vs. *.docbook.xml size, which will show how documentation changes with code size.)

(c) While we've done a reasonable job of supporting multiple editors 'out of the box', we've not yet provided the same ease of use for artifact size. Providing 'out of the box' support (and by support I mean all of sensor, sensor data type, daily project analyses, reduction functions, etc) for even a simple size measure such as LOC for most/all artifact types will be a significant improvement for Hackystat's built-in facilities. The Size Summit has laid the groundwork for what needs to be done initially:
<http://www.mail-archive.com/[email protected]/msg01084.html>

These thoughts indicate to me that a reasonable direction is as follows:

(a) Continue to support LOCC and CCCC, but in 'maintenance mode'. Do not add any new functionality to them for the time being.

(b) Migrate to SCLC for tracking size of all artifact types in the Hackystat project. This means doing all of the Size Summit work, then changing hackydev to run SCLC instead of LOCC. One thing this will require us to do is evolve the UnitTest SDT to require a fileName attribute that can be used to identify the workspace/project associated with a test case. That will complicate our JUnit sensor a bit, but simplify things immensely on the analysis side (no more need for ClassWorkspaceMaps for every conceivable artifact type).
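
To illustrate why a fileName attribute would simplify the analysis side, here is a minimal sketch of resolving a test case's file path to a workspace by path prefix alone. It is not Hackystat code: the function name workspace_for, the workspace roots, and the paths are all hypothetical, and it assumes POSIX-style paths and that the deepest enclosing root should win.

```python
from pathlib import PurePosixPath

def workspace_for(file_name, workspace_roots):
    """Return the deepest workspace root that contains file_name, or None.

    Hypothetical helper: assumes each workspace is identified by a root
    directory and that file_name is an absolute POSIX-style path.
    """
    path = PurePosixPath(file_name)
    # Keep only the roots that are ancestors of the file.
    matches = [r for r in workspace_roots if PurePosixPath(r) in path.parents]
    # Prefer the most specific (deepest) root when workspaces are nested.
    return max(matches, key=lambda r: len(PurePosixPath(r).parts), default=None)
```

With this kind of lookup, the same code handles a .java test, a .jsp page, or any other artifact type, since only the path matters and no per-type class-to-workspace map is needed.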

Once this is done, the Hackystat project itself will serve as an example of how to collect and analyze multiple artifact types. Indeed, one very interesting reduction function would be one that takes a wildcard such as "*" for file type and produces a separate stream for the total current size of every type of file (based on suffix) found in the directory tree processed. Off the top of my head, for Hackystat, that would include at least *.java, *.xml, *.jsp, *.html, *.el, *.cc, and *.cs. This would be an extremely useful 'out of the box' telemetry stream, and it works best in concert with a 'generic' size counting tool such as SCLC.
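
As a rough sketch of the per-suffix aggregation behind such a wildcard stream, the following walks a directory tree and totals line counts by file suffix. It is only an illustration under stated assumptions: the names count_lines and size_by_suffix are hypothetical, and the raw physical line count is a crude stand-in for whatever a tool such as SCLC would actually report (it does not separate comments or blank lines).

```python
import os
from collections import defaultdict

def count_lines(path):
    """Count physical lines in a file (crude stand-in for a real size tool)."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)

def size_by_suffix(root):
    """Return {suffix: total line count} for every file found under root."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only the last suffix is used, so Foo.docbook.xml counts as .xml.
            suffix = os.path.splitext(name)[1] or "(none)"
            totals[suffix] += count_lines(os.path.join(dirpath, name))
    return dict(totals)
```

Running size_by_suffix once per day over a checkout would give one data point per suffix, which is the shape a separate stream per file type would need.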

In the future, we might want to return to tools like LOCC and CCCC when we decide we want to investigate more sophisticated size metrics on a language-specific basis. For the immediate future, however, my thought is that the highest priority is to provide simple, usable support for collecting and analyzing LOC-based size for a wide variety of languages.

In terms of project planning, I would see this effort as starting soon after 7.0 is released, so this might be scheduled for 7.1 or 7.2.

What do people think about this?

Cheers,
Philip
