Sounds great. Although, I think the sensors should know whether they are
counting lines of code from a "master build".
The current problem with our FileMetric data and other SDTs that can be
sent via a build (i.e., UnitTest, Coverage, Dependency) is that it is hard
to determine which snapshot is from the "master build". Currently, our
mechanism is to find the last batch of sensor data for a specific day and
use that set in the DailyProjectData. In CSDL's case, it just so happens
that no one other than hackystat-l runs the loccAll target. So, it works
out pretty well in our environment. In other organizations, multiple
developers could be running loccAll on different configurations of the
system.
Anyway, this problem would be easily solved if the "snapshot" sensors sent
a masterBuild=true property in the propertyList. The DailyProjectData
would then check for the last batch of sensor data with
masterBuild=true. If there are no batches with masterBuild=true, it would
revert to our old mechanism.
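Roughly what I have in mind, sketched in Python just for illustration. The batch structure (a dict with 'tstamp' and 'plist') and the function name are made up for this example and are not the actual DailyProjectData code:

```python
from datetime import datetime


def select_snapshot(batches):
    """Pick the batch of sensor data to use for one day.

    `batches` is a hypothetical list of dicts, one per batch, each with a
    'tstamp' (datetime) and a 'plist' (dict of string properties).
    """
    if not batches:
        return None
    master = [b for b in batches if b["plist"].get("masterBuild") == "true"]
    # Prefer batches flagged as master builds; if there are none, fall back
    # to the old rule (the last batch of the day, whoever sent it).
    candidates = master or batches
    return max(candidates, key=lambda b: b["tstamp"])


# Example: the later batch is ignored because it is not a master build.
morning = {"tstamp": datetime(2005, 8, 26, 9, 0), "plist": {"masterBuild": "true"}}
evening = {"tstamp": datetime(2005, 8, 26, 17, 0), "plist": {}}
chosen = select_snapshot([morning, evening])
```

Sites that never send the property would see no change in behavior, since everything falls through to the existing rule.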
good idea? or not?
thanks, aaron
At 04:40 PM 8/26/2005, you wrote:
Mike, Cedric, Cam, and I met for the Hackystat Size summit today. Here's
a summary of our results:
* When thinking about size metrics in Hackystat, there are a variety of
levels upon which to ponder, including:
- Size tools. Currently or soon-to-be supported tools include:
    LOCC (Mike). Java grammar-based parsing, provides 'sophisticated'
      size counts and OO metrics.
      C++ size support, minimal size data
      (comment/noncomment lines)
    SCLC (Cedric). Over a dozen languages, minimal size data
      (comment/noncomment lines)
    Cam's Parser. C++, does minimal size plus number of functions in the
      file.
- Size sensors. We currently have a sensor for LOCC only. Goal: a
generic FileMetric sensor.
- FileMetric sensor data type. The current FileMetric Sensor data needs
a redesign. See below.
- Server analyses, including DailyAnalysis, DailyProjectData, and
Reduction functions. These tend to be too Java-specific. See below.
The delegates to the summit came to the following conclusions:
1. With the advent of evolutionary sensor data types, we will be able to
redesign the FileMetric SDT. Our proposed new structure is:
   Required attributes:     Selected optional (plist) attributes
                            (used by public analyses):
   tstamp                   sourceLines
   tool                     commentLines
   fileName                 functionCount
   fileType                 functionSizeList
   nonblankLines            lastMod
   totalLines               className
This new structure satisfies the following requirements:
- The required attributes provide a minimally useful 'default' size metric
for any kind of file-based source data.
- nonblankLines can default to totalLines if the counting tool cannot
determine it.
- fileType is determined by the sensor (so that it can, say, use the first
line to detect shell script types)
- Optional attributes support the basic programming language units and
metrics.
- lastMod is useful on the client-side to eliminate redundant FileMetric
entries.
- className is useful in Java for tying Unit Test data without a file but
with a class name to the fileName.
- Other optional attributes can be provided but probably won't be
supported by public Hackystat analyses.
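As a rough illustration of this structure (a Python sketch only, not the actual SDT definition), the required attributes can be modeled as fields and everything else as a property list. The class layout, the field types, and the sample values are assumptions made for this example:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FileMetric:
    # Required attributes from the proposal above.
    tstamp: str
    tool: str
    fileName: str
    fileType: str
    totalLines: int
    # Required, but may be left out by tools that cannot count blank lines.
    nonblankLines: Optional[int] = None
    # Optional attributes (sourceLines, commentLines, lastMod, ...).
    plist: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self):
        # nonblankLines defaults to totalLines when the tool cannot
        # determine it.
        if self.nonblankLines is None:
            self.nonblankLines = self.totalLines


# Hypothetical entry from a tool that reports only total lines.
entry = FileMetric(
    tstamp="2005-08-26T16:40:00",
    tool="SCLC",
    fileName="Foo.java",
    fileType="java",
    totalLines=120,
)
```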
2. We will publish the list of fileType identifiers used by our sensors as
part of the FileMetric SDT documentation so that other sensor writers can
employ them if they desire.
3. Analyses will attempt to avoid hard-coding fileType-specific analyses.
4. Rather than provide a custom sensor for each size tool considered in
this analysis, we instead agreed upon a common XML format that all tools
will produce. Then, a single generic FileMetric sensor can be used to read
in this common XML and send it to the server. The format is basically:
<filemetrics>
  <filemetric tstamp="" tool="" fileName="" fileType="" nonblankLines=""
              totalLines="" sourceLines="" .../>
  <filemetric .../>
</filemetrics>
All of the required attributes should appear in each <filemetric> entry,
and zero or more of the optional ones can appear. Sensors are, of course,
free to implement their own optional attributes, but they will require
additional analysis capabilities to be written.
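To make the idea of a single generic sensor concrete, here is a minimal Python sketch of the reading side. It assumes the element and attribute names shown above; the function name, the sample values, and the choice to reject entries with missing required attributes are assumptions for illustration, and sending the data to the server is left out:

```python
import xml.etree.ElementTree as ET

REQUIRED = ("tstamp", "tool", "fileName", "fileType", "nonblankLines", "totalLines")


def read_filemetrics(xml_text):
    """Parse the common XML format into (required, optional) attribute pairs."""
    root = ET.fromstring(xml_text)
    if root.tag != "filemetrics":
        raise ValueError("expected <filemetrics> root, got <%s>" % root.tag)
    entries = []
    for elem in root.findall("filemetric"):
        attrs = dict(elem.attrib)
        missing = [name for name in REQUIRED if name not in attrs]
        if missing:
            raise ValueError("%s: missing %s" % (attrs.get("fileName", "?"), missing))
        # Split required attributes from the optional (plist) ones.
        required = {name: attrs.pop(name) for name in REQUIRED}
        entries.append((required, attrs))
    return entries


# Hypothetical tool output with one entry and one optional attribute.
SAMPLE = """
<filemetrics>
  <filemetric tstamp="2005-08-26T16:40:00" tool="SCLC" fileName="Foo.java"
              fileType="java" nonblankLines="100" totalLines="120"
              sourceLines="80"/>
</filemetrics>
"""
entries = read_filemetrics(SAMPLE)
```

Whatever is left over after the required attributes are removed is passed through untouched, which is how a tool's own optional attributes would reach the server.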
5. The steps in carrying out this redesign are:
a. Philip finishes evolutionary sensor data types.
b. FileMetric data is evolved to this new format.
c. FileMetric sensor supporting common XML format is implemented.
d. Size tools are modified to produce the common XML format.
e. Analyses are modified to operate on new FileMetric format.
Volunteers for steps (b) through (e) will be solicited following the
conclusion of step (a).
6. As a side note, one delegate to the summit (Cam) complained
vociferously about the bogosity of current UnitTest analyses, which are
totally Java- and CSDL-specific, and effectively require each unit test to
have an associated FileMetric entry identifying the fileName associated
with the test. The other delegates responded with thundering applause to
this denouncement. Be it resolved: As soon as we clean up size, we are
going to move on to unit testing!
Submitted for your approval.
Philip
Secretary, Hackystat Size Summit
August 2005