Hey Guys,

My progress installing and using Hackystat has been going better over the last couple of weeks. I've made some hack changes to the Locc sensor (via Ant), made some compromises on which accounts I'm sending data to, and a few other things. But at least I'm beginning to collect data.

Anyway, this email is just a progress update on my use of Hackystat. It primarily serves as a reminder to me and as an FYI for future development in Hackystat.

What's working:
- sending FileMetric data via the Ant Locc sensor, but had to exclude specific files from the Ant FileSet because of Java 1.5 parsing errors. So, I'm not getting all the FileMetric data available.
- sending UnitTest data.
- sending Coverage data, via my Emma Sensor.
- sending CodeIssue data, via my Checkstyle Sensor and CodeIssue SDT.
- sending Issue data, via my Jira Sensor

(Note that the UnitTest and Eclipse sensors were the only sensors I could use right out of the box.)


Here is a short list of the features that I need in order to use Hackystat in the most effective way:
1) Nested workspace roots (I believe that Hongbing made a fix for this, but I haven't tried it yet)
2) Ant sensors that use the Sensorshell Usermap, so I can send specific project data to specific Hackystat accounts.
3) Build sensor improvements - some way of disabling or specifying which targets can fail the build.  Basically, I don't want checkstyle errors to indicate a build failure.
4) Subversion sensor - I haven't tried this yet, partially because I'm not too sure I know how to use this sensor correctly.
5) A way to pull reports from Hackystat.  Cedric and I implemented a mechanism to pull Charts from Hackystat, but that requires the use of a Hackystat key and is subject to reaping. 
6) A Snapshot DailyProjectUnitTest implementation.

Also, I've discovered some problems in the Snapshot-style DailyProjectData implementations.  I'll try to explain.

DailyProjectJavaFileMetric uses a module mapping, which allows me to send Java Locc data per module (top-level workspace, i.e. hackyStdExt, hackyKernel, etc). However, if I ever change my build process to send the Java Locc data for all modules at once (let's call this per project), DailyProjectJavaFileMetric will report bogus size values. Thus, the problem is that my knowledge of how DailyProjectJavaFileMetric works dictates the way I set up my FileMetric sensor. Basically, I have to follow the Ant build process used for Hackystat for my DailyProjectJavaFileMetric to work correctly.

By the way, I thought that this was totally bogus, and it is a good example of the CSDL-specific implementations.

    boolean bcmlFilter = ServerProperties.getInstance().getProperty("BcmlFilemetricFilter", true);
.... <snip code>  ....
    if ((!bcmlFilter || "LOCC".equals(entry.getTool())) && this.isAcceptedFileExtension(entry)) {

Instead, we should have implemented a ToolFilemetricFilter property that supports comma-separated values, set it to "BCML", and eliminated the hard-coded "LOCC".equals(entry.getTool()) check.
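
A rough sketch of what such a filter could look like. This is a hypothetical class, not code from the Hackystat tree: the class shape, the case-insensitive matching, and the "empty value accepts every tool" rule are all assumptions made for illustration. The property value would still be read through ServerProperties; here it is simply passed in as a String.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical filter: accepts entries whose tool name is in a comma-separated list. */
final class ToolFilemetricFilter {
  private final Set<String> acceptedTools;

  /**
   * @param propertyValue comma-separated tool names, e.g. "BCML" or "LOCC, BCML".
   *     Assumption: null or blank means "do not filter by tool".
   */
  ToolFilemetricFilter(String propertyValue) {
    Set<String> tools = new LinkedHashSet<>();
    if (propertyValue != null) {
      for (String token : propertyValue.split(",")) {
        String trimmed = token.trim();
        if (!trimmed.isEmpty()) {
          // Assumption: tool names are compared case-insensitively.
          tools.add(trimmed.toUpperCase(Locale.ROOT));
        }
      }
    }
    this.acceptedTools = Collections.unmodifiableSet(tools);
  }

  /** Returns true if the given tool passes the filter. */
  boolean accepts(String tool) {
    if (acceptedTools.isEmpty()) {
      return true; // no tools configured, so nothing is filtered out
    }
    return tool != null && acceptedTools.contains(tool.toUpperCase(Locale.ROOT));
  }
}
```

With something like this, the condition above would reduce to a call such as filter.accepts(entry.getTool()) combined with the existing file extension check, and no tool name would be hard-coded in the DailyProjectData class.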


In contrast to DailyProjectJavaFileMetric, DailyProjectCoverage must process data per project, because it doesn't use a module mapping and only uses a runtime mapping. Thus, if I send Coverage data per module, each snapshot overrides the previous runtime and I end up with a DailyProjectCoverage that only provides coverage results for one module. Again, the implementation of DailyProjectCoverage dictates the way I set up my Coverage sensor (which is the Hackystat Ant build process way).

In addition to these specific problems, you should see that DailyProjectJavaFileMetric and DailyProjectCoverage handle snapshot data in exactly opposite ways. So, basically, I have to send FileMetric data per module and I have to send Coverage data per project. That isn't very intuitive.
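
To make the Coverage case concrete, here is a toy model of a "latest runtime wins" selection. It is only a sketch: CoverageEntry, its fields, and LatestRuntimeOnly are made-up names, and the real DailyProjectCoverage code is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy stand-in for one Coverage entry; the fields are assumptions, not the real SDT. */
final class CoverageEntry {
  final String module;
  final long runtime;

  CoverageEntry(String module, long runtime) {
    this.module = module;
    this.runtime = runtime;
  }
}

/** Toy model of a runtime-only mapping: only the newest runtime of the day is kept. */
final class LatestRuntimeOnly {
  /** Keeps the entries whose runtime equals the largest runtime seen. */
  static List<CoverageEntry> select(List<CoverageEntry> entries) {
    long latest = Long.MIN_VALUE;
    for (CoverageEntry e : entries) {
      latest = Math.max(latest, e.runtime);
    }
    List<CoverageEntry> kept = new ArrayList<>();
    for (CoverageEntry e : entries) {
      if (e.runtime == latest) {
        kept.add(e);
      }
    }
    return kept;
  }

  /** Returns the sorted set of module names present in the given entries. */
  static Set<String> modules(List<CoverageEntry> entries) {
    Set<String> names = new TreeSet<>();
    for (CoverageEntry e : entries) {
      names.add(e.module);
    }
    return names;
  }
}
```

If hackyKernel is sent with runtime 100 and hackyStdExt with runtime 200, select() keeps only the hackyStdExt entries. If both modules are sent in one batch with the same runtime, both survive. That is the per-project requirement described above.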

Furthermore, this should illustrate that the Snapshot DailyProjectData implementations are all drastically different. For example, a while ago I implemented DailyProjectDependency (also a snapshot DailyProjectData implementation) and modeled it after DailyProjectCoverage. However, when I implemented DailyProjectCodeIssue (also a snapshot DailyProjectData implementation), I modeled it after DailyProjectJavaFileMetric. So, you can see that the two problems I've identified above also got propagated to other DailyProjectData implementations.

I believe that the DailyProjectData implementations for Snapshot data should be more generalized somehow. There should be one class that determines which set of data the DailyProjectData should use, instead of each implementation figuring it out for itself. It should also support sending data per module (i.e. DailyProjectJavaFileMetric style) __and__ sending per project (DailyProjectCoverage style). Again, the way the sensors send data shouldn't affect the way that these Project-level functions behave. I believe that adding an attribute to the SDT to distinguish master builds from developer builds will also help, but it isn't critical to the improvements I suggested. I wrote an HTML page that talks about this at <http://hackystat.org/hackyDevSite/doc/SnapshotEnhancements.html>, but that was a while ago and it does not reflect the new eSDTs.
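
One possible shape for such a shared class, as a sketch only. SnapshotEntry and SnapshotSelector are hypothetical names, and the rule used here (for each module, keep the entries from that module's most recent runtime) is just one candidate policy, not something taken from the existing code or from the page above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical snapshot entry; the fields are assumptions for illustration. */
final class SnapshotEntry {
  final String module;
  final long runtime;
  final String file;

  SnapshotEntry(String module, long runtime, String file) {
    this.module = module;
    this.runtime = runtime;
    this.file = file;
  }
}

/** Hypothetical shared selector that every snapshot DailyProjectData could delegate to. */
final class SnapshotSelector {
  /** For each module, keeps only the entries from that module's most recent runtime. */
  static List<SnapshotEntry> select(List<SnapshotEntry> entries) {
    // First pass: find the newest runtime seen for each module.
    Map<String, Long> latestByModule = new HashMap<>();
    for (SnapshotEntry e : entries) {
      latestByModule.merge(e.module, e.runtime, Long::max);
    }
    // Second pass: keep the entries that belong to that newest runtime.
    List<SnapshotEntry> kept = new ArrayList<>();
    for (SnapshotEntry e : entries) {
      if (e.runtime == latestByModule.get(e.module)) {
        kept.add(e);
      }
    }
    return kept;
  }
}
```

Under this policy a per-project batch (one runtime shared by all modules) and a series of per-module batches (one runtime each) both yield one current snapshot per module. The tradeoff is that a module that stops being sent keeps its last snapshot for the day, which may or may not be what an analysis wants.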

Lastly, our implementation of DailyProjectUnitTest is aggregate for Hackystat developers, but is also Snapshot for the hackystat-dev-l user (because we have a nightly build as opposed to an hourly build). If an analysis needs to get a snapshot of the unit tests, it would need to know the "daily build user", get those specific values, and hope that hackystat-dev-l only sent one batch of unit test data for that day.


There is probably a lot of other CSDL-specific functionality that has crept into the code base. But that is perfectly OK, since we have been the primary users thus far. I just think that at some point these improvements need to be addressed. I'd be happy to slowly chip away at these issues; the only catch is that I do my Hackystat hacking after work hours.


thanks, aaron
