Recently a researcher sent me the following question. I've been asked this before, and so I think the answer might be of interest to the general Hackystat user and developer community. Plus, if you think differently, I'd like to know about it!
Here's the question, slightly edited:

> I recently came across HackyStat while
> looking for ways to collect data from Eclipse IDE and SVN.
> I need a way to grab the log of Eclipse activities such as
> users moving, renaming or deleting files. Furthermore, I need a way to
> grab log from SVN such as those involving check-in, check-out, commit,
> merge conflicts, and so on. I downloaded and installed HackyStat with the
> Eclipse sensor to try it out, and it seems to be doing just that.
> However, I am wondering if there is a way for me to use the data that is
> currently uploaded to the public server directly? As in, during the
> process of sending them to the server, I would like to parse those data
> for use of the research project locally. Is that possible?

My answer was as follows:

Greetings,

Whether or not Hackystat is appropriate for what you want to do really depends on what you're trying to do. That may seem a little circular, so let me explain a few of the design decisions in Hackystat and how they might impact on its suitability for your research.

First, Hackystat sends data from its client sensors to the server using a pretty straightforward SOAP protocol, and all client-side data goes through a common middleware component (called the "SensorShell"). So, from this perspective, it would indeed be possible to get the data "directly".

However, the Hackystat sensor for SVN is a 'polling' sensor rather than an 'interrupt' style sensor. In other words, it is implemented as a program that queries the SVN server once a day, gathers data about the SVN commits, and sends that to the Hackystat server (with the timestamps associated with the actual time of the commit). Right now we don't actually gather any data other than commits, and I don't think our current sensor could even get at things like merge conflicts, which would probably require a sensor attached directly to the SVN client.
We could write a client SVN sensor for Hackystat, of course, but we haven't yet.

Another issue is the immediacy with which data is available for processing. Hackystat is designed around a "non hard real time" assumption. In other words, our analyses generally assume that complete data about the software development process may not be available for minutes or even hours after the events in question have occurred. This allows us to do things like buffer events on the client side and send them to the server as a bundle every 10 minutes, or allow people to work on airplanes and have their data sent to the server after they land and reconnect to the net, or allow sensors to be designed like the SVN sensor, which runs only once a day and "fills in" configuration management data about the previous day.

Another issue has to do with collaboration. Hackystat is intended to allow modeling and analysis of groups of people collaborating in various ways with various tools at various times. So, we have explicit representations for "projects" involving "members", and our representations enable us to know when two people worked on the same file during a project, even if one person was on a Windows machine and the file was named c:\mysvn\foo\bar.java and another person was on a Mac and the file was named /users/johnson/svnstuff/foo/bar.java.

What all of this adds up to is the following: if, for example, what you want to do is to provide 'hard' real-time feedback (i.e. feedback within no more than a few seconds after the precipitating event) to a single programmer doing work only with SVN and Eclipse, then you are probably better off rolling your own mechanism (maybe borrowing code from our sensor implementation to get you started).
If, however, you don't really need hard real-time feedback, and you want to 'scale' your application (to support groups of collaborating developers and/or events associated with many different kinds of development tools), then some kind of Hackystat-based enhancement might make sense.

That's about the best I can do without more details on your project. Please feel free to send me email with more details if you would like.

Cheers,
Philip
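
To make the point about recognizing the same file on different machines a bit more concrete, here is a minimal sketch in Python. It is not Hackystat's actual implementation. It assumes that each developer's local checkout root is known, and the helper name `project_relative` is hypothetical. The idea is simply to strip the machine-specific prefix so that both paths reduce to the same project-relative key.

```python
from pathlib import PurePosixPath, PureWindowsPath


def project_relative(path: str, workspace_root: str) -> str:
    """Return `path` relative to `workspace_root`, using forward slashes.

    Hypothetical helper, not part of Hackystat. Inputs that contain a
    backslash or a drive letter are parsed as Windows paths; everything
    else is parsed as a POSIX path.
    """
    is_windows = "\\" in path or (len(path) > 1 and path[1] == ":")
    flavour = PureWindowsPath if is_windows else PurePosixPath
    # relative_to raises ValueError if path is not under workspace_root.
    relative = flavour(path).relative_to(flavour(workspace_root))
    return relative.as_posix()


# The two file names from the email, with assumed checkout roots.
windows_key = project_relative(r"c:\mysvn\foo\bar.java", r"c:\mysvn")
mac_key = project_relative(
    "/users/johnson/svnstuff/foo/bar.java", "/users/johnson/svnstuff"
)

# Both developers' events can now be grouped under the same key.
same_file = windows_key == mac_key
```

A real system would also need some way to learn each member's checkout root for a project; in this sketch that mapping is simply passed in by hand.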
