Lorin writes:

Whenever you ask me what analysis I'd like in Hackystat, I'm never quite
sure, because much of the analysis I have done so far is exploratory, and
Hackystat just isn't the platform I envision for doing that (I use "R",
which is an open-source, Matlab-like language/environment:
www.r-project.org). I think what's been throwing me is my view of the
word "analysis".

What I would really like out of Hackystat is an easy way to track all of
the students in a class. I think Hackystat is great, but I'm clearly
trying to shoehorn it into a task it wasn't designed for. It thinks in
"projects", where multiple programmers are working on the same source
files. What I want to work with is the concept of an "assignment" (or,
even better, "experiment"), where the students are working independently,
on the same task. I'd like to see graphs that show me data for all of the
students at once, to see how much active time each student has spent so
far.

I imagine this would require some significant additional functionality on
the server side. Actually, we're getting a couple of German undergrads to
come over and work with us for a couple of months (I think around March),
and the HPCS project gets one of them. This might be a good project for
that student. There's also Mike Paulding, if this fits into his research
interest.

If the Vanderbilt people advance with their HPC plugin, and we can start
capturing data on execution time and program correctness, then I might
have some more ideas for some HPCS-specific Hackystat analysis. (Maybe
the Vanderbilt folks would be interested in implementing this
functionality...).

An Eclipse-based-Hackystat-HPCS-experimental-development-environment
would be a wonderful thing.

Hi Lorin,

I'm cc'ing the hackystat-dev-l list on this response because I think your
ideas are really great.

I completely agree that Hackystat in its current incarnation does not
fulfill its potential as experimental infrastructure, and I think that you
suggest two really excellent ideas for increasing its usability.

First, I don't think the problem is the Project representation (which is
useful in an experimental context for delineating which sensor data, over
what interval, should be used for analysis).  The problem is that we do not
provide any higher-level aggregation on top of Projects that says "Let's
look at the data from a set of users (specifying a single Project for
each)".  We, in fact, already have a module (hackyCourse) that provides
most of what you need. The limitation is that it implements a very narrow
range of analyses over the set of users.  The hackyCourse module preceded
our telemetry infrastructure, and what would be very cool would be to
enhance hackyCourse into a new module (hackyExperiment?) that brings the
power of our telemetry infrastructure to the multiple user analyses
provided by hackyCourse.
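
To make the aggregation idea concrete, here is a minimal sketch in Python.
It is not Hackystat code: the record type, the field names, and the
"experiment" mapping (one Project per user) are all hypothetical, chosen
only to illustrate summing active time per student across a set of users.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveTimeRecord:
    """Hypothetical per-day active-time summary for one user and project."""
    user: str
    project: str
    day: str        # e.g. "2005-01-10"
    minutes: int


def active_time_by_user(records, experiment):
    """Total active minutes per user, counting only each user's own project.

    `experiment` is an assumed mapping of user -> the single project that
    represents that user's work on the assignment.
    """
    # Start every enrolled user at zero so students with no data still show up.
    totals = {user: 0 for user in experiment}
    for record in records:
        if experiment.get(record.user) == record.project:
            totals[record.user] += record.minutes
    return totals
```

Each student appears in the result even with no recorded activity, which
is what a graph of "all of the students at once" would need.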

Second, I agree with your preference for R for certain kinds of
exploration.  What I would love to see is tighter integration between R and
hackystat. (hackyR?)  What I'm envisioning is a way to directly connect the
telemetry reduction functions (and/or the daily analysis functions) to R.
This adds a lot of usability by enabling you to leverage the 'data
cleaning' aspects that the reduction functions and/or daily analyses
provide.  For example, it turns out that extracting size information about
a system takes a little post-processing of the sensor data: you need,
among other things, to make sure that if you send size sensor data
twice during a day, you don't double count as a result.  If you just feed
the raw sensor data into R, you have to discover and deal with these issues
in R.  On the other hand, if we can create a way to interface to R from our
higher-level representations for data, then you're one step closer to what
you need. (And, looked at another way, it creates the possibility of you
writing new reduction functions to clean the raw data appropriately for
your purposes, which makes new telemetry streams available to us, so the
benefits go both ways.)
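
The double-counting issue can be sketched as follows. This is an
illustration in Python under assumed conventions, not the actual reduction
function: the entry layout (ISO 8601 timestamp, file path, lines of code)
and the function name are hypothetical. The idea is to keep only the
latest size entry per file per day before summing.

```python
def daily_size(entries):
    """Sum file sizes per day, using only the latest entry per file per day.

    `entries` is an iterable of (timestamp, file_path, loc) tuples, where
    timestamp is an ISO 8601 string such as "2005-01-10T17:00:00".
    This layout is an assumption made for the sketch.
    """
    latest = {}
    for timestamp, path, loc in entries:
        day = timestamp[:10]                 # "YYYY-MM-DD" prefix
        key = (day, path)
        # ISO 8601 strings in one format compare correctly as text.
        if key not in latest or timestamp > latest[key][0]:
            latest[key] = (timestamp, loc)

    totals = {}
    for (day, _path), (_timestamp, loc) in latest.items():
        totals[day] = totals.get(day, 0) + loc
    return totals
```

A naive sum over the raw entries would count a file once for every time
its size was sent that day. Note that this sketch does not carry a file's
size forward to days on which no data for it was sent.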

Creating the hackyExperiment and hackyR modules might be an excellent goal
for Mike and you to work on this summer.  Sounds like no more than a long
weekend or two of hacking to me. :-)

Cheers,
Philip
