Ok.. I actually didn't mean total time interacting with an IDE. I meant IDE interaction with a file. And I'm pretty sure that we didn't intend for it to be an effort proxy.
Here I attempt to defend the initial idea. :).
The first two concerns are Hackystat limitations. The second two are research implication limitations.
(a) We already capture Buffer Transitions, Open, Close, Build, etc., which record Activities with a File. By the way, I tried to find out which other activities are supported and I couldn't find that information in our documentation.
(b) Hopefully, we can sense all the activities in (a) with all IDEs. To my knowledge we can do that for Emacs, JBuilder, and Eclipse. Ah, but we are already too far into Eclipse. For example, I'm pretty sure that Hongbing's Development Streams won't really work best with the current Emacs or JBuilder sensor. So, when we reach out to the TDD community we'll probably have to say that they have to use Eclipse and the Eclipse sensor. Which I think is reasonable for this thesis research. Not to forget about Jupiter's reliance on Eclipse.
(c) "IDE interaction with a File" time is not a proxy for effort; it is simply the time you spend interacting with a file in ways other than state changes.
(d) I don't think "IDE interaction with a File" time was intended to be added to Active Time or to replace it. Rather, it helps explain the events leading up to your Active Time. I agree that the most reliable Proxy for Effort is State Change Active Time. But, on the other hand, other "activity" times could be interesting.
When Burt mentioned this idea, I wasn't thinking about it as an Effort Proxy. Instead, I was thinking of it as an explanation of my Active Time. So, let's assume the four problems are magically solved. Then it would be cool to know that you spent more time using the debugger than writing code. Or that you spent more time browsing code than editing. Philip mentioned long ago that someone at wanted to know about their Buffer Transitions in an IDE, because they could signify some sort of "knowledge coupling". Would it be so bad to say, "you spent 5 minutes buffer transitioning before editing code"? Maybe the length of time of your buffer transitions also signifies the severity of the "knowledge coupling".
We already associate time with IDE events on a file in Hongbing's Development Stream analysis. In fact, it is almost the case that the non-editing micro-level activities are more important than the actual State Changes. I think it would be a great thing to know that even though my Episodes took 5 hours, I only had 10 minutes of Active Time. Again, Episode Time isn't a proxy for effort. It's just another way of looking at your development time.
Here is a real-life example. Take a look at the attached file. This file contains my Daily Diary from some day. You'll notice that I have more (no files edited) entries than Most Active Files entries. What the heck was I doing? According to this, there are times when I just don't edit anything; instead, it seems that I interacted with files for a long period of time. I hope that I don't always work like this. But I'll never be able to know whether I do this all the time if we can't associate a time metric with (no files edited). Wouldn't it be crazy if I actually spent more time opening, closing, buffer transitioning, and using the debugger than actually editing code? Now I really want to know my "IDE Interaction with a File without editing it" time.
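To make the idea a bit more concrete, here is a rough sketch of how a time could be attached to non-editing events. This is not how Hackystat computes anything today. The event names, the (timestamp, file, activity type) tuple layout, the function names, the sample data, and the five-minute idle cap are all assumptions I made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical activity type that counts as "editing"; the real sensor
# vocabulary may differ.
EDIT_TYPES = {"StateChange"}


def activity_time(events, idle_cap=timedelta(minutes=5)):
    """Sum the time credited to each (file, activity type) pair.

    events: list of (timestamp, file, activity_type), sorted by timestamp.
    The gap between an event and the next one is credited to the earlier
    event, capped at idle_cap so a long break is not counted as interaction.
    The last event has no successor and is credited nothing.
    """
    totals = defaultdict(timedelta)
    for (ts, path, kind), (next_ts, _, _) in zip(events, events[1:]):
        totals[(path, kind)] += min(next_ts - ts, idle_cap)
    return totals


def split_edit_vs_other(totals):
    """Return (editing time, non-editing time) summed over all files."""
    edit = sum((t for (_, k), t in totals.items() if k in EDIT_TYPES), timedelta())
    other = sum((t for (_, k), t in totals.items() if k not in EDIT_TYPES), timedelta())
    return edit, other


# Made-up sample data, not taken from the attached Daily Diary.
t0 = datetime(2005, 1, 1, 9, 0)
events = [
    (t0, "Foo.java", "Open"),
    (t0 + timedelta(minutes=2), "Bar.java", "BufferTransition"),
    (t0 + timedelta(minutes=5), "Foo.java", "BufferTransition"),
    (t0 + timedelta(minutes=6), "Foo.java", "StateChange"),
    (t0 + timedelta(minutes=8), "Foo.java", "Build"),
    (t0 + timedelta(minutes=40), "Foo.java", "Close"),
]

totals = activity_time(events)
edit, other = split_edit_vs_other(totals)
```

With this made-up data, 2 minutes are credited to editing and 11 minutes to everything else, and the 32-minute gap after the Build is capped at 5 minutes. Crediting each gap to the earlier event is only one possible choice, and the cap is exactly the kind of heuristic that would need to be checked before the number could be trusted.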
To conclude, I think that this idea of other "Activity" Times has merit, but not as a proxy for effort. I agree that there are a few Hackystat problems and research-type problems that need to be solved before this "activity" time would become reliable and useful.
thanks, aaron
At 07:00 AM 3/31/2005, you wrote:
Aaron and I were discussing a new possible metric that would capture the total time that a person spends interacting with an IDE.
I've said it before, and I'll say it again: I have no problem with dreaming up alternative proxies for effort. However, when we discuss the design of a new proxy, we need to keep in mind the following:
(a) One of the key foundations of Hackystat analyses is that data can be attached to a Project. For data to be attached to a Project, the data (normally) has to be associated with a File. Data that is not associated with a File, or is not associated with a File associated with a Project, but which should nonetheless be counted as 'time spent on that Project', will require some other kind of mechanism to make the association. That processing should be reliable, in the sense that you don't inadvertently over-estimate or misrepresent the time spent on that Project. If a new proxy introduces large errors and variances due to "heuristic" attachment of time to Projects, that's not much of a win.
(b) Although Eclipse is the Java IDE du jour, many developers use other IDEs. CSDL built a very large collaborative environment called Egret (~100 KLOC) in the early 1990s using the state-of-the-art editor at that time (Emacs). It was very cool, and very powerful, and what killed that project, as much as anything else, was its reliance on the facilities available only in a single IDE. I don't want to repeat that mistake by evolving everything toward Eclipse-specific features such that our answer to everyone who wants to use Hackystat is, "OK, fine, just switch to Eclipse first and look at all the neat things you can do".
(c) Even if we could accurately measure every single second spent in the IDE, we would still not be close to measuring overall "effort" on a Project. I spend a _lot_ of my day in conversations with you all, in meetings, reading papers, in teleconferences, and so forth related to Hackystat and Hackystat-related projects. None of that time is being captured, and wouldn't be with an alternative proxy.
(d) How will we use this alternative proxy, and what will it provide analytically that Active Time doesn't? I'm not saying that no alternative effort proxy can provide anything in addition to Active Time. What I'm saying is that if you start with the goal and how it will be different from the way we currently use Active Time, then it makes the discussion of the measure more fruitful. Currently, we use Active Time principally to measure regularity and consistency and to obtain baselines. Whether the baseline number is 12 or 37 is not that meaningful to us. If we substitute a new measure, and find that the baseline number is now 15 or 62, but are still measuring only regularity and consistency, have we actually gained anything?
Cheers, Philip
DailyDiary.xls
