Greetings Andy, Lorin, Patrick, and the hackystat dev crew,
I am writing to summarize recent work on command line sensor data collection and analysis
and to propose a future direction.
The background: understanding HPC software development involves gaining insight into the
behaviors and outcomes of developer activities, which in this domain involve a great deal
of (Unix) command line interaction. In previous work, we created a command line sensor
that uses the shell "history" mechanism to obtain timestamped data consisting of the
developer's commands. This sensor was quite easy to write, but it has one substantial
shortcoming: it does not capture any information about the results of the commands invoked.
In the "next generation" command line sensor, we would like to capture both (a) what
command was entered and at what time, and (b) what the results of the command were (and
at what time these results appeared).
To satisfy these requirements, we would like to propose the following design
decomposition into three tools:
[1] A CLI logging mechanism. This tool would be similar, but not identical, to the
"script" command in Unix. The difference is that 'script' does not provide a timestamp
for each command, the command's results, or the current working directory associated
with the command. The user would invoke the logging mechanism in much the same way as
the script command, and our tool would create a log file (preferably, but not
necessarily, in XML format) similar to the following:
<cli-logger>
<cli-logger-entry>
<invocation time="1134730020809" current-directory="/user/home/johnson/"
machine="bertha.ics.hawaii.edu">
ls -la
</invocation>
<result time="11347300245673">
drwxr-xr-x 36 johnson csdl 4096 Dec 15 16:13 ./
drwxr-xr-x 40 root root 1024 Feb 11 2005 ../
-rw------- 1 root other 76 Feb 3 2001 .TTauthority
-rw------- 1 johnson csdl 916 Jan 22 2004 .Xauthority
-rw-r----- 1 johnson csdl 612 Jul 22 1999 .Xdefaults
</result>
</cli-logger-entry>
<cli-logger-entry>
<invocation time="1134732837465"
current-directory="/user/home/johnson/svn/hackyCore_Build"
machine="bertha.ics.hawaii.edu">
ant -q quickStart
</invocation>
<result time="11347567893">
[echo] (12:40:04) Completed hackyCore_Build.checkModuleAvailability
[echo] (12:40:11) Completed all.compile
[echo] (12:40:43) Completed all.install.pre-sensorshell
[echo] (12:41:01) Completed hackyCore_Build.installSensorShell
[echo] (12:41:44) Completed all.install.post-sensorshell
[echo] (12:41:44) Completed hackyCore_Build.deployTestData
BUILD FAILED
C:\svn\hackyCore_Build\tomcat.build.xml:14: Tomcat does not appear to be running on
http://localhost:8080/
Total time: 1 minute 45 seconds
</result>
</cli-logger-entry>
</cli-logger>
One should be able to build this tool by slight modifications to an open source
distribution of 'script'. Another idea would be to modify the "sudoscript"
<http://www.egbok.com/sudoscript/> tool.
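As a very rough illustration of what Tool [1] would capture, here is a minimal Python
sketch. It is purely hypothetical and is not the proposed modification of 'script': it
runs each command in a fresh subprocess (so stateful commands like 'cd' would not
persist), and is only meant to make the timestamp/directory/machine capture concrete.

#!/usr/bin/env python
# Minimal cli-logger sketch (illustrative only, not a pty-based 'script' replacement).
# Reads a command, runs it in a shell, and logs the invocation and its output with
# millisecond timestamps, the current working directory, and the host name.
import os
import socket
import subprocess
import time
from xml.sax.saxutils import escape

def now_millis():
    # Millisecond timestamp, matching the example log format.
    return int(time.time() * 1000)

def log_session(log_path="cli-logger.xml"):
    host = socket.gethostname()
    with open(log_path, "w") as log:
        log.write("<cli-logger>\n")
        while True:
            try:
                command = input("cli-logger> ")
            except EOFError:
                break
            if command.strip() in ("exit", "quit"):
                break
            invoked = now_millis()
            cwd = os.getcwd()
            # Run the command and capture stdout and stderr together.
            proc = subprocess.run(command, shell=True, stdout=subprocess.PIPE,
                                  stderr=subprocess.STDOUT, text=True)
            finished = now_millis()
            log.write('<cli-logger-entry>\n')
            log.write('<invocation time="%d" current-directory="%s" machine="%s">\n'
                      % (invoked, escape(cwd, {'"': "&quot;"}), host))
            log.write(escape(command) + "\n")
            log.write('</invocation>\n')
            log.write('<result time="%d">\n' % finished)
            log.write(escape(proc.stdout))
            log.write('</result>\n</cli-logger-entry>\n')
        log.write("</cli-logger>\n")

if __name__ == "__main__":
    log_session()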
[2] A command logger file post-processor. Tool [1] is intended to be quite generic and
just capture everything. However, we wouldn't want to send literally everything in the
command shell off to Hackystat for a variety of reasons. Thus, Tool [2] would be a
post-processor for the output of Tool [1], which would figure out what's worth saving
from the log depending upon the specific needs of the research, and generate another XML
file containing the actual sensor data, such as:
<sensor>
<entry tstamp="1134730020809" tool="Cli-Logger" machine="bertha.ics.hawaii.edu"
command="ls -la" results=""/>
<entry tstamp="1134730020809" tool="Cli-Logger" machine="bertha.ics.hawaii.edu"
command="ant -q quickStart" results="failed"/>
</sensor>
The details will clearly turn out to be different, but the idea is that in a particular
research context like HPC, we might not care at all about the output from "ls -la", and
only care whether the build failed or not when invoking "ant".
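To make the filtering idea concrete, here is one possible Python sketch of the
post-processor, assuming the cli-logger XML format above. The rule used here (keep only
'ant' invocations, record pass/fail by looking for "BUILD FAILED", and discard all other
output) and the file names are illustrative assumptions, not a fixed design.

#!/usr/bin/env python
# Illustrative Tool [2] sketch: read a cli-logger file, decide what is worth
# keeping, and write a sensor-data XML file in the format shown above.
import xml.etree.ElementTree as ET

def post_process(log_path="cli-logger.xml", out_path="cli-sensor.xml"):
    root = ET.parse(log_path).getroot()
    sensor = ET.Element("sensor")
    for entry in root.findall("cli-logger-entry"):
        invocation = entry.find("invocation")
        result = entry.find("result")
        command = (invocation.text or "").strip()
        output = (result.text or "") if result is not None else ""
        if command.startswith("ant"):
            # For builds we only care whether they passed or failed.
            results = "failed" if "BUILD FAILED" in output else "passed"
        else:
            # For commands like 'ls -la' we keep the command but drop the output.
            results = ""
        ET.SubElement(sensor, "entry",
                      tstamp=invocation.get("time", ""),
                      tool="Cli-Logger",
                      machine=invocation.get("machine", ""),
                      command=command,
                      results=results)
    ET.ElementTree(sensor).write(out_path)

if __name__ == "__main__":
    post_process()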
[3] A Hackystat sensor for the output of the post-processor. This basically takes the
post-processed data and sends it to Hackystat.
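And a very small sketch of the shape Tool [3] might take, again in Python for
consistency: it walks the post-processed sensor file and hands each entry off for
transmission. The send_to_hackystat() function is a hypothetical placeholder; the real
sensor would use the Hackystat SensorShell (not shown here) to send the data.

#!/usr/bin/env python
# Illustrative Tool [3] sketch: read the post-processed sensor file and hand each
# entry to Hackystat. send_to_hackystat() is a placeholder; the real tool would
# transmit the attributes via the Hackystat SensorShell.
import xml.etree.ElementTree as ET

def send_to_hackystat(attributes):
    # Placeholder for the actual SensorShell call.
    print("would send:", attributes)

def run_sensor(sensor_path="cli-sensor.xml"):
    root = ET.parse(sensor_path).getroot()
    for entry in root.findall("entry"):
        send_to_hackystat(dict(entry.attrib))

if __name__ == "__main__":
    run_sensor()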
Once the data is in Hackystat, it can be merged with other developer data (such as data
collected from an IDE like Eclipse), used as input to a Markov model generator or a
workflow analysis engine like PROM, exported to some other environment along with other
sensor data, etc.
If this seems reasonable to you all, then I would like to propose that we split up the
work, with Lorin/Patrick/Andy taking responsibility for the 'front end' (i.e. Tool [1]),
and the hackystat dev team taking responsibility for the 'back end' (i.e. Tool [3]). We
can work together on Tool [2].
How does this sound?
Cheers,
Philip