[From Aaron]

Hey Guys,

Today, I spent a great deal of time (most of it being terribly frustrated) 
figuring out a
problem with the Coverage data obtainable in the Daily Project Details 
analysis. It turns
out it was a single line of code!  In an effort to document the problem, here 
is a
detailed description of what happened:

I closely monitor the 414 students' development of CLEW using the Daily Project 
Details
analysis.  Over these past weeks, I've been noticing that the coverage section 
of the
analysis was not showing my data (my data contains coverage of all the CLEW 
modules, the
students in 414 only build and test a portion of the system). The following 
link is the
results I would get: <
http://csdl.ics.hawaii.edu/~kagawaa/clew_coverage/daily_project_details.htm >. 
Notice the
Coverage section only contains three modules, yet the File Metrics and Unit 
Tests have
all modules (which come from my Builds of the entire system). I realized that 
the
DailyProjectCoverage should take the set of the entries that have the latest 
runtime
stamp. So I built, tested and sent coverage data three times in a row, to try 
to ensure
that my data had the latest runtime stamp. Ran the analysis and.. nothing 
changed.
Scratching my head, I pointed the finger at the Java Class Workspace Map.. I 
checked the
following analyses:
< http://csdl.ics.hawaii.edu/~kagawaa/clew_coverage/coverage.htm>
< http://csdl.ics.hawaii.edu/~kagawaa/clew_coverage/file_metric.htm >
< http://csdl.ics.hawaii.edu/~kagawaa/clew_coverage/java_class_workspace.htm >

All of these analyses show that the Java Class Workspace Manager is correct and 
that I
should have correct coverage information.. So. I gave up at this point. I came 
to the
conclusion that a member of the project has misconfigured their system's clock. 
 A few
hours later, I came back to my senses and downloaded the Projects Sensor Data 
using the
Admin Zip Project Data command.

I created a Project on my local installation of Hackystat, with me being the 
only member,
and ran the Daily Project Data analysis. Everything was perfect! Dang!  The 
next step,
was too add all the members to the Project and re-run the analysis. As expected 
the same
bogus problem appeared. So, I dove into the Sensor XML files. I confirmed that 
my
coverage data had the latest runtime stamp. No help there, so I dove into the
DailyProjectCoverage code.

I quickly found a method called updateRuntimeMap, where we manage the entries 
according
to the runtime stamp. I added some System.out.println's in there. After about 
30 builds,
manual testing, and different placements of prints.. I found the problem on one 
line of
code!

--------------------------------------------------------
This is wrong and buggy code:
   // If there are old data we just ignore the obsolete data.
   if (this .runtimeMap.size() > 0) {
     String savedRuntime = (String) this .runtimeMap.firstKey(); // heres the 
bogus line
of code!!!!!!
     // Clear the map if it contains old data.
     if (Long.parseLong(savedRuntime) < Long.parseLong(entry.getRunTime())) {
       System.out.println( "savedRuntime=" + new 
Date(Long.parseLong(savedRuntime))
           + ", entry.getRunTime()=" + new 
Date(Long.parseLong(entry.getRunTime())));
       System.out.println( "runtimeMap.clear()" );
       this .runtimeMap.clear();
     }
   }

Here is some print out associated with the bugy code:
Processing Coverage Data From [EMAIL PROTECTED]
Processing Coverage Data From [EMAIL PROTECTED]
Processing Coverage Data From [EMAIL PROTECTED]
   runtimeMap.put(Sun Feb 13 12:20:07 HST 2005)
       savedRuntime=Sun Feb 13 12:20:07 HST 2005, entry.getRunTime()=Sun Feb 13 
12:44:44
HST 2005
       runtimeMap.clear()
   runtimeMap.put(Sun Feb 13 12:44:44 HST 2005)
       savedRuntime=Sun Feb 13 12:44:44 HST 2005, entry.getRunTime()=Sun Feb 13 
17:24:59
HST 2005
       runtimeMap.clear()
   runtimeMap.put(Sun Feb 13 17:24:59 HST 2005)
       savedRuntime=Sun Feb 13 17:24:59 HST 2005, entry.getRunTime()=Sun Feb 13 
22:04:56
HST 2005
       runtimeMap.clear()
   runtimeMap.put(Sun Feb 13 22:04:56 HST 2005) // THIS IS THE LATEST TIME STAMP
Processing Coverage Data From [EMAIL PROTECTED]
   runtimeMap.put(Sun Feb 13 16:00:32 HST 2005)
Processing Coverage Data From [EMAIL PROTECTED]
Processing Coverage Data From [EMAIL PROTECTED]
Processing Coverage Data From [EMAIL PROTECTED]
Processing Coverage Data From [EMAIL PROTECTED]
   runtimeMap.put(Sun Feb 13 08:44:54 HST 2005)
       savedRuntime=Sun Feb 13 08:44:54 HST 2005, entry.getRunTime()=Sun Feb 13 
08:47:32
HST 2005
       runtimeMap.clear()  // DANG!! WE JUST CLEARED THE LATEST TIME STAMP
   runtimeMap.put(Sun Feb 13 08:47:32 HST 2005)
       savedRuntime=Sun Feb 13 08:47:32 HST 2005, entry.getRunTime()=Sun Feb 13 
08:57:23
HST 2005
       runtimeMap.clear()
   runtimeMap.put(Sun Feb 13 08:57:23 HST 2005)
       savedRuntime=Sun Feb 13 08:57:23 HST 2005, entry.getRunTime()=Sun Feb 13 
15:06:42
HST 2005
       runtimeMap.clear()
       runtimeMap.put(Sun Feb 13 15:06:42 HST 2005)
runtimeMap.lastKey()=1108343202000
Printing all the keys in the runtimeMap
key=1108343202000

1108343202000 - this value is definitely not the latest runtime stamp!
1108368296000 - this value is the correct latest runtime stamp! it isn't even 
in the
runtimeMap!

--------------------------------------------------------
This is the correct code:
   // If there are old data we just ignore the obsolete data.
   if (this .runtimeMap.size() > 0) {
     String savedRuntime = (String) this .runtimeMap.lastKey(); // needed to 
check the
lastKey not the first
     // Clear the map if it contains old data.
     if (Long.parseLong(savedRuntime) < Long.parseLong(entry.getRunTime())) {
       System.out.println( "savedRuntime=" + new 
Date(Long.parseLong(savedRuntime))
           + ", entry.getRunTime()=" + new 
Date(Long.parseLong(entry.getRunTime())));
       System.out.println( "runtimeMap.clear()" );
       this .runtimeMap.clear();
     }
   }

Here is some print out associated with the correct code:
Processing Coverage Data From [EMAIL PROTECTED]
Processing Coverage Data From [EMAIL PROTECTED]
Processing Coverage Data From [EMAIL PROTECTED]
   runtimeMap.put(Sun Feb 13 12:20:07 HST 2005)
       savedRuntime=Sun Feb 13 12:20:07 HST 2005, entry.getRunTime()=Sun Feb 13 
12:44:44
HST 2005
       runtimeMap.clear()
   runtimeMap.put(Sun Feb 13 12:44:44 HST 2005)
       savedRuntime=Sun Feb 13 12:44:44 HST 2005, entry.getRunTime()=Sun Feb 13 
17:24:59
HST 2005
       runtimeMap.clear()
   runtimeMap.put(Sun Feb 13 17:24:59 HST 2005)
       savedRuntime=Sun Feb 13 17:24:59 HST 2005, entry.getRunTime()=Sun Feb 13 
22:04:56
HST 2005
       runtimeMap.clear()
   runtimeMap.put(Sun Feb 13 22:04:56 HST 2005) // THIS IS THE LATEST TIME STAMP
Processing Coverage Data From [EMAIL PROTECTED]
   runtimeMap.put(Sun Feb 13 16:00:32 HST 2005)
Processing Coverage Data From [EMAIL PROTECTED]
Processing Coverage Data From [EMAIL PROTECTED]
Processing Coverage Data From [EMAIL PROTECTED]
Processing Coverage Data From [EMAIL PROTECTED]
   runtimeMap.put(Sun Feb 13 08:44:54 HST 2005)
   runtimeMap.put(Sun Feb 13 08:47:32 HST 2005)
   runtimeMap.put(Sun Feb 13 08:57:23 HST 2005)
   runtimeMap.put(Sun Feb 13 15:06:42 HST 2005)
runtimeMap.lastKey()=1108368296000
Printing all the keys in the runtimeMap
key=1108320294000
key=1108320452000
key=1108321043000
key=1108343202000
key=1108346432000
key=1108368296000

1108368296000 - this value is the correct latest runtime stamp!
--------------------------------------------------------
Conclusion:
It is interesting and totally bad that this bug has been in the code since.... 
forever.

One line of code took hours to debug! In fact, the technique clearing the map 
when a new
latest runtime stamp is also faulty. We shouldn't be clearing the whole Map. 
Instead, we
checking for a later timestamp adding that to the map and removing the earlier 
map entry.
That way, only one entry (namely the latest runtime stamp) will be stored in 
Map. This
technique saves much more space and is much easier to understand.

I haven't committed my changes yet. I definitely am going to take a break and 
get back to
this tomorrow.  I plan to write a test case and/or add more entries to the test 
coverage
xml data to ensure that there is more than one runtime stamp. Also, I'll 
download the
hacky2004-all data for this past couple days and make sure that works as well. 
(btw, we
never had a problem with the hacky2004-all coverage because only hackystat-l 
sends
coverage data.)

I also propose a change to the Daily Project Details analysis to provide the 
Time
associated with the "snap-shot" type Data Drill Downs. For example, the 
Coverage by Top
Workspace drill down should provide the Time associated with that information. 
A sort of
"Coverage as of 11:32 AM" label. Same goes for the FileMetric drill down. This 
will help
information validation for the users of the analysis.

This brings up another problem. The DailyProjectData classes with Snap-Shot 
(Runtime
stamp) issues are terribly more complicated than Aggregate-type 
DailyProjectData classes
(ie. ActiveTime, Commits, UnitTest). To make matters worse. The 
DailyProjectCoverage and
DailyProjectJavaFileMetric does things completely differently (i mentioned this 
a long
time ago). Why is it so difficult to deal with Snap-Shot type information? Does 
the lower
information levels need to be improved to better support Snap-Shots? For 
example, why
should the DailyProjectData classes read through all the sensor data in a Day 
when we
only care about the entries with the latest runtime stamp. Or do we have to 
come up with
a better algorithm to support Snap-Shot type DailyProjectData classes? To date, 
I can
count 4 snap-shots classes; Coverage, FileMetric, Dependency, Issues, and 
probably
Performance. I imagine there will be more and we don't want each class to 
re-invent the
wheel (a complicated wheel, I might add) each time. I think we need to make an
improvement somewhere.

thanks, aaron

ps.. As I was finishing up this email I noticed another problem. In the Daily 
Project
Details for the hacky2004-all for Feb 13, 2005 there is some Unit Test data 
from the
hackyJira module. the information says that there were 11 successful tests (3 by
hackystat-l and 8 by me). However, hackyJira is not included in the 
Hackystat-All or
Hackystat-UH build configurations and I haven't even checkout hackyJira. I just 
figured
out that the problem was created a few months ago by moving code into 
hackyIssue that was
once in hackyJira. Therefore, the mapping between class name and file name 
exists in the
past and therefore still exists in the future. What are we supposed to do about 
that?
Basically, it seems that the Unit Test data from hackyJira will always be wrong.

Reply via email to