[HACKYSTAT-DEV-L:366] Re: DailyProjectCoverage and soft references

Philip Johnson Sun, 21 Nov 2004 10:13:33 -0800

Hi Aaron,

Here's one way to think about soft references, caching, and so forth that might help.

We use soft references if and only if we are implementing a cache, where "cache" is defined in a specific way. A "cache" in this interpretation is no more and no less than an efficiency hack--a more efficient way to obtain information that could always be obtained from some more expensive process. An important part of the semantics of "cache" is that an entry in the cache can be GC'd at any moment by the runtime environment, even immediately after creation, and the only impact is on performance.
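
To make that definition concrete, here is a minimal sketch of a cache in this sense. It is not the ThreeKeyCache implementation; the SoftCache class, its expensiveSource function, and the simulateCollection method are all hypothetical names invented for illustration. The point it tries to show is that a cleared entry only costs a recomputation and never changes the answer.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache: any entry may vanish at any time; a miss just recomputes. */
final class SoftCache<K, V> {
  private final Map<K, SoftReference<V>> entries = new HashMap<>();
  // Assumption: the expensive process that can always reproduce a value (never null).
  private final Function<K, V> expensiveSource;
  private int recomputations = 0;

  SoftCache(Function<K, V> expensiveSource) {
    this.expensiveSource = expensiveSource;
  }

  V get(K key) {
    SoftReference<V> ref = entries.get(key);
    // ref.get() returns null once the collector has cleared the referent.
    V value = (ref == null) ? null : ref.get();
    if (value == null) {
      // Cache miss or cleared entry: fall back to the expensive process.
      value = expensiveSource.apply(key);
      recomputations++;
      entries.put(key, new SoftReference<>(value));
    }
    return value;
  }

  /** Illustration only: clears the reference the way the collector might. */
  void simulateCollection(K key) {
    SoftReference<V> ref = entries.get(key);
    if (ref != null) {
      ref.clear();
    }
  }

  int recomputations() {
    return recomputations;
  }
}
```

A caller never needs to know whether an entry was collected: get() returns the same value either way, and only the recomputation count (that is, performance) differs.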

Caches are thus different from "temporary data structures". In DailyProjectCoverage, we build temporary data structures to hold raw data as part of figuring out what the latest coverage sensor data is for use in the abstraction. A temporary data structure cannot be GC'd at any arbitrary time by the runtime environment--if an entry got GC'd immediately after it was added, then the temporary data structure would produce incorrect results.

On the other hand, once we are finished with the temporary data structure, we'd like to be able to tell the runtime environment that its space can be GC'd. One way to do that: if the entire temporary data structure is pointed to by a single instance variable, then after we complete processing on it, we can simply set that instance variable to null. Provided nothing else still refers to it, doing that makes the temporary data structure unreachable and thus available for GC.
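
As a rough sketch of that pattern (not the actual DailyProjectCoverage code; the LatestRunSummary class, its String entries, and its method names are hypothetical stand-ins), the temporary structure is held by ordinary hard references while it is being built, and the single instance variable is nulled once the result has been extracted:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical abstraction that keeps only the entries from the latest run. */
final class LatestRunSummary {
  // Temporary data structure: hard references, so nothing can be GC'd mid-processing.
  private TreeMap<Long, List<String>> runtimeMap = new TreeMap<>();
  private List<String> latestEntries = Collections.emptyList();

  void add(long runTime, String entry) {
    if (runtimeMap == null) {
      throw new IllegalStateException("processing already finished");
    }
    runtimeMap.computeIfAbsent(runTime, k -> new ArrayList<>()).add(entry);
  }

  /** Extracts the result, then releases the temporary structure. */
  void finish() {
    if (runtimeMap != null && !runtimeMap.isEmpty()) {
      // Copy out only what the abstraction needs: the entries with the largest run time.
      latestEntries = new ArrayList<>(runtimeMap.lastEntry().getValue());
    }
    // The only reference to the map goes away here, so the map becomes unreachable.
    runtimeMap = null;
  }

  List<String> latestEntries() {
    return latestEntries;
  }

  boolean temporaryDataReleased() {
    return runtimeMap == null;
  }
}
```

Here we decide when the space becomes collectable (at the end of finish()), which is exactly the control a soft reference would take away.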

In summary, we want both caches and temporary data structures to be made available for GC. The difference is whether we need to control when the GC can occur.

Cheers,
Philip

--On Saturday, November 20, 2004 12:57 AM -1000 Aaron Kagawa <[EMAIL PROTECTED]> wrote:

Hey Guys,

I have a question about soft references. But, first I must provide you
with some background information.

I'm working on the Project level ProjectJavaClassWorkspaceMapManager. And
I'm investigating substituting use of the personal level
JavaClassWorkspaceMapManager with the project level manager in
DailyProjectUnitTest and DailyProjectCoverage.  This is a bit of a
struggle because the two DailyProjectData implementations are quite
different.  I am noticing that the DailyProjectUnitTest implementation is
quite simple and elegant. On the other hand, the DailyProjectCoverage
implementation is quite complicated and could have some problems with it.
The potential problem I'll address now deals with Soft References.

From what I understand the good thing about Soft References is that even
though we have a valid Soft Reference to an object, it still can be GCed.
Hopefully, that is right. Although, I'm not totally clear on what the
differences between Soft and Weak References are.

In Hackystat the SensorDataCache uses a ThreeKeyCache which provides Soft
References. So, the idea is that the DailyProjectData implementations
should read through the SensorDataEntry's, which are obtained from the
SensorDataCache, and create a project level representation of the sensor
data.  If space is needed and GC happens the SensorDataEntry's can be
GC'ed no problem. (right?)

Anyway, it appears that the DailyProjectCoverage implementation creates a
"hard" or regular reference. Here is some code that puts a
SensorDataEntry (in this case Coverage) into the runtimeMap, which is a
TreeMap (the code is a little out of order, but it should show you that
we are storing SensorDataEntry's in something other than the
SensorDataCache).

   updateRuntimeMap(entry, workspaceFile);
....
   coverageWorkspaceMap.put(entry, workspaceFiles);
...
   runtimeMap.put(entry.getRunTime(), coverageWorkspaceMap);


So, (and this is another wild guess) this reference would supersede the
Soft Reference and the Coverage SensorDataEntry will never be GC'ed..
Right? Hopefully, I'm totally wrong and totally confused. OK... well,
this isn't totally true. It turns out that the DailyProjectCache also
uses Soft References from ThreeKeyCache and the whole DailyProjectData
instance can be GC'ed.  I believe Hongbing mentioned this a year ago,
which one is going to be GC'ed first? It would be really cool if it was
possible to GC the SensorDataEntries first then go up a level to the
DailyProjectData.

Anyway, I've probably confused you by now.. but the bottom line question
is.... Is holding a collection of SensorDataEntries in anything other than
the SensorDataCache a good idea?

thanks, aaron

ps..  Whoa! DailyProjectJavaFileMetric does the same thing.  It appears
that any time we have a runtime attribute in the sensor data we are using
some sort of map to store the runtimes and the entries.
