Hi Aaron,
Here's one way to think about soft references, caching, and so forth that might help.
We use soft references if and only if we are implementing a cache, where "cache" is defined in a specific way. A "cache" in this interpretation is no more and no less than an efficiency hack--a more efficient way to obtain information that could always be obtained from some more expensive process. An important part of the semantics of "cache" is that an entry in the cache can be GC'd at any moment by the runtime environment, even immediately after creation, and the only impact is on performance.
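To make that definition concrete, here is a minimal sketch of a cache in this sense. It is not the ThreeKeyCache code; SoftCache, expensiveLoader, and simulateCollection are hypothetical names invented for illustration, and it uses generics and lambdas from later Java versions for brevity. The point is only that a cleared entry costs a recomputation, never a wrong answer.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache: any entry may disappear at any time; a miss only costs time. */
class SoftCache<K, V> {
  private final Map<K, SoftReference<V>> entries = new HashMap<>();
  // Assumption: the value can always be rebuilt from this more expensive process.
  private final Function<K, V> expensiveLoader;

  SoftCache(Function<K, V> expensiveLoader) {
    this.expensiveLoader = expensiveLoader;
  }

  synchronized V get(K key) {
    SoftReference<V> ref = entries.get(key);
    // ref.get() returns null once the GC has cleared the soft reference.
    V value = (ref == null) ? null : ref.get();
    if (value == null) {
      value = expensiveLoader.apply(key); // slow path: recompute
      entries.put(key, new SoftReference<>(value));
    }
    return value;
  }

  /** Test helper (hypothetical): mimics the GC clearing one entry. */
  synchronized void simulateCollection(K key) {
    SoftReference<V> ref = entries.get(key);
    if (ref != null) {
      ref.clear();
    }
  }
}
```

Callers never see whether an entry was collected; they only see get() take longer on a miss.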
Caches are thus different from "temporary data structures". In DailyProjectCoverage, we build temporary data structures to hold raw data as part of figuring out what the latest coverage sensor data is for use in the abstraction. A temporary data structure cannot be GC'd at any arbitrary time by the runtime environment--if an entry got GC'd immediately after it was added, then the temporary data structure would produce incorrect results.
On the other hand, once we finish using the temporary data structure, we'd like to be able to tell the runtime environment that this space can be GC'd. One way to do that is to have the entire temporary data structure pointed to by a single instance variable; after we complete processing on it, we simply set that instance variable to null. As long as nothing else still refers to it, that makes the temporary data structure unreachable and thus available for GC.
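As a rough sketch of that pattern (this is not the DailyProjectCoverage code; CoverageSummary and its methods are hypothetical, and plain strings stand in for sensor data entries), the temporary map below is held by ordinary strong references while it is being built, and is released in one step when processing is done:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical example of a temporary data structure released by nulling one field. */
class CoverageSummary {
  // Strong references: entries must not disappear while we are still computing.
  private TreeMap<Long, List<String>> runtimeMap = new TreeMap<>();
  private List<String> latest;

  void add(long runTime, String entry) {
    if (runtimeMap == null) {
      throw new IllegalStateException("processing already finished");
    }
    runtimeMap.computeIfAbsent(runTime, t -> new ArrayList<>()).add(entry);
  }

  /** Keeps only the entries with the latest run time, then drops the temporary map. */
  void finish() {
    latest = runtimeMap.isEmpty()
        ? new ArrayList<>()
        : new ArrayList<>(runtimeMap.lastEntry().getValue());
    runtimeMap = null; // the whole map is now unreachable and eligible for GC
  }

  List<String> latest() {
    return latest;
  }

  boolean temporaryDataReleased() {
    return runtimeMap == null;
  }
}
```

Here we decide when the raw data may go away, which is exactly the control a cache gives up.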
In summary, we want both caches and temporary data structures to be made available for GC. The difference is whether we need to control when the GC can occur.
Cheers, Philip
--On Saturday, November 20, 2004 12:57 AM -1000 Aaron Kagawa <[EMAIL PROTECTED]> wrote:
Hey Guys,
I have a question about soft references. But, first I must provide you with some background information.
I'm working on the Project level ProjectJavaClassWorkspaceMapManager. And I'm investigating substituting use of the personal level JavaClassWorkspaceMapManager with the project level manager in DailyProjectUnitTest and DailyProjectCoverage. This is a bit of a struggle because the two DailyProjectData implementations are quite different. I am noticing that the DailyProjectUnitTest implementation is quite simple and elegant. On the other hand, the DailyProjectCoverage implementation is quite complicated and could have some problems with it. The potential problem I'll address now deals with Soft References.
From what I understand, the good thing about Soft References is that even though we have a valid Soft Reference to an object, it can still be GCed. Hopefully, that is right. Although, I'm not totally clear on what the differences between Soft and Weak References are.
In Hackystat the SensorDataCache uses a ThreeKeyCache which provides Soft References. So, the idea is that the DailyProjectData implementations should read through the SensorDataEntry's, which are obtained from the SensorDataCache, and create a project level representation of the sensor data. If space is needed and GC happens, the SensorDataEntry's can be GC'ed no problem. (right?)
Anyway, it appears that the DailyProjectCoverage implementation creates a "hard" or regular reference. Here is some code that puts a SensorDataEntry (in this case Coverage) into the runtimeMap, which is a TreeMap (the code is a little out of order, but it should show you that we are storing SensorDataEntry's in something other than the SensorDataCache).
updateRuntimeMap(entry, workspaceFile);
....
coverageWorkspaceMap.put(entry, workspaceFiles);
...
runtimeMap.put(entry.getRunTime(), coverageWorkspaceMap);
So, (and this is another wild guess) this reference would supersede the Soft Reference and the Coverage SensorDataEntry will never be GC'ed. Right? Hopefully, I'm totally wrong and totally confused. OK... well, this isn't totally true. It turns out that the DailyProjectCache also uses Soft References from ThreeKeyCache, so the whole DailyProjectData instance can be GC'ed. I believe Hongbing raised this a year ago: which one is going to be GC'ed first? It would be really cool if it was possible to GC the SensorDataEntries first and then go up a level to the DailyProjectData.
Anyway, I've probably confused you by now, but the bottom line question is: is holding a collection of SensorDataEntries in anything other than the SensorDataCache a good idea?
thanks, aaron
ps.. Whoa! DailyProjectJavaFileMetric does the same thing. It appears that any time we have a runtime attribute in the sensor data we are using some sort of map to store the runtimes and the entries.
