Hi Pablo,

> It seems that the GC needs more information from the Data Store to do
> its job. One approach may be to have "transient" and "persisted" binary
> content so that the GC doesn't delete "transient" binary content.

Yes, I agree, that would be the best solution. Or: never delete data
while a reference to it is held in memory (no matter whether it's
transient). What about storing all DataRecord objects in a WeakHashMap?
Then there are two options:

Plan A) The garbage collection just checks the hash map and doesn't
delete the entries it finds there.
Plan B) A background daemon thread updates the modified date of those
entries from time to time.

Plan A sounds simpler; Plan B would solve some distributed GC problems.

> The GC can use observation to be notified
> of property creation, as currently does to detect when nodes are moved
> during GC scan. We can use a mark file for the binary content state in
> the FileDataStore implementation and an additional column in the binary
> content table for the DatabaseDataStore.

I think that would work as well, but the observation solution would be
more complex in my view.

> What is the overhead of a PROPERTY_ADDED + PROPERTY_CHANGED listener in
> Jackrabbit ?

I don't know. If it must observe all properties all the time, it should
probably be avoided. If there is a way to observe only binary properties,
or only while the GC is running, then it should be OK.

Thomas
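
P.S. To make Plan A a bit more concrete, here is a rough sketch. It is not
Jackrabbit code: InUseRegistry, Record, open, isInUse and mayDelete are
names made up for illustration, and Record only stands in for DataRecord.
It assumes the data store hands out every record through one place, and
that the GC deletes only entries older than the start of its scan.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical registry of records that are still referenced in memory.
class InUseRegistry {

    // Hypothetical stand-in for a data record handle (not Jackrabbit's DataRecord).
    static final class Record {
        private final String id;
        Record(String id) { this.id = id; }
        String getId() { return id; }
    }

    // Weak keys: an entry goes away once nothing else strongly references
    // the Record. The value is a plain String so it cannot keep the key alive.
    private final Map<Record, String> inUse =
            Collections.synchronizedMap(new WeakHashMap<>());

    // Every record handed out to a caller is registered here.
    Record open(String id) {
        Record record = new Record(id);
        inUse.put(record, id);
        return record;
    }

    // True while at least one registered handle for this id has not been cleared.
    // Linear scan over the map; acceptable for a sketch.
    boolean isInUse(String id) {
        return inUse.containsValue(id);
    }

    // Plan A: the sweep skips anything that is still referenced in memory,
    // in addition to the usual "not modified since the scan started" check.
    boolean mayDelete(String id, long lastModified, long scanStart) {
        return lastModified < scanStart && !isInUse(id);
    }
}
```

The map only knows about references held in the same JVM, which is why
Plan B (touching the modified date from a background thread) would still
be needed for the distributed case.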
