I was thinking of a session-level flag to deactivate the path cache.

Florent

On Tue, Nov 16, 2010 at 3:44 PM, Florian Müller <[email protected]> wrote:
> Hi Florent, hi Jens,
>
> Would that be a fair summary of your posts:
>
> - We keep the path cache.
>
> - We add a revalidation flag
>   (to the OperationContext?).
>
> - We add an expiration time to the path-to-id mapping
>   (controlled by a session parameter?).
>
> - I would like to add:
>   We add a general expiration time to objects in the cache
>   (controlled by a session parameter?).
>
> If we agree on that I would rework the cache implementation.
>
> @Florent: You can already deactivate caching for a getObjectByPath() call by
> providing an OperationContext that has the "cache enabled" flag set to
> false. That is not obvious, though.
>
> @Jens: Whenever you call refresh() and the object is gone or you lost the
> permission to see it, it will throw an exception. We can't change the fact
> that the cache might return stale objects. Even if you load a fresh object
> from the repository, it might have been changed on the server a second later
> ... and we don't know that until we load it again or try to change it.
>
> Cheers,
>
> Florian
>
> On 16/11/2010 13:45, Jens Hübel wrote:
>>
>> Hi Florian
>>
>> Yes I agree to all your thoughts, but my idea was that the case where the
>> object changes but the path stays stable is one that may be more weird than
>> some of the others you mention. As those are not opencmis specific I am not
>> sure if we should promote some of your thoughts to the OASIS TC...? In
>> addition the change token is also opaque to the client and maintained by the
>> servers. This means that the client cannot make any assumption about its
>> meaning. My thought was that reducing this to just a check for equality
>> might help in some cases.
>>
>> But some of your thoughts go beyond the "path is no stable id" problem and
>> are inherent with caching in general. What if I get an object by id,
>> cache it and the ACL changes so that I do not have access any longer?
>> There is some risk that a property has changed. What happens if the object
>> gets deleted in the server?
>>
>> We really need to carefully document the behavior of our client lib here
>> at least.
>>
>> One pragmatic solution would be to cache every object with a timestamp
>> when it got cached. If the object is accessed from the cache we might apply
>> a (configurable) timeout. If the timeout is exceeded we always refresh the
>> object from the server. This would give at least some guarantee that a stale
>> object will only live for a certain period of time (let's say 30 mins or so).
>> For highly sensitive scenarios this timeout might be reduced or set to zero.
>> The timestamp should be associated with the key (e.g. path) and not the
>> value of the cache (object).
>>
>> I fear there won't be a perfect solution as you already said...
>>
>> Jens
>>
>> -----Original Message-----
>> From: Florian Müller [mailto:[email protected]]
>> Sent: Tuesday, 16 November 2010 11:40
>> To: [email protected]
>> Subject: Re: getObjectByPath cache problem
>>
>> The change token can only be used to detect changes within an object.
>> The problem here is that we are potentially dealing with two objects.
>> The root of the problem is that a path is not a stable key for an object.
>>
>> An object can be updated in the repository without our knowledge. The
>> cache would then return an outdated object and everybody should be
>> prepared for that. refresh() reloads the current state from the
>> repository. That works fine if the object is retrieved through
>> getObject(). You can move the object around, unfile it, put it in
>> multiple folders and it still works. The object id is unambiguous.
>>
>> The cache currently maps object paths to object ids. When you call
>> getObjectByPath() it will look up the id for this path and get the
>> object from the cache. If you move the object to a different folder,
>> getObjectByPath() shouldn't find it anymore.
>> The path of the object has changed and the old path is now invalid.
>> Note that the object hasn't changed and therefore the change token
>> hasn't either.
>> Since there is no notification from the repository, the path-to-id
>> mapping can't be corrected. The cache still thinks the object is
>> accessible through this path. So getObjectByPath() returns the object
>> although it should throw a CmisObjectNotFound exception.
>>
>> Let's assume we create a new object in the place where the old object
>> was. The new object can now be accessed with the old path. Since the
>> outdated path-to-id mapping is still in place, getObjectByPath() returns
>> the old object and not the new one -- which is clearly wrong.
>>
>> The problem that we are facing here is that there is no reliable way to
>> keep the path-to-id mapping up-to-date. If we want to be correct, we
>> would have to ask the repository for the current id for the given path
>> every time getObjectByPath() is called (3) -- or not use the cache at
>> all (2).
>>
>> - Florian
>>
>> On 16/11/2010 08:23, Jens Hübel wrote:
>>>
>>> Shouldn't the change token solve that? How do we deal with the change
>>> token for other objects that are in the local cache? Ignore? Check on each
>>> access? Configurable?
>>>
>>> Jens
>>>
>>> -----Original Message-----
>>> From: Florian Müller [mailto:[email protected]]
>>> Sent: Monday, 15 November 2010 21:17
>>> To: [email protected]
>>> Subject: getObjectByPath cache problem
>>>
>>> Hi all,
>>>
>>> I had another look at [1]. Unfortunately, it's an unsolvable problem.
>>> I can think of three ways to cope with it:
>>>
>>> 1. We leave it like it is, although it is very, very confusing when you
>>>    run into this situation.
>>>
>>> 2. We don't cache by path. How would that affect applications?
>>>
>>> 3. When getObjectByPath is called we fetch the object id from the
>>>    repository and then get the object from the cache.
>>>    In the worst case, we would have to hit the repository twice.
>>>
>>> Any opinions?
>>>
>>> - Florian
>>>
>>> [1] https://issues.apache.org/jira/browse/CMIS-260
>>
>
>

--
Florent Guillaume, Director of R&D, Nuxeo
Open Source, Java EE based, Enterprise Content Management (ECM)
http://www.nuxeo.com   http://www.nuxeo.org   +33 1 40 33 79 87
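
For illustration only, here is a minimal sketch of the expiration idea discussed in the thread: the timestamp is attached to the path key rather than to the cached object, and an entry older than the configured timeout is treated as a miss, so the caller has to ask the repository again. The class name ExpiringPathCache, its methods, and the injectable clock are hypothetical and are not part of OpenCMIS; plain strings stand in for object ids.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch, NOT the OpenCMIS cache implementation.
 * Maps a path to an object id and remembers when the mapping was stored.
 */
final class ExpiringPathCache {

    /** One path-to-id mapping plus the time at which it was cached. */
    private static final class Entry {
        final String objectId;
        final long cachedAtMillis;

        Entry(String objectId, long cachedAtMillis) {
            this.objectId = objectId;
            this.cachedAtMillis = cachedAtMillis;
        }
    }

    private final Map<String, Entry> pathToId = new HashMap<>();
    private final long ttlMillis;      // assumed to come from a session parameter
    private final LongSupplier clock;  // injectable so the behavior can be tested

    ExpiringPathCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Stores (or overwrites) the mapping and stamps it with the current time. */
    void put(String path, String objectId) {
        pathToId.put(path, new Entry(objectId, clock.getAsLong()));
    }

    /**
     * Returns the cached id, or empty if the path is unknown or the mapping
     * has reached its timeout. A timeout of zero therefore never returns a hit.
     */
    Optional<String> lookup(String path) {
        Entry entry = pathToId.get(path);
        if (entry == null) {
            return Optional.empty();
        }
        if (clock.getAsLong() - entry.cachedAtMillis >= ttlMillis) {
            pathToId.remove(path); // expired: force a round trip to the repository
            return Optional.empty();
        }
        return Optional.of(entry.objectId);
    }
}
```

On a miss, the caller would resolve the path against the repository and call put() again with the fresh id. This only bounds how long a stale path-to-id mapping can live; it does not remove the staleness window described in the thread.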
