I was thinking of a session-level flag to deactivate the path cache.

Florent

On Tue, Nov 16, 2010 at 3:44 PM, Florian Müller
<[email protected]> wrote:
> Hi Florent, hi Jens,
>
> Would that be a fair summary of your posts:
>
> - We keep the path cache.
>
> - We add a revalidation flag
>  (to the OperationContext?).
>
> - We add an expiration time to the path-to-id mapping
>  (controlled by a session parameter?).
>
> - I would like to add:
>  We add a general expiration time to objects in the cache
>  (controlled by a session parameter?).
>
> If we agree on that I would rework the cache implementation.
>
>
> @Florent: You can already deactivate caching for a getObjectByPath() call by
> providing an OperationContext that has the "cache enabled" flag set to
> false. That is not obvious, though.
>
> @Jens: Whenever you call refresh() and the object is gone or you lost the
> permission to see it, it will throw an exception. We can't change the fact
> that the cache might return stale objects. Even if you load a fresh object
> from the repository, it might have been changed on the server a second later
> ... and we don't know that until we load it again or try to change it.
>
>
> Cheers,
>
> Florian
>
>
>
> On 16/11/2010 13:45, Jens Hübel wrote:
>>
>> Hi Florian
>>
>> Yes, I agree with all your thoughts, but my idea was that the case where the
>> object changes but the path stays stable is one that may be more weird than
>> some of the others you mention. As those are not OpenCMIS specific I am not
>> sure if we should promote some of your thoughts to the OASIS TC...? In
>> addition the change token is also opaque to the client and maintained by the
>> servers. This means that the client cannot make any assumption about its
>> meaning. My thought was that reducing this to just a check for equality
>> might help in some cases.
>>
>> But some of your thoughts go beyond the "path is no stable id" problem and
>> are inherent to caching in general. What if I get an object by id,
>> cache it and the ACL changes so that I do not have access any longer? There
>> is some risk that a property has changed. What happens if the object gets
>> deleted on the server?
>>
>> We really need to carefully document the behavior of our client lib here
>> at least.
>>
>> One pragmatic solution would be to cache every object with a timestamp
>> of when it got cached. When the object is accessed from the cache we check
>> a (configurable) timeout. If the timeout is exceeded we always refresh the
>> object from the server. This would give at least some guarantee that a stale
>> object will only live for a certain period of time (let's say 30 minutes or
>> so). For highly sensitive scenarios this timeout might be reduced or set to
>> zero.
>> The timestamp should be associated with the key (e.g. path) and not the
>> value of the cache (object).
>>
>> I fear there won't be a perfect solution as you already said...
>>
>> Jens
>>
>>
>>
>> -----Original Message-----
>> From: Florian Müller [mailto:[email protected]]
>> Sent: Tuesday, 16 November 2010 11:40
>> To: [email protected]
>> Subject: Re: getObjectByPath cache problem
>>
>> The change token can only be used to detect changes within an object.
>> The problem here is that we are potentially dealing with two objects.
>> The root of the problem is that a path is not a stable key for an object.
>>
>> An object can be updated in the repository without our knowledge. The
>> cache would then return an outdated object and everybody should be
>> prepared for that. refresh() reloads the current state from the
>> repository. That works fine if the object is retrieved through
>> getObject(). You can move the object around, unfile it, put it in
>> multiple folders and it still works. The object id is unambiguous.
>>
>> The cache currently maps object paths to object ids. When you call
>> getObjectByPath() it will look up the id for this path and get the
>> object from the cache. If you move the object to a different folder,
>> getObjectByPath() shouldn't find it anymore. The path of the object has
>> changed and the old path is now invalid. Note that the object hasn't
>> changed and therefore the change token hasn't either.
>> Since there is no notification from the repository, the path-to-id
>> mapping can't be corrected. The cache still thinks the object is
>> accessible through this path. So getObjectByPath() returns the object
>> although it should throw a CmisObjectNotFound exception.
>>
>> Let's assume we create a new object in the place where the old object
>> was. The new object can now be accessed with the old path. Since the
>> outdated path-to-id mapping is still in place, getObjectByPath() returns
>> the old object and not the new one -- which is clearly wrong.
>>
>> The problem that we are facing here is that there is no reliable way to
>> keep the path-to-id mapping up-to-date. If we want to be correct, we
>> would have to ask the repository for the current id for the given path
>> every time getObjectByPath() is called (3) -- or not use the cache at
>> all (2).
>>
>>
>> - Florian
>>
>>
>>
>>
>> On 16/11/2010 08:23, Jens Hübel wrote:
>>>
>>> Shouldn't the change token solve that? How do we deal with the change
>>> token for other objects that are in the local cache? Ignore? Check on each
>>> access? Configurable?
>>>
>>> Jens
>>>
>>>
>>> -----Original Message-----
>>> From: Florian Müller [mailto:[email protected]]
>>> Sent: Monday, 15 November 2010 21:17
>>> To: [email protected]
>>> Subject: getObjectByPath cache problem
>>>
>>> Hi all,
>>>
>>> I had another look at [1]. Unfortunately, it's an unsolvable problem.
>>> I can think of three ways to cope with it:
>>>
>>> 1. We leave it like it is, although it is very, very confusing when you
>>> run into this situation.
>>>
>>> 2. We don't cache by path. How would that affect applications?
>>>
>>> 3. When getObjectByPath is called we fetch the object id from the
>>> repository and then get the object from the cache.
>>>     In the worst case, we would have to hit the repository twice.
>>>
>>>
>>> Any opinions?
>>>
>>> - Florian
>>>
>>>
>>> [1] https://issues.apache.org/jira/browse/CMIS-260
>>
>
>
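
For illustration, the expiration idea summarized in the quoted thread (a
timestamp stored with the path key, a configurable timeout, and a switch to
turn the path cache off) could look roughly like the sketch below. It is not
the OpenCMIS implementation: PathCacheSketch, its constructor parameters and
the map that plays the repository are all hypothetical names made up for this
example, and plain Java collections stand in for the real session and
repository.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a path-to-id cache with expiry; not OpenCMIS code. */
final class PathCacheSketch {

    /** The cached id plus the time at which the path (the key) was resolved. */
    private static final class Entry {
        final String objectId;
        final long cachedAtMillis;

        Entry(String objectId, long cachedAtMillis) {
            this.objectId = objectId;
            this.cachedAtMillis = cachedAtMillis;
        }
    }

    private final Map<String, Entry> pathToId = new HashMap<>();
    private final Function<String, String> repositoryLookup; // path -> id, null if no object
    private final LongSupplier clock;                        // injected so it can be tested
    private final long ttlMillis;                            // 0 = always ask the repository
    private final boolean pathCacheEnabled;                  // assumed session-level switch

    PathCacheSketch(Function<String, String> repositoryLookup, LongSupplier clock,
                    long ttlMillis, boolean pathCacheEnabled) {
        this.repositoryLookup = repositoryLookup;
        this.clock = clock;
        this.ttlMillis = ttlMillis;
        this.pathCacheEnabled = pathCacheEnabled;
    }

    /** Returns the object id for a path, trusting the cached mapping only until it expires. */
    String resolve(String path) {
        long now = clock.getAsLong();
        if (pathCacheEnabled) {
            Entry entry = pathToId.get(path);
            if (entry != null && now - entry.cachedAtMillis < ttlMillis) {
                return entry.objectId; // may be stale, but for less than ttlMillis
            }
        }
        // Expired, missing, or caching disabled: ask the repository again.
        String id = repositoryLookup.apply(path);
        if (id == null) {
            pathToId.remove(path);
            throw new IllegalStateException("No object at path: " + path);
        }
        if (pathCacheEnabled) {
            pathToId.put(path, new Entry(id, now));
        }
        return id;
    }

    public static void main(String[] args) {
        Map<String, String> repository = new HashMap<>();
        repository.put("/docs/report", "id-1");
        long[] now = {0L};
        PathCacheSketch cache =
                new PathCacheSketch(repository::get, () -> now[0], 1000L, true);

        System.out.println(cache.resolve("/docs/report")); // id-1
        // The old object is moved away and a new one is created at the same path.
        repository.put("/docs/report", "id-2");
        System.out.println(cache.resolve("/docs/report")); // id-1 (stale, not yet expired)
        now[0] = 1000L;
        System.out.println(cache.resolve("/docs/report")); // id-2 (mapping expired)
    }
}
```

With a timeout of zero every call asks the repository for the id, which is
roughly option 3 from the first mail. With the flag set to false nothing is
stored by path, which is roughly option 2.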



-- 
Florent Guillaume, Director of R&D, Nuxeo
Open Source, Java EE based, Enterprise Content Management (ECM)
http://www.nuxeo.com   http://www.nuxeo.org   +33 1 40 33 79 87
