I am about to revisit the logical caches issue. My plan is to do the following to handle all these caches in a generic way:
- A single version number is kept for all caches.
- A thread starting a txn read-locks an internal read-write lock.
- When a thread needs to modify a cache, it upgrades its lock to an exclusive lock. If it detects a version change during this time, it throws a conflict exception. If not, it bumps up the version number and changes the cache.
- After committing, the thread releases the lock.
- If the thread aborts its txn, it notifies the interceptors in its interceptor chain of the abort. Any interceptor can then rebuild its cache from what is on disk at this point. I am assuming this is possible for all logical caches.

Schema registries are a kind of cache too. If we apply the above algorithm to them, we would not need to clone the registries while making a change. We would make the change, and if the txn aborts, the schema registries would be rebuilt from the on-disk data.

There might be some caches this won't work for. For example, if we have an entry cache that sits above the partitions, this won't work for that cache, purely for performance reasons. If this is the case for any cache, then that cache has to be made MVCC.

Let me know if you have any suggestions/comments.

thanks
Selcuk

On Mon, Oct 17, 2011 at 7:12 PM, Emmanuel Lecharny <[email protected]> wrote:
> On 10/17/11 3:31 PM, Selcuk AYA wrote:
>>
>> Hi all,
>> I am hoping to send a more detailed email as to where I am in the txn
>> implementation but I noticed I might hit a snag with regard to the
>> various logical caches we maintain.
>
> Ahha... Caches... Pain...
>
> We have many of them. Let's first list all the caches we are using :
> - dnCache : a DN cache used to spare DN parsing. If the DN is not anymore
> valid, it should be removed from the cache.
> - subentryCache : a DN -> Subentry Map. We should update it when a subentry
> is added/modified
> - accessControlXXX caches : Caches used for AccessPoint. It's a DnNode data
> structure, using the DN as an entry point.
> - groupCache : a cache containing the groups
> - tupleCache : a cache containing the ACI tuples
> - kdcReplayCache : a Kerberos cache
> - referrals cache : a cache used to manage referrals
> - credentialCache : a LRUMap used for authentication
> - registrations : a cache of notifications
> - ObjectClass caches (must, may, superiors, allowed)
> - TriggerSpecCache : a cache of ids for the Triggers
> - notAliasCache : This is a weird alias. It's used to know if an entry's
> parent is not an alias. IMO, we should rather have an Alias cache...
>
> This is pretty much all the caches we declare, if we exclude the index and
> master table cache (entry).
>
> Obviously, many of those caches will be impacted by any modification done in
> the server.
>
>>
>> Emmanuel is moving these caches out of the interceptors but up until now
>> these caches were in interceptors. They map an entry DN to a logical
>> value that is a predicate of the entry attributes. Currently I am
>> aware of the notAlias and subentry caches (there could be more) as such
>> caches but it is not difficult to see people might add such a cache in
>> their custom interceptors.
>>
>> An example of the transactional execution we might have according to
>> the planned implementation is this:
>>
>> R1, T1, T2 -> R1 is before T1 and T2 and should be isolated from them.
>> T1 is committed and T2 started after T1, so it should see the effects
>> of T1. Also changes of T1 are not reflected (flushed) to the underlying
>> partitions yet. Remember that readers merge what they read from
>> partitions with the changes in the unflushed part of the txn log.
>>
>> Now consider how we would make R1 and T2 see a consistent state of
>> the notAlias and subentry caches. It seems to me the only possible way is
>> to go ahead like we do with entries and index values: update the cache
>> when the txn log is being flushed to the partitions and, when these
>> caches are read, merge whatever we read with the txn log.
>> However, each separate cache requires a separate logic to handle this
>> merge and I am afraid it might get complicated and slow as the number
>> of such caches increases. Especially expecting a custom cache
>> implementer to get this right seems dubious.
>>
>> please let me know what you think
>
>
> This is plain right. We currently don't handle any kind of transaction
> system for cache, so we might very well end with an out of date cache at some
> point. This is dangerous... OTOH, as we now are implementing a MVCC
> mechanism, keeping the caches as they are is just not an option, and we must
> keep the revision for entries and DN, as they might have been changed by
> another thread...
>
> This is not an easy issue.
>
> In many places, we are now using ehCache to manage caches, instead of using
> a LRUMap, for instance, but this is not the case for all the caches. For
> instance, in some places, we are using a DnNode cache (which is used for
> partitions, accessControl, etc).
>
> One possibility would be to associate the revision to each key, assuming
> that each operation will have a revision number in its context. Any
> modification impacting any cache will just create a new element with a
> revision into those caches.
>
> In order to improve the cache management, it would be good to always use the
> CacheService, which provides some methods to easily manage caches. That may
> be possible except for the DnNode cache, as it's a tree hierarchy...
>
> We should think seriously about the best way to solve this issue, as it's
> really critical.
>
> Thanks Selcuk !
>>
>>
>> regards
>> Selcuk
>>
>
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
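
For reference, here is a minimal sketch of the "revision per key" possibility quoted above. It is only an illustration under assumptions: the class name RevisionedCache and its methods are hypothetical, it is not built on ehCache or the CacheService, and it assumes each operation carries a numeric revision in its context. Every modification adds a new element tagged with its revision, and a reader at revision N sees the newest element whose revision is not greater than N.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical revision-aware cache; not an existing ApacheDS or ehCache class. */
class RevisionedCache<K, V> {
    // For each key, the history of values ordered by the revision that wrote them.
    // An empty Optional is a tombstone: the key was removed at that revision.
    private final Map<K, ConcurrentSkipListMap<Long, Optional<V>>> history =
            new ConcurrentHashMap<>();

    /** Records a new element for the key at the writer's revision. */
    void put(K key, long revision, V value) {
        history.computeIfAbsent(key, k -> new ConcurrentSkipListMap<>())
               .put(revision, Optional.of(value));
    }

    /** Records that the key no longer has a value as of the writer's revision. */
    void remove(K key, long revision) {
        history.computeIfAbsent(key, k -> new ConcurrentSkipListMap<>())
               .put(revision, Optional.empty());
    }

    /** Returns the value visible to a reader whose context carries readRevision. */
    Optional<V> get(K key, long readRevision) {
        ConcurrentSkipListMap<Long, Optional<V>> versions = history.get(key);
        if (versions == null) {
            return Optional.empty();
        }
        // Newest element written at or before the reader's revision, if any.
        Map.Entry<Long, Optional<V>> visible = versions.floorEntry(readRevision);
        return visible == null ? Optional.empty() : visible.getValue();
    }
}
```

A real version would also need to prune elements older than the oldest active reader, and the DnNode tree would still need its own treatment, as noted in the quoted mail.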

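
The plan at the top of this mail (single version number, read lock at txn start, upgrade on cache change, rebuild on abort) can be sketched roughly as below. This is a sketch under assumptions, not the actual implementation: CacheTxnCoordinator, CacheConflictException and the rebuild hooks are hypothetical names. Since java.util.concurrent's ReentrantReadWriteLock cannot upgrade a read lock in place, the sketch releases the read lock and then takes the write lock; the version check covers that window. It also assumes that only a txn that upgraded can have changed a cache, so only such a txn triggers a rebuild on abort.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical name; the plan only says "a conflict exception". */
class CacheConflictException extends RuntimeException {
    CacheConflictException(String message) { super(message); }
}

/** Hypothetical coordinator shared by all logical caches (not an ApacheDS class). */
class CacheTxnCoordinator {
    /** Per-txn state. Locks are thread-owned, so a Txn must stay on its thread. */
    static final class Txn {
        private long seenVersion;
        private boolean holdsRead;
        private boolean holdsWrite;
    }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<Runnable> rebuildHooks = new CopyOnWriteArrayList<>();
    private long version; // the single version number; written only under the write lock

    /** Each interceptor registers how to rebuild its cache from on-disk data. */
    void onAbortRebuild(Runnable hook) { rebuildHooks.add(hook); }

    Txn begin() {
        Txn txn = new Txn();
        lock.readLock().lock();
        txn.holdsRead = true;
        txn.seenVersion = version;
        return txn;
    }

    /** Call before the first cache modification of a txn. */
    void upgradeForCacheChange(Txn txn) {
        if (txn.holdsWrite) { return; }
        if (!txn.holdsRead) { throw new IllegalStateException("txn holds no lock"); }
        // No in-place upgrade: drop the read lock, then wait for the write lock.
        lock.readLock().unlock();
        txn.holdsRead = false;
        lock.writeLock().lock();
        if (version != txn.seenVersion) {
            long now = version;
            lock.writeLock().unlock();
            throw new CacheConflictException(
                    "cache version moved from " + txn.seenVersion + " to " + now);
        }
        txn.holdsWrite = true;
        version++; // bump, then the caller may change the cache
    }

    void commit(Txn txn) { release(txn); }

    void abort(Txn txn) {
        try {
            // Assumption: a txn that never upgraded changed no cache.
            if (txn.holdsWrite) {
                for (Runnable hook : rebuildHooks) { hook.run(); }
            }
        } finally {
            release(txn);
        }
    }

    long currentVersion() {
        lock.readLock().lock();
        try { return version; } finally { lock.readLock().unlock(); }
    }

    private void release(Txn txn) {
        if (txn.holdsWrite) { lock.writeLock().unlock(); txn.holdsWrite = false; }
        if (txn.holdsRead) { lock.readLock().unlock(); txn.holdsRead = false; }
    }
}
```

A txn that gets the conflict exception holds no lock at that point; calling abort on it is still safe and does not trigger a rebuild. Note that the rebuild runs while the aborting thread still holds the exclusive lock, so no reader can see a half-rebuilt cache.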