On Wed, 16 Jan 2008, Tobias Schlitt wrote:

> On 01/16/2008 10:43 AM Derick Rethans wrote:
> > On Tue, 15 Jan 2008, Tobias Schlitt wrote:
>
> >> I started a small design doc for issue #10531 (ezcCacheStorageFile is
> >> inefficient when reading the same cache repeatedly), because it is not
> >> as easy to solve as described in the issue (in terms of consistency).
> >>
> >> Please take a look and comment. Find the document attached or in
> >> Cache/trunk/design-1.4.txt.
>
> > Some comments:
>
> > - I don't think that "store" should store anything in the memory cache.
> >   It wouldn't make much sense because this data is in memory already
> >   anyway, and secondly, I don't think you store and restore the same
> >   cache data in the same request.
>
> That depends. To me this sounds as unusual as restoring one and the same
> cache item several times in the same request. Besides that, if you
> restored the item once, it is in memory, too. Which would mean that we
> do not need this functionality at all.
No, that's slightly different. If you restore it, then it is in the local
memory scope - you have to take care of caching yourself then. So it's
quite different whether restore stores it in the memory cache itself, or
whether you have to do it (like now). Either way, "store" doesn't really
have to store anything in the memory cache. It's fine if it shows up
there once it has been restored, though.

> If we consider the use-case of restoring a cache item multiple times
> during a request as valid (which is why we want to implement this
> feature), we should also consider that data is requested in one part of
> an application, where it is generated and stored in the cache, and that
> other parts request this data later again, where it needs to be restored.

> > - Your proposed memory cache is something specifically implemented for
> >   the file storage backend. And the memory cache is only in-memory for
> >   the duration of one request. Now that we have memcache and apc
> >   caches, wouldn't it make more sense to allow for a fallback cache of
> >   some sort, so that you basically can tie two cache backends
> >   together? That would allow a "fast memory, slow disk" mechanism
> >   (such as you proposed), but also a "fast in-memory, slower
> >   memcached, slow disk" mechanism, or an "APC cache and slow disk"
> >   mechanism. That'd mean that we'd need a normal memory cache backend
> >   too.
>
> What you basically propose is to introduce multi-level caching (as it is
> e.g. done with processor memory caches). While I generally like the idea
> of implementing such a system (for the fun part), I think this will take
> a good portion of work to realize. More about this below. Anyway, for
> systems like e.g. eZ Publish this would make some sense to have.

Yes, that's what I meant - and I think we should have this at some point.
So I'd prefer a design that goes towards this.
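To make the multi-level idea above concrete, here is a rough sketch of a
chained backend in Python (used as illustrative pseudocode only - the class
and method names are invented for this sketch and are not the eZ Components
API): writes go to the slowest, authoritative level, and faster levels are
backfilled lazily on a restore hit.

```python
class DictCache:
    """Trivial in-memory backend, standing in for any storage level."""

    def __init__(self):
        self.data = {}

    def store(self, key, value):
        self.data[key] = value

    def restore(self, key):
        # Returns None on a miss (good enough for the sketch).
        return self.data.get(key)


class ChainedCache:
    """Illustrative multi-level cache: tries fast backends first and
    backfills them when a slower level has the item."""

    def __init__(self, *levels):
        # Levels are ordered fastest to slowest,
        # e.g. (memory, memcached, disk).
        self.levels = list(levels)

    def store(self, key, value):
        # Only the slowest level is written on store; faster levels
        # fill up lazily when the item is actually restored.
        self.levels[-1].store(key, value)

    def restore(self, key):
        for i, level in enumerate(self.levels):
            value = level.restore(key)
            if value is not None:
                # Backfill all faster levels so the next restore is cheap.
                for faster in self.levels[:i]:
                    faster.store(key, value)
                return value
        return None
```

With this shape, a "fast in-memory, slower memcached, slow disk" setup is
just `ChainedCache(memory, memcached, disk)`; the first restore of an item
pulls it up from disk into the faster levels.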
> > - For all in-memory caches (also for apc/memcached backends to some
> >   extent) we should have some sort of mechanism that limits the amount
> >   of memory to be used. I think APC and Memcached have an internal
> >   limit already, but a new in-memory cache does not have easy limits
> >   there. Something like an LRU/LFU mechanism, f.e.
>
> As said before, I like the idea of implementing more complex caching
> stuff, and especially the strategy algorithms, for the fun part of it.
> Anyway, I think this will add much too much complexity, especially for
> this memory cache. The current design is simple and does not slow down
> the file based caches too much.

It's already a problem in eZ Publish, where there is no control over how
many persistent objects are cached in memory. It is quite a limitation,
although I think it's solved in the later versions. I would find this
important.

> If we go for implementing caching strategies like LRU, we need to keep
> track of more data for each cache item (e.g. the last use time) and need
> to implement the selection algorithms, too. The utilized memory of a
> cache item is not easily determinable if you design this kind of cache
> as a general-purpose, multi-level cache. For example: if you restore a
> cache item from APC and store it into the memory cache for faster second
> access, you have no idea how much memory this consumes. It could be an
> array with only an integer element, but also one with millions of
> objects (as an ArrayObject then).

Instead of memory usage, you can of course also limit the *amount* of
cache items.

> In addition, for the idea of multi-level caching, you would need to
> implement the caching strategies for the APC and Memcache storages, too,
> to have this part consistent.

Hmm, that's not a real necessity... as you can just set options on
different caching backends/levels.

> In this sense I would say: Keep this stuff simple.
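Limiting by the number of items rather than by memory keeps the strategy
part small. A count-limited LRU cache can be sketched in very little code
(Python, using the standard library's OrderedDict; the names are mine, not
part of the Cache component):

```python
from collections import OrderedDict

class LRUMemoryCache:
    """In-process cache that keeps at most max_items entries and evicts
    the least recently used entry when the limit is exceeded."""

    def __init__(self, max_items):
        self.max_items = max_items
        self.items = OrderedDict()  # insertion order doubles as recency

    def store(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)      # refresh recency
        self.items[key] = value
        if len(self.items) > self.max_items:
            self.items.popitem(last=False)   # evict least recently used

    def restore(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)          # a hit also refreshes recency
        return self.items[key]
```

The only extra bookkeeping per item is its position in the ordered map, so
the cost over a plain in-memory cache is small - which is the trade-off
that makes an item-count limit much simpler than a byte-size limit.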
> If we go for implementing it, then in a very simple, but efficient way.
> Else we should leave it to the user not to restore the same item many
> times during a request.

I still think it's imperative that the amount of memory/cache items can be
limited in any sort of in-process memory cache.

regards,
Derick

-- 
Components mailing list
[email protected]
http://lists.ez.no/mailman/listinfo/components
