On Mon, 7 Apr 2008, Tobias Schlitt wrote:
> I just finished the first draft of my design for hierarchical (multi
> level) caching in the Cache component. Please take some minutes and
> review my proposal. You can find the RST in SVN under
>
> trunk/Cache/design/design-1.4.txt
Some comments:
> Hierarchical stacking
> ---------------------
[snip]
> - **Bubbling restored data up through the cache stack**
> According to the replacement strategy, cache items need to be placed into
> higher levels of the hierarchy, as soon as they get restored from a deeper
> level, to make them available faster on subsequent restore requests. For
> more information see the `Replacement strategies`_ section.
Sometimes you might not want items stored in a higher stack level, for
example because of their size. Say you want a 1 MB PDF file stored in a
cache on disk, but definitely not in the memory cache. How are we going
to handle this?
> Propagate on replacement
> ^^^^^^^^^^^^^^^^^^^^^^^^
>
> Using this strategy, a newly stored cache item is only put into the top most
> storage in the hierarchy. As soon as it needs to be replaced there, it is
> propagated down one level, before being removed from the higher level cache.
>
> Pros:
>
> - The initial storage of a cache item is faster, since it only affects one
> storage. In addition, this should be the fastest storage (the top most).
> - Purging of an item does only affect 1 single cache.
> - If all storages reached their maximum number of stored items, only a single
> item is bubbled down to the lowest level.
>
> Cons:
>
> - Additional work is to be done on each replacement of a cache item.
Another con:
- If the cache storage disappears (as can happen with an in-memory
storage), the item hasn't been cached in the lower levels yet.
I think we should not implement this at all, but just the "Propagate on
store" variant. It can also make things simpler in the API and thus
provide more performance.
> ezcCacheStack
> -------------
>
> An object of the ezcCacheStack class is the main instance to provide the
> hierarchical stack mechanism. The stack object takes care of managing several
> cache storages, the unified access for storing and restoring cache items and
> the associated objects needed to realize this.
[snip]
> class ezcCacheStack extends ezcCacheStorage
> {
> public function __construct( $location, $options );
> public function store( $id, $data, $attributes = array() );
> public function restore( $id, $attributes, $search );
> public function delete( $id, $attributes, $search );
> public function countDataItems( $id, $attributes );
> public function getRemainingLifetime( $id, $attributes );
>
> public function getStackedCaches();
> }
I miss a method to add a new storage to the stack. I know you mention
this as an option in "ezcCacheStackOptions", but I don't think this fits
as an option easily. I would go for a method that adds a new storage
configuration to the bottom of the "stack" here.
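To illustrate what I mean, here is a rough sketch. The class name
StackSketch and the method name pushStorage() are made up for this mail
and are not part of the design; the ArrayObject instances only stand in
for real storage objects.

```php
<?php
// Hypothetical sketch: class and method names are illustrative only,
// they do not appear in the design document.
class StackSketch
{
    // Storages ordered from top (fastest) to bottom (slowest).
    private $storages = array();

    // Adds a storage below all storages added so far.
    public function pushStorage( $id, $storage )
    {
        if ( isset( $this->storages[$id] ) )
        {
            throw new InvalidArgumentException(
                "Storage '$id' is already part of the stack."
            );
        }
        $this->storages[$id] = $storage;
    }

    // Returns the storages in stack order, top first.
    public function getStackedCaches()
    {
        return $this->storages;
    }
}

$stack = new StackSketch();
$stack->pushStorage( 'memory', new ArrayObject() ); // stand-in for a memory storage
$stack->pushStorage( 'disk', new ArrayObject() );   // stand-in for a file storage
```

The order of the calls then defines the order of the levels, so no
separate 'storages' option would be needed.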
> ezcCacheStackableStorage
> ------------------------
>
> The interface ezcCacheStackableStorage is used to ensure, that storage classes
> that can be stacked implement the necessary functionality. The following
> methods are needed: ::
>
> interface ezcCacheStackableStorage
> {
> restoreMetaInfo();
> storeMetaInfo( array $metaInfo );
>
> purge();
Maybe we can add an option to purge() to clear out the whole cache?
[snip]
> The purge method is needed to make the storage purge all outdated items. In
> case a cache storage runs full (determined by the replacement strategy), first
> all outdated items will be purged, before items are deleted using the original
> strategy. The purge() method needs to return the IDs, attributes and data of
> the purged items to allow the replacement and storage strategy objects to
> update their information.
I am not sure it's very wise to return the data as well. This can be a
lot of data...
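As a rough sketch of the alternative, purge() could return only the IDs
of the items it removed. Everything below is hypothetical: the
PurgeSketch class, its in-memory array and the explicit $now parameter
exist only to make the example self-contained.

```php
<?php
// Hypothetical in-memory storage, only meant to show the return value
// of purge(). It is not one of the storages from the design.
class PurgeSketch
{
    // Map of item ID => array( 'data' => ..., 'expires' => timestamp ).
    private $items = array();

    public function store( $id, $data, $expires )
    {
        $this->items[$id] = array( 'data' => $data, 'expires' => $expires );
    }

    // Removes all outdated items and returns their IDs, not their data.
    public function purge( $now )
    {
        $purgedIds = array();
        foreach ( $this->items as $id => $item )
        {
            if ( $item['expires'] <= $now )
            {
                $purgedIds[] = $id;
                unset( $this->items[$id] );
            }
        }
        return $purgedIds;
    }
}

$storage = new PurgeSketch();
$storage->store( 'a', str_repeat( 'x', 1024 ), 10 );
$storage->store( 'b', str_repeat( 'y', 1024 ), 100 );
$purged = $storage->purge( 50 );
```

The strategy objects could still update their bookkeeping from the IDs
alone, without the item data being copied around.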
> ezcCacheStackOptions
> --------------------
>
> An object of this class is used to configure the cache stack. It extends the
> ezcCacheStorageOptions class, to be compatible with all other mechanisms. The
> 'ttl' and 'extension' options are ignored, because each of the stacked caches
> must be able to implement its own set of options. The following options are
> part of this class:
>
> 'storageStrategy'
> This option contains a class name, which is to be instantiated to perform
> storage operations in the stack. The class must extend the abstract
> ezcCacheStackStorageStrategy class.
> 'storages'
> This option is an array of ezcCacheStackStorageConfiguration objects, that
> will be used to define the cache storages contained in the stack. Per
> default, no storages will be defined. In this case, a call to any of the
> methods defined by ezcCacheStorage will result in an exception.
I don't think we should have either of those options. As I just
mentioned, storages should be maintained through methods. The first one,
storageStrategy, I suggest not having at all, to keep the implementation
simpler. That means no different stack-storage-strategies.
> ezcCacheStackStorageConfiguration
> ---------------------------------
>
> An instance of this struct like class is used in the 'storages' option of
> ezcCacheStackOptions to define the configuration of a single cache storage. It
> contains the 3 parameters necessary to instantiate a new cache storage, as
> well as additional information, needed by the ezcCacheStack instance and its
> aggregated objects. The properties are contained in the class:
[snip]
> 'freeRate'
> This option is an integer that indicates a percentage value. In case the
> cache storage runs full, this amount of items will be removed. This
> mechanism ensures that running full of a cache does not occur too often.
I don't think this should be an integer percentage, but instead just a
floating point number from 0 to 1. This is also mentioned a bit further
here:
> ezcCacheReplacementStrategy
> ---------------------------
[snip]
> The constructor of a replacement strategy receives a ready to use storage
> object, which fulfills all needs by implementing the ezcCacheStackableStorage
> interface. The $limit parameter indicates the maximum number of items to be
> stored in the storage. The free rate is a percentage value, that indicates how
> many items are to be purged, whenever a cache runs full.
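With a float from 0 to 1, the number of items to free becomes a plain
multiplication. The helper below is only a sketch: the name
itemsToFree(), the rounding and the "at least one item" rule are my
assumptions, not something the design specifies.

```php
<?php
// Hypothetical helper, not part of the design document.
function itemsToFree( $limit, $freeRate )
{
    if ( $freeRate <= 0 || $freeRate > 1 )
    {
        throw new InvalidArgumentException(
            'freeRate must be greater than 0 and at most 1.'
        );
    }
    // Assumption: always free at least one item, so that a full
    // storage is guaranteed to make room.
    return max( 1, (int) floor( $limit * $freeRate ) );
}
```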
[snip]
> The restore() method does not work as complex as the store() method does. Its
> algorithm is defined by the following pseudo code: ::
>
> $storage->lock();
> $item = $storage->restore(...);
>
> if ( $item !== false )
> {
> // Notice access to this item in meta information
> $meta = $storage->restoreMetaInfo();
> update( $meta );
> $storage->storeMetaInfo( $meta );
If $meta were an object, or were returned by reference, there would be
no need for storeMetaInfo() right here.
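A small sketch of the object variant, with made-up class names
(MetaSketch, StorageSketch) and a made-up recordAccess() method. It
relies on PHP 5 passing objects by handle. The storage would of course
still have to persist the meta data at some point, for example when the
lock is released.

```php
<?php
// Hypothetical stand-ins; neither class is part of the design document.
class MetaSketch
{
    public $accessCounts = array();

    // Notes one more access to the given item.
    public function recordAccess( $id )
    {
        if ( !isset( $this->accessCounts[$id] ) )
        {
            $this->accessCounts[$id] = 0;
        }
        $this->accessCounts[$id]++;
    }
}

class StorageSketch
{
    private $meta;

    public function __construct()
    {
        $this->meta = new MetaSketch();
    }

    // Objects are passed by handle, so the caller modifies the very
    // instance that the storage holds. No storeMetaInfo() call needed.
    public function restoreMetaInfo()
    {
        return $this->meta;
    }
}

$storage = new StorageSketch();
$storage->restoreMetaInfo()->recordAccess( 'item-1' );
$storage->restoreMetaInfo()->recordAccess( 'item-1' );
```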
One thing I am missing in the design is a discussion of the performance
impact on restoring items from the cache. Perhaps you could add
something about that?
regards,
--
Derick Rethans
eZ components Product Manager
eZ systems | http://ez.no