On Mon, 7 Apr 2008, Tobias Schlitt wrote:

> I just finished the first draft of my design for hierarchical (multi
> level) caching in the Cache component. Please take some minutes and
> review my proposal. You can find the RST in SVN under
> 
> trunk/Cache/design/design-1.4.txt

Some comments:

> Hierarchical stacking
> ---------------------

[snip]

> - **Bubbling restored data up through the cache stack**
>   According to the replacement strategy, cache items need to be placed into
>   higher levels of the hierarchy, as soon as they get restored from a deeper
>   level, to make them available faster on subsequent restore requests. For more
>   information see the `Replacement strategies`_ section.

Sometimes you might not want the items stored in a higher stack level,
for example because of their size. As an example, you want a 1 MB
PDF file stored in a cache on disk, but definitely not in the memory
cache. How are we going to handle this?
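
One possible way to handle this, purely as a sketch: let each storage level
carry a maximum item size and skip bubbling for anything larger. The function
name shouldBubbleUp() and the $maxBytes parameter are my own assumptions, they
are not in the design document.

```php
<?php
// Hypothetical helper, not part of the design: decides whether a restored
// item may be copied ("bubbled") into a faster storage level.
function shouldBubbleUp( $data, $maxBytes )
{
    // serialize() gives a rough size estimate for arbitrary PHP values.
    return strlen( serialize( $data ) ) <= $maxBytes;
}

$pdf   = str_repeat( 'x', 1024 * 1024 );      // stand-in for a 1 MB PDF
$small = array( 'title' => 'Front page' );

var_dump( shouldBubbleUp( $pdf, 64 * 1024 ) );   // bool(false)
var_dump( shouldBubbleUp( $small, 64 * 1024 ) ); // bool(true)
```

The 64 KB limit is only an example value; in practice it would have to be
configurable per storage.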

> Propagate on replacement
> ^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Using this strategy, a newly stored cache item is only put into the top most
> storage in the hierarchy. As soon as it needs to be replaced there, it is
> propagated down one level, before being removed from the higher level cache.
> 
> Pros:
> 
> - The initial storage of a cache item is faster, since it only affects one
>   storage. In addition, this should be the fastest storage (the top most).
> - Purging of an item does only affect 1 single cache.
> - If all storages reached their maximum number of stored items, only a single
>   item is bubbled down to the lowest level.
> 
> Cons:
> 
> - Additional work is to be done on each replacement of a cache item.

Another con: 

- If the cache storage disappears (as can happen with an in-memory
  storage), then the item hasn't been cached in the lower levels yet.

I think we should not implement this at all, but just the "Propagate on
store" variant. It can also make things simpler in the API and thus
provide more performance.

> ezcCacheStack
> -------------
> 
> An object of the ezcCacheStack class is the main instance to provide the
> hierarchical stack mechanism. The stack object takes care of managing several
> cache storages, the unified access for storing and restoring cache items and
> the associated objects needed to realize this.

[snip]

>     class ezcCacheStack extends ezcCacheStorage
>     {
>         public function __construct( $location, $options );
>         public function store( $id, $data, $attributes = array() );
>         public function restore( $id, $attributes, $search );
>         public function delete( $id, $attributes, $search );
>         public function countDataItems( $id, $attributes );
>         public function getRemainingLifetime( $id, $attributes );
> 
>         public function getStackedCaches();
>     }

I am missing a method to add a new storage to the stack. I know you mention
this as an option in "ezcCacheStackOptions", but I don't think it fits
well as an option. I would go for a method that adds a new storage
configuration to the bottom of the "stack" here.
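
Roughly what I have in mind, as a sketch only. The class name StackSketch and
the method name pushStorage() are made up for illustration, and plain strings
stand in for the storage configuration objects:

```php
<?php
// Sketch only: class and method names are hypothetical.
class StackSketch
{
    private $storages = array();

    // Appends a storage configuration to the bottom (slowest end) of the stack.
    public function pushStorage( $configuration )
    {
        $this->storages[] = $configuration;
    }

    // Returns the configurations in the order they were added,
    // top (fastest) first.
    public function getStackedCaches()
    {
        return $this->storages;
    }
}

$stack = new StackSketch();
$stack->pushStorage( 'memory' );
$stack->pushStorage( 'disk' );
print_r( $stack->getStackedCaches() );
```

With a method like this the order of the calls defines the order of the
levels, so no separate 'storages' option would be needed.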

> ezcCacheStackableStorage
> ------------------------
> 
> The interface ezcCacheStackableStorage is used to ensure, that storage classes
> that can be stacked implement the necessary functionality. The following
> methods are needed: ::
> 
>     interface ezcCacheStackableStorage
>     {
>         restoreMetaInfo();
>         storeMetaInfo( array $metaInfo );
> 
>         purge();

Maybe we can add an option to purge() to clear out the whole cache?

[snip]

> The purge method is needed to make the storage purge all outdated items. In
> case a cache storage runs full (determined by the replacement strategy), first
> all outdated items will be purged, before items are deleted using the original
> strategy. The purge() method needs to return the IDs, attributes and data of
> the purged items to allow the replacement and storage strategy objects to
> update their information.

I am not sure that it's very wise to return the data as well. This
can be a lot of stuff...
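
As a sketch of the alternative, purge() could return only the IDs of the
removed items. The in-memory class PurgeSketch below is hypothetical, and I
left out attributes to keep it short:

```php
<?php
// Sketch with a hypothetical in-memory storage; each entry holds the data
// and an absolute expiry timestamp.
class PurgeSketch
{
    private $items = array();

    public function store( $id, $data, $expires )
    {
        $this->items[$id] = array( 'data' => $data, 'expires' => $expires );
    }

    // Removes outdated items and returns only their IDs, not the data.
    public function purge( $now )
    {
        $purged = array();
        foreach ( $this->items as $id => $item )
        {
            if ( $item['expires'] <= $now )
            {
                unset( $this->items[$id] );
                $purged[] = $id;
            }
        }
        return $purged;
    }
}

$storage = new PurgeSketch();
$storage->store( 'old', str_repeat( 'x', 1000 ), 10 );
$storage->store( 'new', 'fresh data', 100 );
print_r( $storage->purge( 50 ) ); // only 'old' has expired at time 50
```

The replacement strategy should be able to update its bookkeeping from the
IDs alone, as far as I can see.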

> ezcCacheStackOptions
> --------------------
> 
> An object of this class is used to configure the cache stack. It extends the
> ezcCacheStorageOptions class, to be compatible with all other mechanisms. The
> 'ttl' and 'extension' options are ignored, because each of the stacked caches
> must be able to implement its own set of options. The following options are
> part of this class:
> 
> 'storageStrategy'
>     This option contains a class name, which is to be instantiated to perform
>     storage operations in the stack. The class must extend the abstract
>     ezcCacheStackStorageStrategy class.
> 'storages'
>     This option is an array of ezcCacheStackStorageConfiguration objects, that
>     will be used to define the cache storages contained in the stack. Per
>     default, no storages will be defined. In this case, a call to any of the
>     methods defined by ezcCacheStorage will result in an exception. 

I don't think we should have either of those options. As I just mentioned,
storages should be maintained through methods. The first one,
storageStrategy, I suggest we drop altogether to keep the implementation
simple. That means no different stack-storage-strategies.

> ezcCacheStackStorageConfiguration
> ---------------------------------
> 
> An instance of this struct like class is used in the 'storages' option of
> ezcCacheStackOptions to define the configuration of a single cache storage. It
> contains the 3 parameters necessary to instantiate a new cache storage, as well
> as additional information, needed by the ezcCacheStack instance and its
> aggregated objects. The properties are contained in the class:

[snip]

> 'freeRate'
>     This option is an integer that indicates a percentage value. In case the
>     cache storage runs full, this amount of items will be removed. This
>     mechanism ensures that running full of a cache does not occur too often.

I don't think this should be an integer percentage, but instead just a
floating point value from 0 to 1. This is also mentioned a bit further here:

> ezcCacheReplacementStrategy
> ---------------------------

[snip]

> The constructor of a replacement strategy receives a ready to use storage
> object, which fulfills all needs  by implementing the ezcCacheStackableStorage
> interface. The $limit parameter indicates the maximum number of items to be
> stored in the storage. The free rate is a percentage value, that indicates how
> many items are to be purged, whenever a cache runs full.
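
To illustrate the free rate as a fraction between 0 and 1, here is a small
sketch. The function name itemsToFree() is hypothetical, and freeing at least
one item is an assumption of mine, not something the design states:

```php
<?php
// Hypothetical helper: number of items to remove when the storage is full.
function itemsToFree( $limit, $freeRate )
{
    if ( $freeRate < 0 || $freeRate > 1 )
    {
        throw new InvalidArgumentException( 'freeRate must be between 0 and 1.' );
    }
    // Assumption: always free at least one item so that a store can succeed.
    return max( 1, (int) round( $limit * $freeRate ) );
}

echo itemsToFree( 1000, 0.1 ), "\n"; // 100
echo itemsToFree( 3, 0.1 ), "\n";    // 1
```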

[snip]

> The restore() method does not work as complex as the store() method does. Its
> algorithm is defined by the following pseudo code: ::
> 
>     $storage->lock();
>     $item = $storage->restore(...);
> 
>     if ( $item !== false )
>     {
>         // Notice access to this item in meta information
>         $meta = $storage->restoreMetaInfo();
>         update( $meta );
>         $storage->storeMetaInfo( $meta );

If $meta were an object, or were returned by reference, there would be no
need for storeMetaInfo() right here.
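
A minimal sketch of what I mean, with hypothetical class names (the design
passes $metaInfo around as an array):

```php
<?php
// Hypothetical meta info object, for illustration only.
class MetaInfoSketch
{
    public $accessCounts = array();
}

class StorageSketch
{
    private $meta;

    public function __construct()
    {
        $this->meta = new MetaInfoSketch();
    }

    // Objects are passed by handle in PHP 5, so the caller gets the same
    // instance that the storage holds, not a copy.
    public function restoreMetaInfo()
    {
        return $this->meta;
    }
}

$storage = new StorageSketch();
$meta = $storage->restoreMetaInfo();
$meta->accessCounts['item-1'] = 1;

// The change is visible through the storage without a storeMetaInfo() call.
var_dump( $storage->restoreMetaInfo()->accessCounts );
```

This only covers the in-memory side. The storage would still have to write
the meta information to its backend at some point, for example on unlock.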

One thing that I am missing in the design is a discussion of the performance
impact on restoring items from the cache. Perhaps you could add something
about that?

regards,
-- 
Derick Rethans
eZ components Product Manager
eZ systems | http://ez.no
-- 
Components mailing list
[email protected]
http://lists.ez.no/mailman/listinfo/components
