On 24/01/2012 6:16 p.m., Alex Rousskov wrote:
On 01/23/2012 07:24 PM, Amos Jeffries wrote:
This is just a discussion at present for a checkup and possibly
long-term re-design of the overall Architecture for store logics. So the
list of SHOULD DO etc will contain things Squid already does.

This post is prompted by
http://bugs.squid-cache.org/show_bug.cgi?id=3441 and other ongoing hints
about user frustrations on the help lists and elsewhere.

Cutting to the chase;

  Squid's existing methods of startup cache loading and error recovery are
slow, with side-effects impacting bandwidth and end-user experience in
various annoying ways. The swap.state mechanism speeds loading up
enormously compared to the DIRTY scan, but in some cases is still too
slow.


Ideal Architecture;

Squid starts with assumption of not caching.
I believe you wanted to say something like "Squid starts serving requests
possibly before Squid loads some or all of the cache contents, if any".
Caching includes storing, loading, and serving hits. An ideal
architecture would not preclude storing and serving even if nothing was
loaded from disks yet. I believe you already document that below, but
the above sentence looks confusing/contradicting to me.

I mean something a bit more extreme than that. Squid being prepared to serve requests before possibly even getting to the async call which initializes the first cache area. We often think of cache_mem as being always present when any caching is done, but there really is no such guarantee. Admin can already configure several cache_dir and "cache_mem 0". The problem is just that today's Squid does some horrible things when configured that way, as side effects of our current design assumptions.


Cache spaces are loaded as
soon as possible with priority to the faster types. But loaded
asynchronously to the startup in a plug-n-play design.
Yes. Since loading a cache requires system resources, the faster we try
to load, the slower we can serve the regular traffic while we load the
cache. Some stores will need options to control how aggressive the
asynchronous load is. SMP helps, of course, but it does not solve the
problem completely in many cases.

Also, an ideal Store should accept (or at least gracefully reject) new
entries while its storage is being loaded. There should be no global "we
are loading the cache, use special care" code in main Squid.

Agreed. This is good wording for (6) below.


1) Requests are able to be processed at all times, but storage ability
will vary independent of Squid operational status.
+ minimal downtime to first request accepted and responded
- lost all or some caching benefits at times

2) cache_mem shall be enabled by default and first amongst all caches
+ reduces the bandwidth impact from (1) if it happens before first request
+ could also be set up async while Squid is already operating (keeps the
pro from (1) while minimising the con)
Sure.

3) possibly multiple cache_mem. A traditional non-shared cache_mem, a
shared memory space, and an in-transit unstructured space.
In-transit space is not a cache so we should not mix it and cache_mem in
an "ideal design" blueprint. Collapsed forwarding requires caching and
has to go through cache_mem, not in-transit space.

So what do you propose for collapsables which are too large for cache_mem? Or when "cache_mem 0" is set?


+ non-shared cache_mem allows larger objects than possible with the
shared memory.
+ separate in-transit area allows collapsed forwarding to occur for
incomplete but cacheable objects
   note that private and otherwise non-shareable in-transit objects are a
separate thing not mentioned here.
- maybe complex to implement, and long-term plans to allow paging
mem_node pieces of large files should obsolete the shared/non-shared split.
Indeed. I do not see any compelling reasons to have shared _and_
non-shared caches at the same time. In the ideal design, the shared
cache will be able to store large objects, eliminating the need for the
non-shared cache. Please keep in mind that any non-shared cache would
violate HTTP in an SMP case.

You have yet to convince me that the behaviour *is* a violation. Yes, the objects coming back are not identical to the pattern of a traditional Squid. But the new pattern is still within HTTP semantics IMO, in the same way that two proxies on anycast don't violate HTTP. The cases presented so far have been about side effects of already bad behaviour getting worse, or bad testing assumptions.


In-transit space does not need to be shared (but it is separate from
caching as discussed above).


4) config load/reload at some point enables a cache_dir
+ being async means we are not delaying the first response waiting for
potentially long, slow disk processes to complete
- creates a high MISS ratio during the wait for these to be available
- adds CPU and async event queue load on top of active traffic loads,
possibly slowing both traffic and cache availability
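One way (4) could avoid delaying responses is an incremental scan driven from the main event loop, validating a bounded batch of entries per tick so regular traffic runs in between. A minimal sketch, assuming hypothetical names (ScanJob, processBatch are illustrative, not actual Squid APIs):

```cpp
// Illustrative incremental cache_dir scan: validate a small batch of
// entries per event-loop tick, then yield so request processing is
// never blocked behind the full disk scan.
#include <cassert>
#include <vector>

struct ScanJob {
    std::vector<int> pendingFilenos;   // entries still to validate
    int validated = 0;

    // Validate at most `batch` entries, then yield. Returns true when
    // more work remains, i.e. the event loop should reschedule us.
    bool processBatch(int batch) {
        while (batch-- > 0 && !pendingFilenos.empty()) {
            pendingFilenos.pop_back(); // stand-in for on-disk validation
            ++validated;
        }
        return !pendingFilenos.empty();
    }
};
```

The CPU/queue-load con from (4) shows up here as the choice of batch size: bigger batches make the cache available sooner but steal more time from active traffic per tick.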

5) cache_dir maintains distinct (read,add,delete) states for itself
+ this allows read-only (1,0,0) caches, read-and-retain (1,1,0) caches
+ also allows old storage areas to be gracefully deprecated using
(1,0,1) with object count decrease visibly reporting the progress of
migration.
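The (read,add,delete) triple from (5) might look like this in C++-ish sketch form (purely illustrative, not actual Squid code):

```cpp
// Per-cache_dir capability states: (read,add,delete) as independent
// flags, giving read-only (1,0,0), read-and-retain (1,1,0) and
// gracefully-deprecating (1,0,1) directories.
#include <cassert>

struct DirMode {
    bool canRead = true;
    bool canAdd = true;
    bool canDelete = true;

    bool readOnly() const { return canRead && !canAdd && !canDelete; }
    // A draining dir still serves hits and releases objects but takes
    // no new ones; its shrinking object count reports migration progress.
    bool draining() const { return canRead && !canAdd && canDelete; }
};
```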

6) cache_dir structure maintains a "current" and a "max" available
fileno setting.
     current always starts at 0 and ranges up to max; max is whatever
swap.state, a hard-coded value, or another appropriate source tells
Squid it should be.
+ allows scans to start with caches set to full access, but limit the
area of access to a range of already scanned fileno between 0 and current.
+ allows any number of scan algorithms beyond CLEAN/DIRTY and while
minimising user visible impact.
+ allows algorithms to be switched while processing
+ allows growing or shrinking cache spaces in real-time
Linear fileno space prevents many optimizations. I would not require it.
FWIW, Rock store does not use linear fileno space.

Okay. So something else: a non-linear map or tree of tri-state values (quad-state, whatever): used, open, unchecked.

The linear approach above is simple, but imposes sequential limits on the scan.
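A sketch of that non-linear tri-state map, with all names invented for illustration: each slot is unchecked, open, or used, so a scan can validate slots in any order while hits are already served from the validated ones.

```cpp
// Illustrative tri-state slot map: absent entries are Unchecked, and a
// hit may only be served from a slot the scan has marked Used.
#include <cassert>
#include <unordered_map>

enum class Slot { Unchecked, Open, Used };

class SlotMap {
    std::unordered_map<int, Slot> slots_; // fileno -> state
public:
    Slot state(int fileno) const {
        auto it = slots_.find(fileno);
        return it == slots_.end() ? Slot::Unchecked : it->second;
    }
    void mark(int fileno, Slot s) { slots_[fileno] = s; }
    // Hits come only from already-validated slots, regardless of the
    // order in which the scan reached them.
    bool hittable(int fileno) const { return state(fileno) == Slot::Used; }
};
```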



7) cache_dir scan must account for corruption of individual files, the
index entries, and any meta-data construct like swap.state
Yes, ideally.
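For (7), the key property is that a corrupt record is skipped rather than fatal. A sketch of per-entry validation, with invented field names and constants (not the real swap.state layout):

```cpp
// Illustrative swap.state-style entry validation: reject internally
// inconsistent records instead of aborting the whole scan.
#include <cassert>
#include <cstdint>

struct SwapEntry {
    uint32_t magic;    // expected constant marker
    uint64_t size;     // object size recorded in meta data
    uint64_t offset;   // location within the cache_dir
};

constexpr uint32_t kEntryMagic = 0x53515549;   // illustrative marker
constexpr uint64_t kDirCapacity = 1ULL << 30;  // 1 GB example cache_dir

bool validEntry(const SwapEntry &e) {
    if (e.magic != kEntryMagic)
        return false;                        // index corruption
    if (e.size == 0 || e.offset + e.size > kDirCapacity)
        return false;                        // impossible geometry
    return true;
}
```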


8) cache_dir scan should account for externally added files, regardless
of the CLEAN/DIRTY algorithm being used.
    by this I mean check for and handle (accept or erase) cache_dir
entries not accounted for by the swap.state or equivalent meta data.
+ allows reporting what action was taken about the extra files. Be it
erase or import and any related errors.
I think this should be left to individual Stores. Each may have their
own way of adding entries. For example, with Rock Store, you can add
entries even at runtime, but you need to update the shared maps
appropriately.

How does Rock recover from a third-party insertion of a record at the correct place in the backing DB, followed by a shutdown? Erase the slot? Overwrite with something later? Load the object details during restart and use them?

For now it is perfectly possible to inject entries into UFS and COSS (at least) provided one knows the storage structure and is willing to cope with a DIRTY restart.

Anything else?
0) Each Store (including memory cache) should have its own map. No
single global store_table and no assumptions on how a store maintains
its map. Main Squid code just needs to be able to add/search/remove and
summarize entries.
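Point (0) amounts to a small per-store interface. A sketch of what "add/search/remove and summarize" could look like, with all names illustrative (there is no such class in Squid):

```cpp
// Illustrative per-Store map interface: main code talks to this, with
// no global store_table and no assumptions about internal layout.
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

class StoreMapIface {
public:
    virtual ~StoreMapIface() = default;
    virtual bool add(const std::string &key) = 0;
    virtual bool has(const std::string &key) const = 0;
    virtual void remove(const std::string &key) = 0;
    virtual size_t entryCount() const = 0;  // the "summarize" part
};

// One possible implementation; Rock, UFS and the memory cache would
// each supply their own, with their own internal structures.
class SimpleMap : public StoreMapIface {
    std::map<std::string, bool> entries_;
public:
    bool add(const std::string &key) override {
        return entries_.emplace(key, true).second;
    }
    bool has(const std::string &key) const override {
        return entries_.count(key) != 0;
    }
    void remove(const std::string &key) override { entries_.erase(key); }
    size_t entryCount() const override { return entries_.size(); }
};
```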

Oops. Yes. Thanks. I kind of assumed it for (1).



Thank you,

Alex.
