[Agavi-Dev] Finally: Agavi has Caching!

David Zülke Wed, 07 Feb 2007 12:27:00 -0800

Hi guys,

sorry I didn't get to write this mail any sooner...


Agavi finally has caching!


It works like this:
- you create a Foo.xml in module/Blah/cache/ to make FooAction cacheable
- you put your settings and rules in there
- you lean back and enjoy the speedup


Now let's examine the options in detail.

First, these caching configs have the usual <configurations> and
<configuration> elements at the top level. You should probably enable
caching for the "production" environment only, since the caches the
system writes get thrown away in debug mode anyway.

Next comes <cachings>, which is optional, like most plural tags that
do not require an attribute. Inside it we get to the more exciting
part: the <caching> element. Let's have a look:

<caching method="read" lifetime="2 hours" enabled="false">

The "method" attribute is optional. It may contain a space-separated  
list of request method names that the caching definition is valid for.
You could set up caching for "read" for a complex form, for example,  
to speed things up a bit.

The "lifetime" attribute is optional, too. If omitted (omitted means  
omitted, not lifetime="" !), the cache will be stored forever. You  
can use any relative format allowed by the GNU date input formats
(http://www.gnu.org/software/tar/manual/html_node/tar_115.html#SEC115,
http://php.net/manual/en/function.strtotime.php). Examples:
"1 day 6 hours"
"3 minutes 14 seconds"
Note that you can NOT use "thursday 02:00" and similar formats: on a
Thursday after 02:00, this would resolve to the same day and not to
the Thursday of the following week. This is a strtotime limitation,
but we might add a fix for that in the future.

Last but not least, the "enabled" attribute. It is also optional and
defaults to true.
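
Put together, a cache config for FooAction might be sketched like
this. The file location follows the steps above. The "environment"
attribute on <configuration> and its value are an assumption about how
the usual environment restriction is written, so check it against your
other config files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- module/Blah/cache/Foo.xml: makes FooAction cacheable -->
<configurations>
  <!-- assumption: the environment is selected with an "environment" attribute -->
  <configuration environment="production">
    <cachings>
      <!-- cache "read" requests for 2 hours; "enabled" is left out and defaults to true -->
      <caching method="read" lifetime="2 hours">
        <!-- groups, views, action attributes and output types go here -->
      </caching>
    </cachings>
  </configuration>
</configurations>
```

Leaving out the "lifetime" attribute entirely would keep the cache
forever; setting enabled="false" switches this definition off without
deleting it.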


Inside a <caching> block, you may define the following elements:
- <groups> (can be omitted) with <group> elements inside.
- <views> (can be omitted) with <view> elements containing names of
views to cache.
- <action_attributes> (can be omitted) with <action_attribute>
elements containing the names of action attributes to restore when
serving a page from cache; these will be available in the view's
initialize() method.
- <output_types> (can be omitted) with <output_type> elements that  
define the layers, slots, request attributes and template variables  
to cache.
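
As a rough outline, a <caching> block using all four kinds of children
could look like the sketch below. The element names are the ones
listed above; the attribute name "product_id" is made up for
illustration, and the individual elements are explained further down.

```xml
<caching method="read" lifetime="1 day 6 hours">
  <groups>
    <group>products</group>
  </groups>
  <views>
    <view>Success</view>
  </views>
  <action_attributes>
    <!-- hypothetical action attribute, restored for the view's initialize() -->
    <action_attribute>product_id</action_attribute>
  </action_attributes>
  <output_types>
    <output_type name="html">
      <!-- layers, request attributes and template variables go here -->
    </output_type>
  </output_types>
</caching>
```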

Important: even when served from a cache, the action and view will  
still be initialize()d! This is because a view's initialize() method  
could change the container's output type. If you need an action  
attribute for that, specify it using <action_attribute>.

WARNING: Be careful what you include in the cache. <action_attribute>  
as well as <request_attribute> and <template_variable> (more on these  
in a minute) can be used to include such items in a cache and restore  
them when the cache is hit. You might need this in case of request  
attributes to pass information to a global filter, for instance. But  
always avoid caching objects, especially models, Propel rows etc.
The data is serialized, and in the case of models, that would mean
that the entire context, and thus ALL objects in the framework, are
included in the serialization. If you absolutely have to cache
objects, implement __sleep() and __wakeup() methods that remember the
context name and exclude the context itself from serialization.

Let's talk about the <view> elements first. If you don't specify any
<view>, all views will be allowed to be cached (an empty
<views></views>, however, means NO views will be cached, so be
careful!). You can either give the name of an action's view as you
would when returning it from the action, or the full name of a view
including its module:
<view>Success</view>
<view module="Admin">AddProductInput</view>

WARNING: You usually don't want to cache Error views, only Success.  
Caching Error views would mean that attackers can quickly fill the  
hard drive of your server by requesting random, invalid pages. You  
know the drill.
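
Following that advice, a view list that caches only the Success view
might be sketched as:

```xml
<views>
  <!-- only Success is cached; Error and any other views are rendered on every request -->
  <view>Success</view>
</views>
```

Compare this with leaving <views> out entirely (every view may be
cached) and with an empty <views></views> (nothing is cached).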

But now to the most important element: the <group>. Groups work like  
in Smarty, so I recommend you read
http://smarty.php.net/manual/en/caching.groups.php to understand the
fundamental concept.

Groups may also have a source. You are not limited to a fixed string  
as a group value. You can use
- request parameters (very useful and often needed)
- request attributes (also from namespaces)
- constants
- the current locale
- user parameters
- user attributes (also from namespaces)
- user authenticated status
- user credential (or, rather, if the user has the credential or not)

I think it's best to give an example here. Let's say we want to cache  
the page for viewing a product (Default module, ViewProductAction).  
Each product has an ID, passed in via the request parameter "id". You  
have i18n in your app, so we need separate caches for each locale.  
And finally, authenticated users with the credential "reseller" see a  
special price, so we need different caches for these users:
<groups>
   <group source="request_parameter">id</group>
   <group source="locale" />
   <group source="user_credential">reseller</group>
</groups>

For product ID "3", locale "de" and non-resellers, this would put the  
cache into:

app/
        cache/
                content/
                        3/
                                de/
                                        0/
                                                Default_ViewProduct/

In reality, these directory names are base64-encoded. The last folder  
will contain a file "4-8-15-16-23-42.cefcache" containing the action  
information, and one file for each cached output type (e.g.  
"html.cefcache").

Now, it is difficult to clear such a cache (more on that later). You  
might want to cache categories. These have IDs, too. You'd get  
collisions. Not good. Besides, you cannot easily clear the cache for  
all products. So we should add another group at the beginning of the  
list:
<group>products</group>

Again, remember that the wrapping <groups> element is optional.
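
With the fixed "products" group added in front, the complete group
list for the product page would be:

```xml
<groups>
  <!-- fixed string: keeps product caches apart from e.g. category caches -->
  <group>products</group>
  <group source="request_parameter">id</group>
  <group source="locale" />
  <group source="user_credential">reseller</group>
</groups>
```

Going by the directory layout shown earlier, the cache for product
"3", locale "de" and a non-reseller would then presumably end up under
products/3/de/0/Default_ViewProduct/ (again, with encoded directory
names). A category page could use <group>categories</group> as its
first group in the same way; that name is only an illustration.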

That was easy, wasn't it?

But that doesn't cache anything yet. We need <output_type> elements,  
too (yep, the <output_types> container is optional).

An <output_type> may have an optional "name" attribute to restrict  
the rules inside to that output type (like the "method" attribute on
<caching>, it can have a space-separated list of output type names).

Inside, the following elements may occur:
- <layers> (can be omitted) with a list of <layer> children defining  
which layers to cache
- <request_attributes> (can be omitted) with a list of  
<request_attribute> elements containing names of request attributes  
to store in the cache
- <template_variables> (can be omitted) with a list of  
<template_variable> elements holding names of template variables that  
are stored in the cache (and then available to all layers that aren't
cached and thus rendered even on a cache hit)
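
To show how these pieces nest, here is a sketch of an <output_type>
for "html" that uses all three children. The values are the ones used
in the examples in this mail; whether you need each of them depends on
your view.

```xml
<output_types>
  <output_type name="html">
    <layers>
      <layer name="content" />
    </layers>
    <request_attributes>
      <request_attribute namespace="org.agavi.filter.FormPopulationFilter">populate</request_attribute>
    </request_attributes>
    <template_variables>
      <template_variable>_title</template_variable>
    </template_variables>
  </output_type>
</output_types>
```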

I'll explain <request_attribute> first:

<request_attribute namespace="org.agavi.filter.FormPopulationFilter">populate</request_attribute>

This will store the attribute "populate" from the namespace
"org.agavi.filter.FormPopulationFilter" and restore it when the cache  
is read. In our example, this attribute would contain a  
ParameterHolder object with fields to populate. Obviously, you should  
only use that for the "read" request method to fill initial values  
into your form. A bit of a stupid example, since you'll often have  
default values (like "[EMAIL PROTECTED]" or a checkbox pre-selected)  
in the template itself, but who knows. You get the idea.

Next, layers. For this example, we assume that our current view has
two layers - "content" and "decorator".

To cache everything, you simply don't define any layers.

To cache only the "content" layer, you'd do
<layer name="content" />
Then this layer and all layers inside will be cached. The "decorator"  
layer would still be rendered on each request.

That gets you into trouble - you have the "_title" template variable
you want to output in the decorator, in the HTML <title> element. But
since the view is not executed anymore when a cache hit occurs, we
have to tell the caching engine to store this template variable along
with the content and then restore it before the decorator is rendered.
Easy task:
<template_variable>_title</template_variable>

Now back to the layers. Instead of
<layer name="content" />
you could also have done
<layer name="decorator" include="false" />

That would exclude the "decorator" layer itself from the cache, and
include all layers inside it. Surprise surprise, that is actually the
better way of doing it! Simple reason: your view _might_ insert
another intermediate layer (let's call it "wrapper") between "content"
and "decorator". That layer should be cached, too (at least we assume
that here), but setting the "content" layer as the cacheable layer
wouldn't achieve that - only "content" and the layers inside it would
be cached. If we define "decorator" to be the last layer before the
cache kicks in, though, we'll achieve that goal and have every layer
inside "decorator" in the cache.
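
In context, that recommended setup could be sketched like this,
combining the excluded decorator with the "_title" template variable
it needs on a cache hit:

```xml
<output_type name="html">
  <layers>
    <!-- "decorator" itself is rendered on every request; every layer inside
         it ("content", plus "wrapper" if the view adds one) is cached -->
    <layer name="decorator" include="false" />
  </layers>
  <template_variables>
    <!-- restored before the uncached decorator is rendered -->
    <template_variable>_title</template_variable>
  </template_variables>
</output_type>
```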

Now let's assume we have that setup, but we do NOT want the "wrapper"  
layer in the cache. Then we could do:
<layer name="decorator" include="false" />
<layer name="wrapper" include="false" />

But now you might ask "hey isn't that stupid, why do I have to
declare them both as non-cacheable?", and you're right to insist:
<layer name="wrapper" include="false" />
does exactly the same thing.

BUT

What about slots?

Here's the nice thing. Obviously, slots set on a layer are included  
in the layer's cache, since the whole layer output is cached, so the  
slot output will already be included. But if you have include="false"  
on a layer, then the slots in it will NOT be included in the cache,  
since the layer is rendered each time.

But of course, we can include slots in a cache, even if their layer  
is not cacheable:
<layer name="decorator">
   <slot>menu</slot>
</layer>

This will include the "menu" slot in the cache.

Note: I omitted the "include" attribute, since declaring slots inside  
a <layer> automatically sets include to false (unless you provide it  
and set it to "true", of course).

Note2: I didn't specify <slots>, but you can, if you like.

Note3: The order of layers in the configuration does not matter,  
unlike in the layout configuration in output_types.xml.

And now the above example with duplicate layers makes sense again:
<layer name="decorator">
   <slot>menu</slot>
</layer>
<layer name="wrapper" include="false" />

Of course, every layer can specify slots it wants to cache. Also, if  
you do not want to cache a slot inside the calling cache, you can of  
course set up a caching xml config for the slot action itself.
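
To tie the pieces together, here is a sketch of what a complete cache
config for the ViewProductAction example could look like. The file
path follows the Foo.xml pattern from the top of this mail, and the
lifetime is an arbitrary choice; everything else is assembled from the
fragments above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- module/Default/cache/ViewProduct.xml (path assumed from the Foo.xml pattern) -->
<configurations>
  <configuration>
    <cachings>
      <caching method="read" lifetime="2 hours">
        <groups>
          <group>products</group>
          <group source="request_parameter">id</group>
          <group source="locale" />
          <group source="user_credential">reseller</group>
        </groups>
        <views>
          <view>Success</view>
        </views>
        <output_types>
          <output_type name="html">
            <layers>
              <!-- decorator is rendered each time, but its "menu" slot is cached -->
              <layer name="decorator">
                <slot>menu</slot>
              </layer>
              <!-- wrapper is rendered each time; "content" inside it is cached -->
              <layer name="wrapper" include="false" />
            </layers>
            <template_variables>
              <template_variable>_title</template_variable>
            </template_variables>
          </output_type>
        </output_types>
      </caching>
    </cachings>
  </configuration>
</configurations>
```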

Tip: Agavi stores cookies with a lifetime, not with an expiry time.  
Thus, you can cache actions/views that set a cookie on their  
response, it's not a problem and works just fine! The same goes for  
all other response headers you set. They are all included in the  
cache and restored afterwards. You can even set a stream as the
response content, e.g.
$this->getResponse()->setContent(fopen('/path/to/image.png', 'rb'));
That file will be re-opened when the cache is read and, like all
streams set in the response, output using fpassthru() for maximum
performance.

I hope that helps a bit. I'm sure there will be many questions from  
now, and of course, caching will be covered extensively in the  
documentation, on which we will focus from now on. Just shoot a mail
to the users list if you find something confusing, or drop by on the
IRC channel.

Cheers,


David


P.S: yup, RC2 is really coming tonight! It's only a matter of hours now.

_______________________________________________
Agavi Dev Mailing List
[email protected]
http://lists.agavi.org/mailman/listinfo/dev
