This writeup was taken from the write-board we started with Akara
quite a few months ago. We tried to do what made sense for a Rails
application using page/action/fragment caching. Much of the event
detail is fragment cached, whereas the home page is page cached for
non-logged-in users and fragment cached for logged-in users. Are you
doing a similar split?
Running with low db load is not invalid; many applications are
attempting to do just that. I'm surprised it made such a large
difference. What percentage of hits are homepage hits for
non-logged-in users?
I think Akara's idea is a good one. We should also try to stress other
components. I think a memcached-heavy load would be interesting to
test. We can create a special branch of the Rails application that
caches the thumbs in memcached; this is no problem. Currently we're
using the proxy server to serve the images. Aren't you doing the same
with Apache?
I was thinking of setting nginx up to use the memcached module (it is
reported to give a 4x improvement). This may be relevant for your
tests as well:
http://wiki.codemongers.com/NginxHttpMemcachedModule
http://www.igvita.com/2008/02/11/nginx-and-memcached-a-400-boost/
As Shanti said, additional input would be greatly appreciated!
- Will
On Jan 23, 2009, at 8:56 AM, Shanti Subramanyam wrote:
Thanks Will. At this point, the PHP app is only caching the home
page. We are wondering whether to even do the Event Detail page as
the load on the database has been drastically cut down just from the
home page caching.
It's a dilemma - if we cache too much, there is no load on the db.
If we cache too little, there is nothing much in memcached. Of
course, if we run at a much larger scale (say, tens of systems for the
web tier), then I'm sure we'll see increasing load on both tiers.
But practically speaking, we need to be able to run a reasonable
configuration.
Akara has another idea to use memcached more heavily, while at the
same time not reducing the db load. Namely, cache the thumbnails in
it. This will also reduce the load on the filestore (which currently
is quite heavily stressed for the PHP app). But this strategy won't
work for the Rails app, will it? I believe you're serving all
static files out of the proxy server?
Would love to hear what others think as well.
Shanti
William Sobel wrote:
On Jan 22, 2009, at 5:11 PM, Shanti Subramanyam wrote:
Can you please elaborate on what exactly is cached? How is the
cache managed (in terms of timeouts etc.)?
From the original writeup:
Cache Strategy for Web20Kit
Home Page
The home page will be cached in the following forms:
1. Cached as a whole page accessed by users arriving at the site
and users that are not logged on.
2. Cached as a page fragment, just for the content part. The page
will be constructed from the dynamic header which contains the user
name of the current user and the cached content fragment.
3. Paginations – these will be cached up to 5 pages. It is less
likely for users to search for events beyond the fifth page.
Expiration and re-generation
The home page will expire every 120 seconds. Then the page will be
re-generated by one of the first requests arriving after the
expiration. To prevent all requests arriving after the expiration
from re-generating, thus causing a stampede phenomenon, we will use
a lock/semaphore control mechanism as follows:
1. The home page and/or home page fragment is cached with no
timeout or a very large timeout (in the order of magnitude of days)
in memcached.
2. For each cached page, a small semaphore object is placed into
memcached with a timeout of 120 seconds – the regeneration cycle.
3. After accessing the page/fragment in the cache and sending the
response to the user, the cache client (web server) checks to see
whether the semaphore is there or has timed out. If it is not there
(timed out), the client will attempt to re-generate the page or
fragment.
4. To prevent a stampede, the client ‘adds’ a lock entry into the
cache. If the add succeeds, this thread has the lock. The lock
times out after 20 seconds using the memcached timeout mechanism.
This prevents a thread from holding a lock indefinitely.
5. After obtaining the lock, the thread generates the page or
fragment and replaces the copy in memcached.
6. Then the generating thread places a new semaphore object with
the same timeout period and removes the lock object.
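
The six steps above can be sketched as follows. This is only an
illustration under stated assumptions, not the Web20Kit code: the
FakeMemcached class is a hypothetical in-memory stand-in for a
memcached client, and the key suffixes (":fresh" for the semaphore,
":lock" for the lock), the function name and the cold-cache branch
are not from the writeup.

```python
import time

class FakeMemcached:
    """Hypothetical in-memory stand-in for a memcached client."""
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def _live(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        if entry[1] is not None and self._clock() >= entry[1]:
            del self._data[key]  # entry has timed out
            return None
        return entry

    def get(self, key):
        entry = self._live(key)
        return entry[0] if entry else None

    def set(self, key, value, ttl=0):
        # ttl=0 means "no timeout", as in step 1
        self._data[key] = (value, self._clock() + ttl if ttl else None)

    def add(self, key, value, ttl=0):
        # Succeeds only if the key is absent; this is what makes it a lock
        if self._live(key) is not None:
            return False
        self.set(key, value, ttl)
        return True

    def delete(self, key):
        self._data.pop(key, None)

REGEN_CYCLE = 120   # seconds, semaphore timeout (step 2)
LOCK_TIMEOUT = 20   # seconds, lock timeout (step 4)

def serve_home_page(cache, render, key="home"):
    """Return the page to send; regenerate in the cache if it is due."""
    page = cache.get(key)
    if page is None:
        # Cold cache. Not covered by the writeup; assumed here to
        # generate the page directly.
        page = render()
        cache.set(key, page)                       # step 1: no timeout
        cache.set(key + ":fresh", 1, REGEN_CYCLE)  # step 2
        return page
    # Step 3: the (possibly stale) page is what the user gets. Then
    # check whether the semaphore has timed out.
    if cache.get(key + ":fresh") is None:
        # Step 4: only the thread whose add succeeds regenerates.
        if cache.add(key + ":lock", 1, LOCK_TIMEOUT):
            try:
                cache.set(key, render())                   # step 5
                cache.set(key + ":fresh", 1, REGEN_CYCLE)  # step 6
            finally:
                cache.delete(key + ":lock")                # step 6
    return page
```

Note that the request which triggers regeneration still returns the
old copy. Only later requests see the new one, which is what keeps
the page available throughout the regeneration.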
Event Detail Page
The event detail page is cached as both content and, if not logged
on, the whole page as well.
Expiration and re-generation
Event detail page cache entries have a timeout of 30 seconds using
the cache timeout mechanism of memcached. Thus only frequently
accessed events will remain in the cache. The load generator will
need to be designed to access event detail pages in a non-uniform
manner, too. We will use a locking mechanism for the event detail
page in a similar manner to the home page. However, we will not use
an expiry semaphore and let the page expire from the cache as a
whole. Access to the entry should however renew the expiry time so
that frequently accessed events will stay in cache. The mechanism
will work as follows:
1. The event detail page and fragment are cached with a timeout of
30 seconds.
2. As a cache client needs to access the entry, it will try to read
the entry from the cache. If the entry is available, it will extend
the cache timeout. Otherwise, the event detail page is generated
from the database.
3. To regenerate the page and prevent stampede, the client ‘adds’ a
lock entry into the cache. If the add succeeds, this thread has the
lock. The lock times out after 20 seconds using the memcached
timeout mechanism. This prevents a thread from holding a lock
indefinitely.
4. After obtaining the lock, the thread proceeds with generating
the page. After completion, the page gets placed into the cache and
the lock gets removed from memcached.
5. If we do not get the lock (the add fails), we stay in a loop,
sleeping for 200ms and re-checking whether the page has appeared in
the cache. We keep checking until a timeout of 5 seconds (25
iterations).
6. The attendee list and comments/rating fragments of this page are
cached in the same manner. Those sections will be re-generated
while holding a lock object. They will be regenerated if the
fragment is not in the cache, and on or after updating of those
fragments (i.e. somebody makes a comment or signs up to attend this
event).
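
A minimal sketch of steps 1 to 5 for the page itself follows. As
before, FakeMemcached is a hypothetical in-memory stand-in rather
than a real client, and the key names and function name are
illustrative. Two points are assumptions because the writeup does not
specify them: how the timeout is extended (the stand-in exposes a
touch call; a client without one would have to re-set the entry) and
what a waiter does once the 5 seconds run out (here it renders the
page without caching it).

```python
import time

class FakeMemcached:
    """Hypothetical in-memory stand-in for a memcached client."""
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def _live(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        if self._clock() >= entry[1]:
            del self._data[key]  # entry has timed out
            return None
        return entry

    def get(self, key):
        entry = self._live(key)
        return entry[0] if entry else None

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)

    def add(self, key, value, ttl):
        # Succeeds only if the key is absent
        if self._live(key) is not None:
            return False
        self.set(key, value, ttl)
        return True

    def touch(self, key, ttl):
        # Extend the timeout of a live entry without changing its value
        entry = self._live(key)
        if entry is None:
            return False
        self.set(key, entry[0], ttl)
        return True

    def delete(self, key):
        self._data.pop(key, None)

PAGE_TTL = 30        # seconds (step 1)
LOCK_TIMEOUT = 20    # seconds (step 3)
POLL_INTERVAL = 0.2  # seconds (step 5)
MAX_POLLS = 25       # 25 * 200ms = 5 seconds (step 5)

def fetch_event_page(cache, event_id, render, sleep=time.sleep):
    key = "event:%s" % event_id
    lock_key = key + ":lock"
    page = cache.get(key)
    if page is not None:
        cache.touch(key, PAGE_TTL)  # step 2: a hit renews the expiry
        return page
    if cache.add(lock_key, 1, LOCK_TIMEOUT):  # step 3: we hold the lock
        try:
            page = render(event_id)           # step 4
            cache.set(key, page, PAGE_TTL)
        finally:
            cache.delete(lock_key)
        return page
    for _ in range(MAX_POLLS):                # step 5: wait for the holder
        sleep(POLL_INTERVAL)
        page = cache.get(key)
        if page is not None:
            return page
    # Assumption: after the 5 second wait, render without caching.
    return render(event_id)
```

Passing sleep as a parameter is only there so the wait loop can be
exercised without real delays.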
Other Pages
At this point, none of the other pages and/or their fragments are
cached. Most of the other pages are accessed at low frequency with
the exception of the tag search page. The tag search page is the
next candidate for caching and pre-generation. The caching strategy
is still to be determined.
Page Caches with Ruby on Rails
Ruby on Rails does not natively use memcached for whole page
caches, although it can do so for page fragments. For whole pages it
will instead generate static pages as files, and the request will be
routed to the corresponding file that represents a fully rendered
page.
The Ruby on Rails implementation of Web20Kit will use the native
Rails mechanism for full page caches. Expirations result in a call
to remove the file and follow the same expiry policy defined for
each page, above. The file must be removed as the page cache
expires, either by a request arriving after expiry, or by a
background job.
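
The file removal could look roughly like the sketch below, whether it
is called from a request handler or from a background job. It is
written in Python purely for illustration and is not the Rails
mechanism itself; the function name, the use of the file's
modification time as the generation time, and the max_age parameter
are all assumptions.

```python
import os
import time

def expire_page_file(path, max_age, now=None):
    """Remove the cached page file at `path` if it is at least
    `max_age` seconds old. Returns True if this call removed it."""
    now = time.time() if now is None else now
    try:
        generated_at = os.path.getmtime(path)
    except FileNotFoundError:
        return False  # nothing cached, nothing to expire
    if now - generated_at < max_age:
        return False  # still within its expiry period
    try:
        os.remove(path)
    except FileNotFoundError:
        return False  # another request or job removed it first
    return True
```

With the policies above, the home page file would be checked with a
max_age of 120 and an event detail page file with a max_age of 30.
Once the file is gone, the next request falls through to the
application, which renders the page again.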
Cheers,
- Will Sobel