RE: Caching Queries

Tim Stadinski Fri, 02 Nov 2001 20:33:52 -0800

There are several caching techniques, some harder to implement than others.
When deciding whether or not you need to cache, always keep in mind that
three fundamental pieces make up the time it takes for a web user's page
request to finish downloading to their browser:


1.      The time the web system takes to serve the page to completion
2.      The size of the page
3.      The time the data takes to travel over the internet

(You can't control how fast the user is connected to the net, but you
certainly can control the speed and size of the page being served.)

Page Speed vs Page Size

Page Speed: pages that retrieve data and display it to the user should load
in 100ms or faster under normal circumstances.  Pages that update the
database, write files to the file system, or have unusual requirements may
take longer, but normal pages in your site should take 50-100ms or even
less.  Some may think that tweaking for the utmost efficiency is overkill,
that there isn't much of a difference between 75ms and 150ms.  In many
low-load situations that's right.  But if you expect your system to get any
degree of traffic, every millisecond counts.  The bottlenecks in your system
will only worsen with load, and your system will grow slower and slower with
increased load.  Another way of looking at 75ms vs. 150ms is that the faster
page takes half the time, so the same hardware can support about twice as
many users.
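
As a back-of-the-envelope check of that claim, here is a rough sketch in
Python.  It assumes each request ties up one worker thread for the whole
page time; the worker count of 10 is an arbitrary illustrative figure, not a
ColdFusion setting.

```python
def max_requests_per_second(page_time_ms: float, workers: int = 10) -> float:
    """Upper bound on throughput when each worker serves one page at a time."""
    return workers * 1000.0 / page_time_ms


fast = max_requests_per_second(75)    # pages served in 75 ms
slow = max_requests_per_second(150)   # pages served in 150 ms
ratio = fast / slow

print(f"75 ms: {fast:.1f} req/s, 150 ms: {slow:.1f} req/s, ratio: {ratio:.1f}x")
```

Real systems rarely scale this cleanly, but the ratio shows why halving page
time roughly doubles capacity.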

Page Size: pages should be less than 30 KB of data and graphics unless you
have some overriding circumstance (e.g. Java applet, ActiveX object,
important graphics).  Don't bloat your page with large graphics that slow it
down.

A robust web application needs to control how data is loaded from the
database and cached on the web application server.  The best possible place
to serve information from is ColdFusion's memory.  Remember this fact, as
most improvements in speed follow from it.  If you know that the data served
on the last page hit is the same as the data served on this hit, you don't
need to go across the network to the database server again for the same
information.  This can be done with query caching, cached structures, or a
hybrid approach.

Your overall ColdFusion system should be built to scale out, not scale up.
This means you should be prepared to respond to more load by adding more
machines, not by adding more processors and memory to the existing machines.
Many low-end servers can be more reliable, faster, and cheaper than a few
ultra-powerful Sun or Unix machines.  To build a system that scales out, the
goal again is to serve as much information as possible from the machine's
memory, taking load off the database and speeding up the whole system.  Some
techniques for caching data in the web server's memory are described below.

QUERY CACHING
Query caching means you tell ColdFusion that the result of a previously
submitted query is still good, so you don't need to go back to the database
again.  You can reuse a result that was cached within a given time span
(cachedwithin), or any result cached after a given date (cachedafter).  You
simply add one of these attributes to a CFQUERY tag to get this behavior.  Daryl
Banttari from Allaire has a good article
(http://www.allaire.com/handlers/index.cfm?ID=17552) that explains how to
cache queries and the performance gains you can expect.
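
To make the idea concrete, here is a minimal sketch of time-bounded caching
in the spirit of cachedwithin.  It is written in Python rather than CFML and
is not how ColdFusion implements the attribute; `TimedQueryCache`,
`run_query` and `fake_db` are hypothetical names used only for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TimedQueryCache:
    """Keeps query results in memory and reuses them while they are fresh."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # sql -> (cached_at, rows)

    def get(self, sql: str, run_query: Callable[[str], Any],
            cached_within: float) -> Any:
        """Return the cached rows if they are at most cached_within seconds old;
        otherwise call run_query (the database round trip) and cache its result."""
        now = self._clock()
        entry = self._entries.get(sql)
        if entry is not None and now - entry[0] <= cached_within:
            return entry[1]
        rows = run_query(sql)
        self._entries[sql] = (now, rows)
        return rows


# Demo with a fake clock and a fake database so the behavior is repeatable.
now = [0.0]
db_calls = []

def fake_db(sql: str):
    db_calls.append(sql)
    return [("widget", 3)]

cache = TimedQueryCache(clock=lambda: now[0])
sql = "SELECT name, qty FROM items"
cache.get(sql, fake_db, cached_within=60)   # miss: goes to the database
now[0] = 30.0
cache.get(sql, fake_db, cached_within=60)   # hit: served from memory
now[0] = 120.0
cache.get(sql, fake_db, cached_within=60)   # expired: goes to the database again
print(len(db_calls))
```

Three page hits cost only two database round trips here; the longer the
time span you can tolerate, the more hits are served from memory.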

One of the major problems with query caching shows up in a multi-server
(clustered) web server environment without sticky sessions turned on.  In
that setup, a problem arises if the user:
1.      logs on to web server A, which caches their preferences
2.      gets bounced to web server B, which also caches their data
3.      changes something while on server B
4.      gets bounced back to server A.  Their data is now out of date!

The only way to fix this is with client variables, or with a clever system
of cookies that tells each server when the user last updated their
preferences.
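
The four steps above, and the cookie workaround, can be simulated with a
short sketch.  This is Python, not CFML, and it assumes the cookie carries a
simple version counter that is bumped on every update (a timestamp would
work the same way).  `WebServer`, `load_prefs` and `save_prefs` are
hypothetical names.

```python
from typing import Dict, Tuple


class WebServer:
    """One node in the cluster with its own in-memory cache of user preferences."""

    def __init__(self, database: Dict[str, Tuple[int, dict]]) -> None:
        self.database = database                       # shared: user -> (version, prefs)
        self.cache: Dict[str, Tuple[int, dict]] = {}   # local to this server

    def load_prefs(self, user: str, cookie_version: int) -> dict:
        """Serve from local memory unless the cookie shows a newer update."""
        cached = self.cache.get(user)
        if cached is None or cached[0] < cookie_version:
            cached = self.database[user]               # go back to the database
            self.cache[user] = cached
        return cached[1]

    def save_prefs(self, user: str, prefs: dict) -> int:
        """Write to the database and the local cache; return the new cookie value."""
        version = self.database[user][0] + 1
        self.database[user] = (version, dict(prefs))
        self.cache[user] = self.database[user]
        return version


database = {"tim": (1, {"theme": "blue"})}
server_a, server_b = WebServer(database), WebServer(database)

cookie = 1
server_a.load_prefs("tim", cookie)                       # step 1: A caches version 1
server_b.load_prefs("tim", cookie)                       # step 2: B caches version 1
cookie = server_b.save_prefs("tim", {"theme": "green"})  # step 3: update on B
prefs = server_a.load_prefs("tim", cookie)               # step 4: cookie is newer than A's copy
print(prefs["theme"])
```

Without the cookie check, server A would keep serving its stale copy in
step 4; with it, A notices the newer version and reloads from the database.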

STRUCTURES
Structures are a way to load information into the web server's memory, in
the server or application scope.
Upon startup, you initialize structures that persist for the life of the
server, or until the application variable timeout expires.  You write code
to update the structures and to query them for data as needed.
When users log on, you load their information as structure keys into the
master structure.  Every time they update something, you update both the
structure and the database.
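
As a rough illustration of that load-on-logon, update-both pattern, here is
a small sketch in Python rather than CFML.  A plain dictionary stands in for
the database, and `UserStore` and its method names are hypothetical.

```python
from typing import Any, Dict


class UserStore:
    """Write-through cache: a master structure in memory backed by a database."""

    def __init__(self, database: Dict[str, Dict[str, Any]]) -> None:
        self.database = database                     # stand-in for the real database
        self.master: Dict[str, Dict[str, Any]] = {}  # plays the application-scope structure

    def log_on(self, user: str) -> None:
        """Copy the user's row into the master structure at logon."""
        self.master[user] = dict(self.database[user])

    def get(self, user: str, key: str) -> Any:
        """Reads are served from memory and never touch the database."""
        return self.master[user][key]

    def update(self, user: str, key: str, value: Any) -> None:
        """Every update goes to both the structure and the database."""
        self.master[user][key] = value
        self.database[user][key] = value


database = {"tim": {"theme": "blue", "page_size": 25}}
store = UserStore(database)
store.log_on("tim")
store.update("tim", "theme", "green")
print(store.get("tim", "theme"), database["tim"]["theme"])
```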

The problem with this approach is, again, that the user may be able to
update the information.  If they can, and you have a multi-server
environment, you have to make sure the data is up to date across all
servers, which would require a replication system.  Replication isn't a good
idea, as it requires more server page views and diminishes the benefit of
caching for scalability.  Also, information that's organized well in a
recordset doesn't translate well into structures.  Structure keys are stored
in no particular order; you can't rely on their order the way you can with a
query and an ORDER BY clause.

HYBRID APPROACH
The basic premise is that you have data cached in the server's memory, and
you keep checking that it's in sync with the database.  If it is current,
you use it.  If it's out of date, you refresh it and keep going.
What you do is take all the queries you need for a user or other object, and
cache the queries themselves by setting them as variables in structure keys.
This is just like cached queries, in that you are storing a query type of
variable.  It doesn't actually use "cached queries" in the traditional
sense, however; you are manually caching them yourself.  This lets you avoid
the sorting problem introduced with the structures approach.
The hybrid approach also avoids the issue of running into the cached query
limit for that server, as there aren't any real cached queries.
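
A minimal sketch of the idea follows, in Python rather than CFML.  It
assumes the freshness check is a cheap per-owner version lookup; that
mechanism is an assumption for illustration, not something described above.
`HybridCache`, `run_query` and `get_version` are hypothetical names.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Tuple[Any, ...]


class HybridCache:
    """Stores whole query results under structure keys and refreshes stale ones."""

    def __init__(self, run_query: Callable[[str], List[Row]],
                 get_version: Callable[[str], int]) -> None:
        self._run_query = run_query       # the real database round trip
        self._get_version = get_version   # assumed cheap "has it changed?" check
        self._store: Dict[str, Dict[str, Any]] = {}  # owner -> {"version", "queries"}

    def query(self, owner: str, name: str, sql: str) -> List[Row]:
        current = self._get_version(owner)
        slot = self._store.get(owner)
        if slot is None or slot["version"] != current:
            # Out of date (or never loaded): drop everything cached for this owner.
            slot = {"version": current, "queries": {}}
            self._store[owner] = slot
        if name not in slot["queries"]:
            slot["queries"][name] = self._run_query(sql)
        return slot["queries"][name]


versions = {"tim": 1}
db_calls = []

def fake_query(sql: str) -> List[Row]:
    db_calls.append(sql)
    return [("apple", 1), ("banana", 2)]   # rows arrive already sorted by ORDER BY

cache = HybridCache(fake_query, lambda owner: versions[owner])
sql = "SELECT name, qty FROM items WHERE owner = 'tim' ORDER BY name"
cache.query("tim", "items", sql)          # first hit: runs the query
cache.query("tim", "items", sql)          # current: served from memory
versions["tim"] = 2                       # the user changed something
rows = cache.query("tim", "items", sql)   # stale: refreshed from the database
print(len(db_calls), rows[0][0])
```

Because each structure key holds a complete result set, the rows keep the
order the database returned them in, which sidesteps the key-ordering
problem of the plain structures approach.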

I could go into more detail on the hybrid approach, but that takes more time,
and the approach is not easy to implement.  If you are really interested,
email me personally ([EMAIL PROTECTED]).  In summary, I hope this helps you
think about a few techniques for caching queries and why you would need
them.  The right choice depends entirely on the complexity and needs of your
application.

Cheers,

Timothy Stadinski
Senior Software Engineer
Afternic.com
[EMAIL PROTECTED]

-----Original Message-----
From: Roel [mailto:[EMAIL PROTECTED]]
Sent: Friday, November 02, 2001 11:02 AM
To: CF-Talk
Subject: Re: Caching Queries


I've heard of someone who looks at the execution time of a query; when it
exceeds a certain threshold, the query is cached (at a busy time or
something like that). It's a way to cache only when you need it.


----- Original Message -----
From: "Carlisle, Eric" <[EMAIL PROTECTED]>
To: "CF-Talk" <[EMAIL PROTECTED]>
Sent: Thursday, November 01, 2001 9:03 PM
Subject: Caching Queries


> An application I am working with is really bogging down a database server.
> Can anybody point out an online resource to the pros/cons of caching
> queries?  Maybe I'm making this more than it is and it's not that
> complicated.  I'm just wondering if caching queries is an eventuality for
> web applications that get a lot of traffic.
>
> Thanks,
>
> 
