Most things seem to be covered except for caching, so I will share my
witless drivel about caching.
First you need to decide how dirty your data can get. A realtime stock
quote system probably can't live with much dirt, whereas blog comments
probably don't need to be instant (well, at least the ones I write
probably don't need to be read at all). The second problem is that "the
web" is stateless, so you can't push updates down a socket to the
browser. Thanks to TG's widgets and awesome AJAX support this can be
minimized (and there is much rejoicing, thanks!).

It doesn't matter how you "architect" the system, writes end up in one
place. Sure, you can do replication, but there is a world of setup,
configuration, schema changes, transactions and data synchronization, oh
my! You are still only writing in one place, plus replication is just a
linear optimization; caching can get you much bigger, sometimes
order-of-magnitude wins without all the lions, tigers and bears. Cache
only after you have done some happy hacking to make things faster. My
experience from a couple of years in EJB (oh the humanity!) was cache it
and forget it. Most of the performance issues with EJB have to do with
locking data so there is no dirty data, plus the threading architecture
is... Sorry, I digress often. Just cache as little as possible.

In the big project I work on we cache at a couple of different levels.
The highest level is caching the whole page, thus skipping the template
engine, controller code and database lookups. In this project we use
memcached because it is very flexible about how we set up the caching
system. No matter what, you should create a wrapper around your caching
implementation so you can move from one type of cache to another without
changing any controller code; see the sketch below. memcached has client
libraries for a number of different languages, which is nice for
integration.
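
Here is a minimal sketch of the kind of wrapper I mean, assuming the
python-memcached client; the class, function and key names are just made
up for illustration:

    import memcache

    class CacheWrapper(object):
        """Thin facade so controller code never talks to memcached
        directly. Moving to another backend (or a plain dict for tests)
        only means replacing this class, not touching any controllers.
        """

        def __init__(self, servers=('127.0.0.1:11211',), default_ttl=300):
            self._mc = memcache.Client(list(servers))
            self._default_ttl = default_ttl

        def get(self, key):
            return self._mc.get(key)

        def set(self, key, value, ttl=None):
            # time=0 means "never expire" to memcached, so always pass a TTL
            self._mc.set(key, value, time=ttl or self._default_ttl)

        def invalidate(self, key):
            self._mc.delete(key)

    cache = CacheWrapper()

    def cached_front_page(render):
        """Page-level caching: on a hit we skip the template engine,
        controller code and database entirely."""
        html = cache.get('page:front')
        if html is None:
            html = render()  # the expensive path
            cache.set('page:front', html, ttl=60)
        return html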

There are many ways to invalidate cached data. The easiest is to just
set an expiration time. Since we are happy PostgreSQL users we use a
trigger system (sure, features like triggers make the database slower,
but I want the entire system to run faster, not just the database).
There are two ways we could do this. The first is to write a Python
function that gets called when an update or delete happens on cached
data; the Python function in our case invalidates all the caches that
held that data. The second way is to have a trigger insert records into
an invalidation table, and every n seconds an external process consumes
the records by invalidating your caches (which, by the way, is basically
how Slony works for replication). A sketch of the second approach
follows.
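
Here is a rough sketch of that trigger-plus-table approach, assuming
psycopg2 and the wrapper above; the table, trigger and key names are all
invented for the example:

    # The database side, installed once (PL/pgSQL):
    #
    #   CREATE TABLE cache_invalidation (
    #       id        serial PRIMARY KEY,
    #       cache_key text NOT NULL
    #   );
    #
    #   CREATE FUNCTION queue_invalidation() RETURNS trigger AS $$
    #   BEGIN
    #       INSERT INTO cache_invalidation (cache_key)
    #           VALUES ('comment:' || OLD.id);
    #       RETURN OLD;
    #   END;
    #   $$ LANGUAGE plpgsql;
    #
    #   CREATE TRIGGER comments_invalidate
    #       AFTER UPDATE OR DELETE ON comments
    #       FOR EACH ROW EXECUTE PROCEDURE queue_invalidation();

    import time
    import psycopg2

    def consume_invalidations(cache, dsn, interval=5):
        """Every `interval` seconds drain the invalidation table and
        delete the matching cache entries (the Slony-ish approach)."""
        conn = psycopg2.connect(dsn)
        while True:
            cur = conn.cursor()
            cur.execute("SELECT id, cache_key FROM cache_invalidation")
            rows = cur.fetchall()
            for row_id, key in rows:
                cache.invalidate(key)
            if rows:
                # psycopg2 adapts a Python list to a Postgres array
                cur.execute(
                    "DELETE FROM cache_invalidation WHERE id = ANY(%s)",
                    ([r[0] for r in rows],))
            conn.commit()
            time.sleep(interval)

The first approach is the same idea, just with the cache deletes done
straight from a PL/Python trigger function instead of a polling loop.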

 * Optimize code and queries first
 * Keep an abstraction layer or two between your code and the cache
implementation
 * Application servers scale easy as pie; database servers scale like
hernias
 * TurboGears rules!

Good luck

