On Jun 22, 2007, at 13:56 , Chris Miller wrote:

I see how storing database results in memcached would be very helpful, but how does memcached know when the result set in cache has changed?

        Long answer:

I wrote an app called diggwatch[0] that uses the digg API as my primary data store, and stores all the useful information in memcached locally. Cache misses for me are really expensive, and the digg API makes certain operations I want to perform somewhat difficult.

For example, the primary thing I wanted this app to do for me is tell me when anyone responds to any comment I make on a digg article. Basically, that looks like this:

        1) Ask for any recent comments by username.
        2) Ask for all of the stories to which any of these comments belong so I can put useful titles on things.
        3) Ask for any children comments of #1 (or children of the comments' parent as defined by the old system).

As this is primarily used (at least by me) as an RSS provider, that request occurs several times throughout the day and I'd like it to be cached. However, I'd *also* like it to be fresh, and I don't get notifications from digg.

I cache the result of #1 for about a minute -- a fairly insignificant amount of time, but I don't consider that request very expensive.

I cache the results from #2 for about five minutes. It's a single request for up to something like 100 stories, and I can optimize some of it out if I have some of the stories in my cache already.
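
The "optimize some of it out" part of #2 could look roughly like the sketch below. This is not the diggwatch code: TTLCache is a small in-memory stand-in for a memcached client, and get_stories, fetch_stories and the "story:<id>" key format are hypothetical names chosen for illustration.

```python
import time

STORY_TTL = 5 * 60  # "about five minutes", per the text above


class TTLCache:
    """In-memory stand-in for a memcached client (hypothetical, for illustration)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}

    def set(self, key, value, ttl):
        # Store the value together with its absolute expiry time.
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value


def get_stories(cache, story_ids, fetch_stories):
    """Return {story_id: story}, calling fetch_stories only for cache misses.

    fetch_stories is assumed to take a list of ids and return {id: story}
    from a single batched upstream request.
    """
    found, missing = {}, []
    for sid in story_ids:
        story = cache.get(f"story:{sid}")
        if story is None:
            missing.append(sid)
        else:
            found[sid] = story
    if missing:
        for sid, story in fetch_stories(missing).items():
            cache.set(f"story:{sid}", story, STORY_TTL)
            found[sid] = story
    return found
```

With a real memcached client, the per-id lookups would normally be collapsed into one multi-get; the loop here just keeps the sketch self-contained.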

#3 is the most expensive query, because I need to run it almost once per comment (result of #1). I cache these for about a day, *but* the key includes the number of comments on a given story (which I get in the result of #2). If that count hasn't changed, nobody has commented on the story since I cached it, so I can be sure nobody has replied in a thread I'm involved in within that story.
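
A minimal sketch of the count-in-the-key idea, under assumptions: DictCache stands in for a memcached client (it ignores expiry to stay short), and replies_key, get_replies, fetch_replies and the "replies:<comment>:<count>" key layout are hypothetical names, not anything taken from diggwatch.

```python
DAY = 24 * 60 * 60  # "about a day", per the text above


class DictCache:
    """Stand-in for a memcached client; ignores expiry to keep the sketch short."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, ttl):
        self._data[key] = value  # a real client would honor ttl

    def get(self, key):
        return self._data.get(key)


def replies_key(comment_id, story_comment_count):
    # The story's comment count is part of the key, so any new comment on
    # the story changes the key and the old entry is simply never read again.
    return f"replies:{comment_id}:{story_comment_count}"


def get_replies(cache, comment_id, story_comment_count, fetch_replies):
    """Return replies to comment_id, refetching only when the story's count changed."""
    key = replies_key(comment_id, story_comment_count)
    replies = cache.get(key)
    if replies is None:  # an empty list is a valid cached result
        replies = fetch_replies(comment_id)
        cache.set(key, replies, DAY)
    return replies
```

Nothing is ever explicitly invalidated here: stale entries are orphaned under their old keys and age out on their own, which is why a long expiry is safe.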

It's not perfect, but it's quite effective and greatly reduces the number of trips to digg without letting my latency rise above ~5 minutes.


        Short answer:

Depends on your application, but think of it less as working with result sets and more as working with objects. I cache collections of pre-built objects, and mash them together in my application code.

A neat benefit of doing things this way (going back to the long answer above) is that understanding my data at this level allows me to generate smarter ETags, such that the typical response sent to an RSS reader from my app is 0 bytes (after headers).
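
One way such an ETag could be built is sketched below. What actually goes into the tag is an assumption on my part (the username plus each story's comment count), and make_etag, respond and render_feed are hypothetical names used only for illustration.

```python
import hashlib


def make_etag(username, story_comment_counts):
    """Build an ETag from cheap, already-cached facts (assumed inputs).

    story_comment_counts maps story id -> comment count. If no count has
    changed, the feed for this user cannot have changed either.
    """
    parts = [username]
    parts += [f"{sid}:{count}" for sid, count in sorted(story_comment_counts.items())]
    digest = hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()
    return f'"{digest}"'  # ETag values are quoted strings


def respond(if_none_match, etag, render_feed):
    """Return (status, headers, body); skip rendering when the client is current."""
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""  # 0 bytes after headers
    return 200, {"ETag": etag}, render_feed()
```

The point of the design is that the tag is computed before any expensive work, so a reader that sends back the same tag in If-None-Match costs almost nothing to answer.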


[0] http://bleu.west.spy.net/diggwatch/

--
Dustin Sallings

