> I'd be pretty surprised if this caching scheme is faster than running
> queries into MySQL, that sucker is BRUTALLY fast! If you build derived
> tables and use those as caches (so you basically are just doing
> "SELECT * FROM table") it will almost certainly exceed the speed at
> which you can deserialize the data with Storable. For these sorts of
> queries, where you are only reading and doing no or only simple
> indexing, it's generally about 100 times faster than Oracle, and 10 to
> 50 times faster than Postgres. This is especially true with large
> tables. I have a customer with 2 tables of close to a terabyte of data
> in them and upwards of 100 million rows. SELECT times into these
> tables on a dual-processor PIV Xeon with 2 gigs of RAM and a Ciprico
> RAID 5 box are routinely under 1 second.

We do use MySQL. However, for our classifieds searches, where we sort the
results by distance, the original unoptimised query took on the order of
minutes (yes, minutes, even with the right indexes). The problem is
ordering data by distance.
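
To give a feel for why, the naive version is something like this (table
and column names are invented for illustration, and the real queries are
obviously a lot bigger):

    use DBI;

    # Connection details made up for illustration.
    my $dbh = DBI->connect('dbi:mysql:classifieds', $db_user, $db_pass,
                           { RaiseError => 1 });

    # Naive version: compute the great-circle distance for EVERY advert,
    # then sort on it.  MySQL can't use an index for an ORDER BY on a
    # computed expression, so this is a full scan plus a filesort.
    my $sth = $dbh->prepare(q{
        SELECT id,
               6371 * ACOS( COS(RADIANS(?)) * COS(RADIANS(lat))
                              * COS(RADIANS(lng) - RADIANS(?))
                          + SIN(RADIANS(?)) * SIN(RADIANS(lat)) ) AS dist_km
        FROM   adverts
        ORDER  BY dist_km
        LIMIT  120
    });
    $sth->execute($my_lat, $my_lng, $my_lat);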

So, by using some clever data partitioning algorithms and only looking
at data close to you, we're able to get that down to a few seconds.
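
Hugely simplified, the gist of "only look at data close to you" is a
bounding-box prefilter that the indexes can actually satisfy, something
along these lines (again, names and the box size are just illustrative):

    # Restrict to a lat/lng box around the user first (which an index on
    # (lat, lng) can use), then compute and sort the exact distance only
    # for those candidate rows.
    my $sth = $dbh->prepare(q{
        SELECT id,
               6371 * ACOS( COS(RADIANS(?)) * COS(RADIANS(lat))
                              * COS(RADIANS(lng) - RADIANS(?))
                          + SIN(RADIANS(?)) * SIN(RADIANS(lat)) ) AS dist_km
        FROM   adverts
        WHERE  lat BETWEEN ? AND ?
          AND  lng BETWEEN ? AND ?
        ORDER  BY dist_km
        LIMIT  120
    });

    my $box = 0.5;    # rough search radius in degrees, tune to taste
    $sth->execute($my_lat, $my_lng, $my_lat,
                  $my_lat - $box, $my_lat + $box,
                  $my_lng - $box, $my_lng + $box);
    my $id_dist_pairs = $sth->fetchall_arrayref;   # [ [id, dist_km], ... ]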

However, on television users expect fast response times and go away
quickly if you're not quick enough, so every ounce of performance is
important.

So, we work out the distances to, say, your first 120 results, cache the
IDs and distances, then window into that cached result set and pull back
the data 20 items at a time, which we display on screen as 5 decks of 4
items.
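
Stripped right down, the cache-and-window step looks something like this
(the file path and key scheme here are made up; the real thing lives
behind our session handling):

    use Storable qw(nstore retrieve);

    # Store the 120 [id, distance] pairs from the heavy query, once per search.
    my $cache_file = "/tmp/search-$search_key.sto";
    nstore($id_dist_pairs, $cache_file) unless -e $cache_file;

    # Every page view after that just windows into the cached list:
    # 20 adverts at a time, shown as 5 decks of 4.
    my $pairs  = retrieve($cache_file);
    my @window = grep { defined } @{$pairs}[ $offset .. $offset + 19 ];

    # The ids come from our own cache, so interpolating them is safe enough.
    my $ids  = join ',', map { $_->[0] } @window;
    my $rows = $dbh->selectall_arrayref("SELECT * FROM adverts WHERE id IN ($ids)");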

Users on our service typically look at 30 pages before getting bored.
That's a long-term average.

We only have to do a heavy query every 120 adverts, and a simple select
from table every 5 pages or 20 adverts.
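
(Roughly: an average session of 30 pages at 4 adverts a page is 120
adverts, which works out at one heavy distance query plus six cheap
windowed selects per user, instead of a heavy query on every page.)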

The performance could be described as very quick at this point.

The other reason why we cache like crazy is simple - MySQL's query queue
optimizer sucks shit.

If you have a heavily loaded MySQL server with, say, 100 busy
connections, and some of those connections are running simple queries
while others are running hard ones (some SQL queries can be over a page
long), then MySQL will choose a simple query over a hard query every
time. The hard queries get starved of resources and are pushed to the
back of the queue. Eventually you run out of mod_perl processes, as they
are all blocked waiting on the database for hard queries. At that point
you stop your performance bench-testing and your SQL database sits at
100% for the next 20 minutes while it clears the queue. Some queries
could have been waiting 20 minutes.

The solution is to reduce the number of queries by caching everything.
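
In practice that just means a read-through cache in front of DBI; the
shape of it is roughly this (Cache::FileCache is only one option, and the
five-minute expiry is arbitrary):

    use Cache::FileCache;

    my $cache = Cache::FileCache->new({ namespace          => 'queries',
                                        default_expires_in => 300 });

    # Only touch MySQL on a cache miss; repeated hits are just file reads,
    # so the hard queries stop piling up behind the simple ones.
    sub cached_select {
        my ($dbh, $sql, @bind) = @_;
        my $key  = join '|', $sql, @bind;
        my $rows = $cache->get($key);
        unless (defined $rows) {
            $rows = $dbh->selectall_arrayref($sql, undef, @bind);
            $cache->set($key, $rows);
        }
        return $rows;
    }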

OK, the above situation happens when you're hitting your webservers at
silly rates, but the nice man from Sky does have some serious testing
tools. I think we had 1,000 simultaneous users hitting us every 20
seconds at that point, on two servers. That's 25 requests per second per
box, off dynamically generated pages backed by an SQL database.

Those figures aren't using AxKit though; they were from our previous
mod_perl app. AxKit's figures are close to that, though.

Mike.
