Have you thought about using a web cache system like Coral?

http://www.scs.cs.nyu.edu/coral/

If you set the no-cache HTTP header to true, it caches pages for 5 
minutes before requesting a new copy. With a bit of experimentation it 
should be possible to identify the user agent their bot uses and set the 
no-cache header only for that.
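For example, something like this in Application.cfm (just a sketch - 
"CoralWebPrx" is my guess at their bot's user-agent string, so check your 
logs first):

```cfml
<!--- Send no-cache only to Coral's crawler so its copies stay fresh.
      "CoralWebPrx" is a guess at the user agent; verify against your logs. --->
<cfif CGI.HTTP_USER_AGENT CONTAINS "CoralWebPrx">
    <cfheader name="Cache-Control" value="no-cache">
</cfif>
```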

You'd probably have to do a bit of testing to make sure searches hit 
your site directly rather than the Coral cache, but it might be worth a try.

Spike


Jim Davis wrote:
> I've a site that while not slow, is most likely not going to take the traffic 
> it's facing.  The site is www.firstnight.org - it will get absolutely pounded 
> on New Year's Eve.
> 
> Since this will be the warmest New Year's Eve in recent memory AND is a 
> weekend AND is a holiday for nearly everybody now, I fully expect this to be 
> one of the busiest years ever.
> 
> Unfortunately we're still a non-profit.  We can't afford the kind of iron 
> that this kind of traffic would require for only one or two days a year. 
> Right now we're on a shared hosting plan at CrystalTech which we nearly 
> overran last year.
> 
> I'm just going through and looking for savings here.
> 
> 1) An average page runs anywhere from 80-300 ticks.  I plan to address some 
> of that by caching the navigation HTML (right now it's dynamically generated 
> from a cached CFC).
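
That nav caching should help. One cheap way to do it - a sketch, with 
"nav.cfm" standing in for whatever actually renders your navigation:

```cfml
<!--- Sketch: render the nav HTML once, then serve it from the
      application scope.  "nav.cfm" is a made-up template name. --->
<cfif NOT StructKeyExists(application, "navHTML")>
    <cflock scope="application" type="exclusive" timeout="10">
        <cfsavecontent variable="application.navHTML">
            <cfinclude template="nav.cfm">
        </cfsavecontent>
    </cflock>
</cfif>
<cfoutput>#application.navHTML#</cfoutput>
```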
> 
> What do you think?  Too high?  Way too high?  Way, way too high?
> 
> Many of the pages are quite large and complex (for example the "All Events" 
> list here:  
> http://www.firstnight.org/Content/NewYears/Artists/Explore/Events.cfm?Type=All
>  ) but it's exactly those pages that are the most popular.
> 
> As an aside, you can see the current number of active sessions and the 
> current page's tick count at the bottom of any page.
> 
> 2) My session manager is worrying me.  The site doesn't use CF's built-in 
> session management.  This allows me to capture user information at the end of 
> a session, but means that I have to manually check and destroy sessions.  
> When a session ends it's saved in a database along with the pages viewed 
> during the visit, information about the user agent and several other things.
> 
> This process requires several database calls (perhaps a minimum of 8, but a 
> maximum determined by the number of pages visited) and averages in the range 
> of 40-80 ticks per session cleaned.
> 
> That would be fine, except I may be cleaning several thousand sessions at a 
> shot on the 31st.  The system is SQL Server and I've optimized it about as 
> much as I know how (there are indexes on the major columns, I've cleaned out 
> all unneeded data, etc).
> 
> Any thoughts on using multiple CFQUERY statements vs. one more complex SQL 
> call?  Right now, for example, I make a call to the DB to see if the session 
> already exists; if it does I do an update, if not an insert.
> 
> Could it actually be faster to do an IF statement in the SQL using only one 
> CFQUERY tag?  It seems to me that with "maintain connections" on this 
> wouldn't make a difference... but I'm not sure (and want to use the time I've 
> left wisely).
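
On SQL Server you can fold that into one batch - something along these 
lines (table and column names made up, obviously):

```cfml
<!--- Sketch: one round trip instead of select-then-update/insert.
      "Sessions", "SessionID" and "LastHit" are invented names. --->
<cfquery datasource="#request.dsn#">
    IF EXISTS (SELECT 1 FROM Sessions
               WHERE SessionID = <cfqueryparam value="#sessionID#" cfsqltype="cf_sql_varchar">)
        UPDATE Sessions
           SET LastHit = GETDATE()
         WHERE SessionID = <cfqueryparam value="#sessionID#" cfsqltype="cf_sql_varchar">
    ELSE
        INSERT INTO Sessions (SessionID, LastHit)
        VALUES (<cfqueryparam value="#sessionID#" cfsqltype="cf_sql_varchar">, GETDATE())
</cfquery>
```

Even with "maintain connections" on, each CFQUERY still costs a round trip 
to the database, so collapsing two or three calls into one batch may shave 
a few of those ticks - worth benchmarking rather than taking my word for it.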
> 
> I am using CFQUERYPARAM and caching what queries make sense.
> 
> 
> Some other thoughts:
> 
> I've actually considered placing some of the site on another CrystalTech 
> account (on a different server of course) and using redirects to move the 
> traffic off.  Of course there's no way I could get the URLs to stay the same 
> (outside of frames) and I wouldn't want it there all the time - just for a 
> day or two.
> 
> It would also royally screw up my log statistics.
> 
> Any other ideas for kludgy, cheap load balancing?
> 
> I can easily turn off the end-of-session handlers.  I know that this process 
> will take a while, but I'm not sure if it's really the performance hog that I 
> fear.  It is, after all, spending nearly all of it's time waiting for the 
> database - sitting effectly idle.  So what if the clean up process takes two 
> minutes if the thread isn't dominating the CPU?  (I will also decrease the 
> number of clean-ups to one every 15 minutes or so in an attempt to clear out 
> old sessions as quickly as possible.)
> 
> Many (sometimes VERY many) of the sessions in memory are generated from bots. 
>  I'm considering creating a ROBOTS.TXT that would prevent bots from indexing 
> the site for our busy time, but I fear that would inhibit them more than I 
> want - if you tell robots to bugger off, do they come back?
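
The blanket version of that robots.txt would just be:

```
User-agent: *
Disallow: /
```

Well-behaved crawlers re-fetch robots.txt regularly (typically about once a 
day), so they should pick the site back up once you restore the old file - 
though I can't promise every bot behaves, and a long block could cost you 
some search placement.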
> 
> I'm open to any other ideas.  My gut says that with the resources we have 
> we'll just have to live with overloads unless they want to create a much 
> simpler site (and they don't want to do that).
> 
> Anybody got some heavy iron and bandwidth they're willing to donate for three 
> days a year?  ;^)
> 
> Sorry for the babbling - I'm entering my normal end-of-year paranoid phase.
> 
> Jim Davis
> 
> 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
Special thanks to the CF Community Suite Silver Sponsor - CFDynamics
http://www.cfdynamics.com

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:188890
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: 
http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=11502.10531.4
Donations & Support: http://www.houseoffusion.com/tiny.cfm/54
