Sorry, meant to post the whole thing. :)

For the last few weeks I've been having some problems with House of Fusion. 
Memory use for JRun.exe has been going through the roof and I didn't know 
why. The code was tight and nothing had really changed on the site, so what was 
up? The answer was Yahoo. 
In the last 3 weeks Yahoo has ramped up their indexing of sites. For a site as 
large as House of Fusion, this can take quite a bit of time. I've logged 2-4 
Yahoo bot hits per second at times. 
So how was Yahoo the problem? Because of client variables. Not DB client 
variables and not even the dreaded registry client variables. Just simple 
cookie based client variables. It seems that when a client variable is set, a 
memory structure is also created in CF. Each bot hit is assumed to be its own 
session, as the bot does not accept cookies. This means each bot hit generates a 
memory structure of about 1 KB. That is not really a lot, but when you have a 
few tens of thousands of hits from bots a day, it adds up. 
I'm still waiting on word from Macromedia as to when a client memory structure 
times out, but this seems to be the issue. 
So what's the solution? There are 4.
1. Increase your RAM. If you can do this, then ramp up your memory as high as 
you can. This is not a perfect solution but it saves throwing time at the 
problem and gives you a 'buffer' against problems of this sort.
2. Set a robots.txt with a Crawl-delay setting. Mine is set to 1 second but you 
can set yours to something higher.
3. Set a different cfapplication for the most common bots. I use a simple 
regular expression to find keywords that only exist in bots:
<CFIF REFindNoCase('Slurp|Googlebot|BecomeBot|msnbot|Mediapartners-Google|ZyBorg|RufusBot|EMonitor', cgi.http_user_agent)>
    <CFAPPLICATION name="FusionA" clientmanagement="no" sessionmanagement="no"
        setclientcookies="no" setdomaincookies="no" clientstorage="Cookie">
<CFELSE>
    <CFAPPLICATION name="FusionA" clientmanagement="yes" sessionmanagement="no"
        setclientcookies="yes" setdomaincookies="no" clientstorage="Cookie">
</CFIF>
This will make sure that a client structure is NOT created for one of these 
bots.
4. Use the same regex to clean out the client structure after the bot finishes 
the page. Use StructClear(client) to remove the data in onRequestEnd.cfm, 
in the onRequestEnd method of Application.cfc, or in the template itself.
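
As a rough sketch of options 2 and 4 above (not the exact files used on House of Fusion): the robots.txt fragment below assumes the rule is aimed at Yahoo's crawler, whose user agent token is Slurp, and uses the 1 second delay mentioned in option 2. Crawl-delay is a non-standard directive, so not every bot will honor it.

```text
# robots.txt (sketch): ask Yahoo's crawler to wait 1 second between requests
User-agent: Slurp
Crawl-delay: 1
```

For option 4, a minimal onRequestEnd.cfm might look like the following. It reuses the bot regex from option 3 and assumes client management is still enabled for the request (that is, option 3 is not also in place), since otherwise there would be no client scope to clear.

```cfml
<!--- onRequestEnd.cfm (sketch): runs after each page in the application --->
<!--- Same bot pattern as in option 3; extend the list as needed --->
<cfif REFindNoCase('Slurp|Googlebot|BecomeBot|msnbot|Mediapartners-Google|ZyBorg|RufusBot|EMonitor', cgi.http_user_agent)>
    <!--- Assumption: the client scope exists because clientmanagement="yes" --->
    <cfset StructClear(client)>
</cfif>
```

Options 3 and 4 are alternatives: option 3 avoids creating the structure in the first place, while option 4 cleans it up after the page has run.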
Bottom line is that while bots are great for indexing your content, they can 
cause havoc on your system when a lot of memory is assigned to what is 
essentially a 'dead session'. 

http://www.blogoffusion.com/index.cfm/2005/11/28/pseudomemory-leak

