Benjamin Lees <[email protected]> wrote:
>> Is there an actual problem you're trying to solve here? Is there any indication that spam bots are affecting your site's performance? If not, worrying about this is probably a waste of your time.

Spambots driving up CPU usage is a known issue:
https://www.google.com/search?q=spam+bots+cpu&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a#hl=en&safe=off&client=firefox-a&hs=BbI&rls=org.mozilla:en-US%3Aofficial&sclient=psy-ab&q=spambots+cpu&oq=spambots+cpu&gs_l=serp.12...0.0.0.103881.0.0.0.0.0.0.0.0..0.0...0.0...1c..7.psy-ab.qbdbh4AqgLU&pbx=1&bav=on.2,or.r_qf.&bvm=bv.44442042,d.dmg&fp=742c113135fff5b3&biw=1920&bih=832

Is it a problem? Yes, they're constantly trying to break in and that increases CPU usage. I don't have any analysis to prove it, but I've often seen CPU usage go very high while traffic (according to Google Analytics) stayed normal. I came from a shared server where I was actually asked to leave because of the CPU usage. I've had big problems with CPU. I'm on a VPS now and it can still be a problem. Average CPU usage recently went up from around 20% to 160% for a few hours (the server is multi-core, which is presumably why the figure can go over 100%), while Google Analytics showed no change.

Whether this is a malicious/DDoS bot or an advertising bot, it is something that needs to be studied and dealt with. If I stay at 200% CPU usage on the VPS I may be asked to leave the server. So yes, I have to keep a watch on CPU and explore all possible options to keep the usage down. I'm using caching and nginx (earlier suggestions by people on this list).

As to how to prevent genuine viewers from being blocked, that's problem #2 and it's something that can be improved. I'll try this, suggested by Henny:
http://danielwebb.us/software/bot-trap/

Anne wrote:
>> +1 - there is one well-known blog site that uses capture, and I've tried as many as 10 times on a single submission, only to give up because I simply couldn't get the captcha right. Now I don't even try to comment there.

They need a better captcha. But yes, you guys have reminded me that whatever method is used, I need to make sure genuine visitors are not affected.
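
To make the temporary-ban idea from my earlier message (quoted at the bottom of this mail) a bit more concrete, here is a rough Python sketch. It is only an illustration under assumptions, not something MediaWiki or any extension provides: the class name CaptchaFailureTracker, the threshold of 10 failures, and the in-memory bookkeeping are all made up for the example, and a real setup would still have to hand the banned addresses to the firewall or htaccess.

```python
import time
from collections import defaultdict, deque


class CaptchaFailureTracker:
    """Hypothetical helper: counts captcha failures per IP in a sliding window."""

    def __init__(self, max_failures=10, window_seconds=300,
                 ban_seconds=24 * 3600, clock=time.time):
        # All defaults are assumptions: 10 failures in 5 minutes -> 24 hour ban.
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.ban_seconds = ban_seconds
        self._clock = clock
        self._failures = defaultdict(deque)  # ip -> timestamps of recent failures
        self._banned_until = {}              # ip -> time at which the ban expires

    def record_failure(self, ip):
        """Register one failed captcha; return True if the IP is now banned."""
        now = self._clock()
        recent = self._failures[ip]
        recent.append(now)
        # Forget failures that have fallen out of the sliding window.
        while recent and recent[0] <= now - self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_failures:
            self._banned_until[ip] = now + self.ban_seconds
            recent.clear()
            return True
        return False

    def is_banned(self, ip):
        """Return True while the temporary ban for this IP is still active."""
        until = self._banned_until.get(ip)
        if until is None:
            return False
        if self._clock() >= until:
            del self._banned_until[ip]  # ban expired, forget it
            return False
        return True
```

The idea would be to call record_failure() whenever a captcha is answered wrongly and to check is_banned() before serving an edit or login form. Keeping the threshold generous and the ban limited to form submissions rather than page views is one way to reduce the risk of locking out genuine visitors.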

The link from Henny might take care of that false-positive problem, since the 'hidden' link is never seen by humans.

Thanks,
Dan

On Fri, Mar 29, 2013 at 1:29 PM, Benjamin Lees <[email protected]> wrote:
> On Thu, Mar 28, 2013 at 11:32 PM, Dan Fisher <[email protected]> wrote:
>
> > Here's one idea: If a certain IP address fails the captchas a specified
> > number of times in 5 minutes or so, it should be banned temporarily for
> > say, 24 hours (through htaccess or firewall etc).
>
> Humans regularly get CAPTCHAs wrong, and they often do so multiple times
> (if you have any elderly relatives, feel free to see how many tries it
> takes them to solve a reCAPTCHA one). Blocking them from even viewing your
> site for a day seems a little extreme.
>
> Is there an actual problem you're trying to solve here? Is there any
> indication that spam bots are affecting your site's performance? If not,
> worrying about this is probably a waste of your time.

_______________________________________________
MediaWiki-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
