Re: [Cherokee] Spam in Cherokee's Trac

Eric S. Johansson Sun, 21 May 2006 05:42:53 -0700

Alvaro Lopez Ortega wrote:

  It looks really interesting, but we do need a solution for this
  right now. I've had to add two more rewrite rules today because we
  are still being spammed by some morons. :-//


  So, the only way I can think of is to write some sort of script in
  order to access the trac.db file directly.. it'd be an awful work
  around though.

The only other solution I can think of at this time (and I do hate this
particular technology) is a CAPTCHA barrier. This will not defend
against the Third World cyber café attack, but it will defend against
automated or handicapped-person attack. But I have a sneaking suspicion
you don't have many blind users and, as far as I know, I'm the only
upper-limb-impaired one.

There are many CAPTCHA add-ons in different languages, so you should be
able to find one you can use in Trac.

*If* you wanted to go with the proof of work solution, we would need
someone who really knows browser applets in Java in order to do it at
all well.

In this context, every request would carry a "number of bits required"
indicator in hidden variables, headers, or cookies. The Java applet
generating a token would run in the background while the user does
whatever it is they're going to do. On submission, if the token is not
complete, give the user a message about preparing the anti-spam
response: please be patient, don't hit reload, etc.

All responses without an appropriately sized token should be
tossed/rejected, etc. Otherwise, take them and use them.
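
To make the mint/check steps concrete, here is a rough hashcash-style
sketch. It is in Python rather than a Java applet purely for brevity,
and everything in it is an assumption for illustration: SHA-256 as the
hash, the "challenge:nonce" token layout, and the names mint_token and
verify_token. None of it comes from Trac or any existing plugin.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def mint_token(challenge: str, bits_required: int) -> str:
    """Client side: try nonces until the hash has enough leading zero bits."""
    for nonce in count():
        token = f"{challenge}:{nonce}"
        digest = hashlib.sha256(token.encode()).digest()
        if leading_zero_bits(digest) >= bits_required:
            return token


def verify_token(token: str, challenge: str, bits_required: int) -> bool:
    """Server side: one hash to check; reject anything undersized."""
    if not token.startswith(challenge + ":"):
        return False
    digest = hashlib.sha256(token.encode()).digest()
    return leading_zero_bits(digest) >= bits_required
```

Minting costs about 2**bits_required hashes on average while checking
costs one, which is the asymmetry the whole scheme leans on. The
challenge string would have to be unique per request (hypothetically, a
ticket id plus a server nonce) so a token can't be reused.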

A refinement of this is necessary when spammers start generating tokens.
In the beginning, spammers won't generate tokens because they don't know
about them and they have other, higher-value targets they can go after
at lower cost, which means you can keep your stamp values low and
minimize the impact on legitimate users. Eventually, however, spammers
will catch on, and at that point it would be necessary to implement a
differential rate system.

A differential rate system (a.k.a. reputation proxy) counts on the
ability of the recipient to differentiate between token generators.
Legitimate token generators get smaller demands for token sizes as they
develop a reputation for being "good people". Unknown token generators
are charged the standard amount until they too develop a reputation.
Known bad token generators are charged increasingly higher amounts.
This ability to differentiate between user classes allows you to charge
much higher rates to attackers/spammers and correspondingly lower rates
to legitimate users. In other words, you reward users that report bugs
frequently and penalize those who report bugs once in a while. This
should have the net effect of increasing the number of bug reports. ;-)
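
A minimal sketch of how the three price classes might be computed, in
Python. The constants, the Reputation record, and the scoring rule
(one bit off per accepted report, two bits on per rejected one) are all
made up for illustration; real values would need tuning.

```python
from dataclasses import dataclass
from typing import Optional

STANDARD_BITS = 16  # assumed price for unknown token generators
MIN_BITS = 8        # assumed floor for well-known good generators
MAX_BITS = 28       # assumed ceiling for known bad generators


@dataclass
class Reputation:
    accepted: int = 0  # submissions that turned out to be legitimate
    rejected: int = 0  # submissions flagged as spam


def bits_required(rep: Optional[Reputation]) -> int:
    """Return the token size to demand from a given token generator."""
    if rep is None:
        # Unknown generator: charge the standard amount.
        return STANDARD_BITS
    # Good history lowers the price; bad history raises it twice as fast.
    score = rep.accepted - 2 * rep.rejected
    return max(MIN_BITS, min(MAX_BITS, STANDARD_BITS - score))
```

Each extra bit doubles the expected work, so even this linear rule
makes a bad reputation exponentially more expensive.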

It's important to practice a bit of subterfuge and make token generators
with a bad reputation think their data has been accepted when, in
reality, it's been discarded or subjected to human review.
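
The subterfuge can be as simple as returning the same response on both
paths. A sketch in Python, where handle_submission, the two lists, and
the message text are hypothetical stand-ins for whatever the tracker
actually uses:

```python
ACCEPTED_MESSAGE = "Thank you, your report has been submitted."


def handle_submission(body, generator_is_bad, tracker, review_queue):
    """Store or quarantine a submission; the sender can't tell which."""
    if generator_is_bad:
        # Bad reputation: hold for human review instead of publishing.
        review_queue.append(body)
    else:
        tracker.append(body)
    # Identical response either way, so a spammer gets no feedback.
    return ACCEPTED_MESSAGE
```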

Identification of token generators can be difficult. IP address is the
easiest, but address/user associations are not as stable as they used to
be. Client-side embedded identification can work as long as the
identifier can't be replicated among attackers and reused (i.e. parallel
double spending). This would take a bit of thinking before coming up
with a solution. But even simple IP address blocks would be of some use.
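
For the simple IP address block case, one way to get a reputation key
is to collapse each client address to its enclosing network. A sketch
using Python's standard ipaddress module; the /24 and /64 block sizes
and the name generator_key are assumptions, not recommendations:

```python
import ipaddress


def generator_key(addr: str, v4_prefix: int = 24, v6_prefix: int = 64) -> str:
    """Map a client address to the address block used as its identity."""
    ip = ipaddress.ip_address(addr)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its network back.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network)
```

With these defaults, 203.0.113.77 and 203.0.113.200 share the key
203.0.113.0/24, so they also share one reputation.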

That's the technique in a nutshell. And I have three more to write up
and make slides for (not including the 20-odd slides with fundamentals
and application of proof of work systems). :-)


---eric
_______________________________________________
Cherokee mailing list
[email protected]
http://www.0x50.org/cgi-bin/mailman/listinfo/cherokee
