There are a few fields that SpamFilter uses.

First, those inserted by SpamFilter.insertInputFields():

1) A UTF-8 detector. We put a single non-Latin1 character in the form to make sure the input is not mangled by some badly behaving robots or user clients. This used to be a major problem a few years back, and this solved all of those "hey, my edit destroyed all UTF-8 characters" problems. It also turns out that many of the older bots just assume a form is Latin1.
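
To make the detector idea concrete, here is a minimal sketch. It is not the actual SpamFilter code: the class name, the method names and the choice of marker character are assumptions made for illustration. It only shows why a single non-Latin1 character is enough to notice that a client handled the form as Latin1.

```java
import java.nio.charset.StandardCharsets;

public class Utf8DetectorSketch {
    // Assumed marker: any single character outside Latin-1 would do.
    public static final String MARKER = "\u3041";

    // The check itself: the value must come back exactly as it was sent.
    public static boolean isIntact(String submitted) {
        return MARKER.equals(submitted);
    }

    // Simulates a client that treats the form as Latin-1 when encoding its POST.
    public static String encodedAsLatin1(String value) {
        // Characters that Latin-1 cannot represent are replaced during encoding.
        byte[] wire = value.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.UTF_8);
    }

    // Simulates UTF-8 bytes being misread as Latin-1 somewhere along the way.
    public static String misreadAsLatin1(String value) {
        byte[] wire = value.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(isIntact(MARKER));                   // true
        System.out.println(isIntact(encodedAsLatin1(MARKER)));  // false
        System.out.println(isIntact(misreadAsLatin1(MARKER)));  // false
    }
}
```

Either kind of mangling changes the marker, so a plain equality test on the server side is sufficient.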

What is currently in plain.jsp (and should actually be moved to insertInputFields()):

2) An input field with a random name. This means that a bot will need to actually GET the form first and parse it out before it can send syntactically correct POSTs. This is a LOT more effort than simply looking at the fields once and crafting your auto-poster to conform.

3) A hidden input field which is meant to catch those bots which do a GET and then randomly fill all fields with garbage. This field MUST be empty when SpamFilter examines the contents of the POST. Since it's hidden with the use of CSS, the bot would need to understand CSS to bypass this one (and the field name is also randomized, in order to prevent someone from hardcoding the fact that it needs to be empty).

The idea behind 2 & 3 is that we've got two fields with random names, one of which needs to be empty and one of which needs to be filled with pre-determined data. This is quite hard for most bots to catch, unless they are specifically crafted for JSPWiki and contain some amount of logic to figure this one out. In the future we could also do full input field name randomization and even random reordering of the input fields to make it even more difficult.
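
A rough sketch of the two-field check described in 2 and 3 might look like the following. Everything here is hypothetical (the class, the field and method names, the 8-letter name format), and it says nothing about how SpamFilter really remembers the names between the GET and the POST; it assumes the server simply keeps the challenge around, for example in the session. Treating a missing trap field the same as an empty one is also a choice made for this sketch.

```java
import java.security.SecureRandom;
import java.util.Map;

public class RandomFieldSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // What the server remembers about one rendered form (hypothetical structure).
    public static final class FormChallenge {
        public final String tokenField;    // must come back carrying expectedValue
        public final String trapField;     // hidden via CSS; must come back empty
        public final String expectedValue; // the pre-determined data

        FormChallenge(String tokenField, String trapField, String expectedValue) {
            this.tokenField = tokenField;
            this.trapField = trapField;
            this.expectedValue = expectedValue;
        }
    }

    // Eight random lowercase letters; the format is an arbitrary choice.
    static String randomName() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 8; i++) {
            sb.append((char) ('a' + RANDOM.nextInt(26)));
        }
        return sb.toString();
    }

    // Called when the form is rendered for a GET.
    public static FormChallenge newChallenge() {
        String token = randomName();
        String trap;
        do {
            trap = randomName();
        } while (trap.equals(token)); // the two names must differ
        return new FormChallenge(token, trap, randomName());
    }

    // Called on POST, before any content-based analysis.
    public static boolean passes(FormChallenge c, Map<String, String> post) {
        String trap = post.get(c.trapField);
        boolean trapEmpty = (trap == null || trap.isEmpty());
        boolean tokenOk = c.expectedValue.equals(post.get(c.tokenField));
        return trapEmpty && tokenOk;
    }
}
```

A bot that posts without fetching the form fails because it cannot know the token field's name, and a bot that fills every field it finds fails because it puts garbage into the trap field.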

(Once they've passed these simple tests, then the content-based analysis starts. But these are sufficient to catch ~95% of all spam.)

/Janne

On Mar 4, 2009, at 07:46 , Andrew Jaquith wrote:

Janne,

Could you give me a little background on how the SpamFilter hash
fields work in 2.8? I've been able to replicate most of the behavior
of the plain editor as ActionBean event handlers & Stripe-ified JSPs,
but I haven't done so with the spam-filtering/hash fields just yet.

My primary reason for asking (other than trying to understand how it
works): I'm wondering if there's something we might be able to do here
that is related to CSRF prevention.

Andrew
