Re: CAPTCHA handling -- quick update

Andrew Jaquith Thu, 07 Jan 2010 06:54:18 -0800

Janne,

I picked the really nice option.  :) The solution is that when a post
contains spam, we redirect to the editor page, but request a CAPTCHA
be displayed. Re-editing is allowed.

Here is how it works. There are two collaborating parts: the
SpamProtectTag and the SpamInterceptor. This is where we do a little
magic. :)

Let's say you've loaded the editor for the first time (i.e., you
haven't submitted). What we do is write out a special parameter, a
"challenge request," when SpamProtectTag executes. The contents, for
the FIRST GET, contain the string value of the enum
Challenge.Request.CAPTCHA_ON_DEMAND. This means "no CAPTCHA is
required, but when we interpret the post, get ready to generate one
after redirect if there's spam in it." Then, we encrypt the parameter
using CryptoUtil.
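
To make the write-out step concrete, here is a minimal sketch of sealing a state value into a string that could be dropped into a hidden form field. It is not the JSPWiki code: CryptoUtil's API is not shown in this message, so the sketch substitutes the JDK's own AES-GCM, and the class name, the seal/open helpers, and the "challengeRequest" field name are all hypothetical. The enum constants are assumed from the names used in this thread.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical stand-in for the tag's "encrypt the parameter" step.
final class ChallengeParamSketch {
    // Constant names assumed from this thread, not copied from the real enum.
    enum Request { CAPTCHA_ON_DEMAND, CAPTCHA, PASSWORD }

    private static final int IV_BYTES = 12;   // usual nonce size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // In a real webapp the key would be server-side and long-lived.
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES is not available", e);
        }
    }

    // Encrypts the state name; output is URL-safe Base64 of IV + ciphertext.
    static String seal(Request state, SecretKey key) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] sealed = cipher.doFinal(state.name().getBytes(StandardCharsets.UTF_8));
            byte[] out = ByteBuffer.allocate(iv.length + sealed.length).put(iv).put(sealed).array();
            return Base64.getUrlEncoder().withoutPadding().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("could not seal state", e);
        }
    }

    // Decrypts a parameter value; rejects anything forged or tampered with.
    static Request open(String param, SecretKey key) {
        byte[] in = Base64.getUrlDecoder().decode(param);
        if (in.length < IV_BYTES + TAG_BITS / 8) {
            throw new IllegalArgumentException("parameter too short");
        }
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return Request.valueOf(new String(plain, StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalArgumentException("parameter failed authentication", e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        String value = seal(Request.CAPTCHA_ON_DEMAND, key);
        // What the tag might emit on the first GET (field name is made up).
        System.out.println("<input type=\"hidden\" name=\"challengeRequest\" value=\"" + value + "\"/>");
        System.out.println("decrypted on POST: " + open(value, key));
    }
}
```

An authenticated mode is used in the sketch because the receiving side has to be able to tell a value it wrote itself from one a client made up; whether CryptoUtil provides the same guarantee is not something this message states.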

When SpamInterceptor intercepts the POST, we then look for the special
challenge-request parameter. Two things can happen: a normal user
submits (in which case the challenge-request parameter will be there),
or a spammer submits (in which case it will not be).

In the normal case, we extract the challenge-request parameter,
decrypt the contents and figure out that its value was
CAPTCHA_ON_DEMAND. Because it has this value, we do NOT run the
Captcha validator. We always run the content Inspection. If it
contains spam, we add a ValidationError. If not, we return a null
Resolution, the "save" event method executes further down the chain,
and we are done.

Now, let's look at the spammer case.

If the challenge-request parameter is not present in the request, we
KNOW that the user has been naughty, or that it is a spammer. So we
add a ValidationError and redirect to the editor again.

On the second GET (i.e., after the POST and redirect back to the
editor page), the SpamProtectTag executes again. This time, it knows
there was spam because of the ValidationError, so it writes out the
enum value Challenge.Request.CAPTCHA, which means "I just
rendered a CAPTCHA, and when SpamInterceptor intercepts the post,
validate it." Thus, when SpamInterceptor handles the post next time
around, when it sees the CAPTCHA value it knows that it should do the
CAPTCHA check.

(and then we lather, rinse, repeat until the user submits a correct
CAPTCHA value)
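
The interceptor-side branching described above can be sketched as a single pure function. This is an illustration, not the real SpamInterceptor: the class, the Outcome enum and the decide() signature are hypothetical, the PASSWORD state is left out, and one point is an assumption rather than something stated here, namely that a correct CAPTCHA lets the save go ahead.

```java
import java.util.Optional;

// Hypothetical model of what the interceptor decides for one POST.
final class InterceptorDecisionSketch {
    // Constant names assumed from this thread; PASSWORD omitted for brevity.
    enum Request { CAPTCHA_ON_DEMAND, CAPTCHA }

    // VALIDATION_ERROR stands for "add a ValidationError, go back to the editor".
    // PROCEED_TO_SAVE stands for "return a null Resolution, let save run".
    enum Outcome { PROCEED_TO_SAVE, VALIDATION_ERROR }

    /**
     * @param state          decrypted challenge-request value, or empty if the
     *                       parameter was missing from the request
     * @param containsSpam   result of the content inspection
     * @param captchaCorrect whether the submitted CAPTCHA answer was right;
     *                       only consulted in the CAPTCHA state
     */
    static Outcome decide(Optional<Request> state, boolean containsSpam, boolean captchaCorrect) {
        if (!state.isPresent()) {
            // Spammer case: the form was never rendered by the tag.
            return Outcome.VALIDATION_ERROR;
        }
        switch (state.get()) {
            case CAPTCHA_ON_DEMAND:
                // Normal case: no CAPTCHA check, only content inspection.
                return containsSpam ? Outcome.VALIDATION_ERROR : Outcome.PROCEED_TO_SAVE;
            case CAPTCHA:
                // Assumption: passing the CAPTCHA ends the loop.
                return captchaCorrect ? Outcome.PROCEED_TO_SAVE : Outcome.VALIDATION_ERROR;
            default:
                throw new IllegalStateException("unhandled state " + state.get());
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(Optional.empty(), false, false));
        System.out.println(decide(Optional.of(Request.CAPTCHA_ON_DEMAND), true, false));
        System.out.println(decide(Optional.of(Request.CAPTCHA), true, true));
    }
}
```

Keeping the decision free of servlet and Stripes types is only a convenience of the sketch; it makes the three cases easy to check in isolation.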

That might sound complicated, but it's not -- the code is dead simple.
The key is that the SpamProtectTag writes the current state out to the
challenge-request parameter: CAPTCHA_ON_DEMAND is written out for the
first-time GET, and on subsequent GETs, CAPTCHA will be written out if
the contents are spam. All SpamInterceptor needs to do is obtain what
the state was by retrieving and decrypting the challenge-request
param.

There is one other wrinkle here: the JSP author can set the
SpamProtectTag attribute "challenge" to force a password check or a
CAPTCHA in all cases. In that case, we will write out the value
Challenge.Request.CAPTCHA or Challenge.Request.PASSWORD and render the
Challenge right away, even on that first post.
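
Put together, the tag's choice of which state to write out might look like the sketch below. The class and method names are hypothetical, the enum constants are assumed from this thread, and the attribute values "captcha" and "password" are the ones from the earlier design note quoted further down. Rejecting an unknown attribute value is an assumption of the sketch.

```java
// Hypothetical model of how the tag picks the state it writes out.
final class TagStateSketch {
    // Constant names assumed from this thread, not copied from the real enum.
    enum Request { CAPTCHA_ON_DEMAND, CAPTCHA, PASSWORD }

    /**
     * @param challengeAttribute value of the tag's "challenge" attribute,
     *                           or null when the JSP author did not set it
     * @param spamErrorPresent   true when the previous POST left a spam
     *                           ValidationError behind
     */
    static Request stateToWrite(String challengeAttribute, boolean spamErrorPresent) {
        if (challengeAttribute != null) {
            // A forced challenge wins, even on the very first GET.
            switch (challengeAttribute) {
                case "captcha":
                    return Request.CAPTCHA;
                case "password":
                    return Request.PASSWORD;
                default:
                    throw new IllegalArgumentException("unknown challenge: " + challengeAttribute);
            }
        }
        // Default behaviour: a CAPTCHA only after a post that contained spam.
        return spamErrorPresent ? Request.CAPTCHA : Request.CAPTCHA_ON_DEMAND;
    }

    // Whether the tag renders a challenge widget for this state.
    static boolean rendersChallenge(Request state) {
        return state != Request.CAPTCHA_ON_DEMAND;
    }

    public static void main(String[] args) {
        System.out.println(stateToWrite(null, false));
        System.out.println(stateToWrite(null, true));
        System.out.println(stateToWrite("password", false));
    }
}
```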

Naming-wise, I've gone back and forth about what the right names for
everything should be. At the moment, I think Challenge.Request might
better be called Challenge.State. :) Maybe CAPTCHA_ON_DEMAND becomes
CHALLENGE_NOT_RENDERED, CAPTCHA becomes CAPTCHA_RENDERED, PASSWORD
becomes PASSWORD_RENDERED? Not sure.

Oh, and one more thing. This basic technique -- encrypt some sort of
state object, write it out as a hidden parameter to the form, then
extract/decrypt on POST -- is something I gleaned from looking through
the Stripes code. They do a lot of "state smuggling" as an alternative
to storing server-side session attributes. I think it's a nice,
low-overhead technique for situations like forms, which are
essentially stateful. I use this technique also for smuggling the
parameter names used for the spam tokens, for example.

Long post! Hope it made sense.

Andrew

On Thu, Jan 7, 2010 at 3:26 AM, Janne Jalkanen <[email protected]> wrote:
>
> Errr... How do we determine what is a previous post? Spambots tend to make
> each request from a different address and ignore cookies. Or is it so that
> if the post is determined to contain spam, you get a redirect to the editor
> page, but this time with a captcha? 'cos that would be really nice, since it
> allows you to re-edit the content.
>
> /Janne
>
> On Jan 5, 2010, at 18:10 , Andrew Jaquith wrote:
>
>> Small correction (this is what happens when you type too quickly) --
>>
>> CAPTCHAs are rendered, by default, ONLY if the previous post contains
>> spam. The missing "only" makes all the difference. :)
>>
>> The important point is that we are treating spam, essentially, as a
>> form validation error.
>>
>> If you don't submit spam, it won't produce a validation error, so you
>> won't see a CAPTCHA. (Unless the JSP requires it, for example, when
>> creating a user account).
>>
>> Andrew
>>
>> On Tue, Jan 5, 2010 at 10:46 AM, Andrew Jaquith
>> <[email protected]> wrote:
>>>
>>> Hi all --
>>>
>>> Just thought I'd send a quick update on CAPTCHA. Janne and I have had
>>> some back-channel conversations about enhancements that I needed to
>>> make.
>>>
>>> Functionally, here's how the revised system will work:
>>>
>>> - CAPTCHAs will be rendered on the same page as the submitting form,
>>> but by default if the previous post contains spam (this is in line
>>> with Janne's comments)
>>> - CAPTCHA-rendering will be the responsibility of the wiki:SpamProtect
>>> tag (as before)
>>> - wiki:SpamProtect must be added as a child of a form or stripes:form
>>> element (as before)
>>> - If the JSP author wishes, they may require a CAPTCHA by adding an
>>> attribute challenge="captcha" to the SpamProtect tag (new)
>>> - In addition, a form can require password confirmation by adding
>>> attribute challenge="password" to the SpamProtect tag (new)
>>> - All of the back-end processing will be done by SpamInterceptor, in
>>> collaboration with the content-inspection system (as before)
>>> - Stripes ActionBeans that require spam protection need only add a
>>> @SpamProtect annotation to the target event methods (as before)
>>>
>>> We will add the SpamProtect tag to the page-edit form, comment form,
>>> new user registration form, and user profile form. For new user
>>> registration, a CAPTCHA will likely be required (challenge=captcha).
>>> For user profile changes and post-install wiki configuration (coming
>>> soon!), the user's password will be required to confirm
>>> (challenge=password).
>>>
>>> So, that's the functional design -- nice and simple. And we knock out
>>> some JIRA bugs while we're at it (e.g., confirm password for account
>>> changes)...
>>>
>>> Andrew
>>>
>
>
