I'd suggest using an integer as a "fixed-point" value. Let 100 or
1,000,000 or some value represent "1".
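
A minimal sketch of what I mean, in Java. The class name SpamScore, the
method names, and the scale of 1000 are all assumptions for illustration,
not anything that exists in Roller today:

```java
// Hypothetical helper: the name SpamScore and the scale of 1000 are
// assumptions for illustration only.
final class SpamScore {
    /** The integer that represents a probability of 1.0 (assumed scale). */
    static final int ONE = 1000;

    private SpamScore() {}

    /** Converts a probability in [0.0, 1.0] to its fixed-point integer form. */
    static int fromProbability(double p) {
        // The negated test also rejects NaN, which fails every comparison.
        if (!(p >= 0.0 && p <= 1.0)) {
            throw new IllegalArgumentException("probability out of range: " + p);
        }
        return (int) Math.round(p * ONE);
    }

    /** Converts a fixed-point score back to a probability, e.g. for display. */
    static double toProbability(int score) {
        return (double) score / ONE;
    }
}
```

With this scale a validator that is "certain" returns SpamScore.ONE, and
scores can be compared and summed with plain integer arithmetic.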
-- Sean
Dave wrote:
On 12/4/06, Sean Gilligan <[EMAIL PROTECTED]> wrote:
One quick thought: It might be good to return a fixed-point value from
the validate function. Some validators could use a Bayesian filter (or
other technique) that is going to return a probability that a given
comment is spam. It may even be possible to add (or use a weighted average
of) the returned values. If both the Bayesian and the link count
filters return reasonably high likelihood of spam, that could have a
cumulative effect. Of course, a filter that is "certain" that something
is spam could always return "1.000"... You could also have something
like:
0 - 0.25 = Publish
0.25 - 0.75 = Moderate
0.75+ = Reject (but possibly archive?)
That's an interesting idea and it doesn't really cost us anything to
use a float rather than a boolean as the return value.
- Dave
Users could even adjust the threshold of their filter based upon how
much time they are willing to spend moderating.
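
Here is a rough sketch of how the combining and threshold steps might look
in Java. It uses the weighted-average option (not simple addition), and it
assumes scores are integers on a 0..1000 scale where 1000 means "certainly
spam". The class SpamDecision, its method names, and the choice to send a
score that lands exactly on a boundary to the stricter bucket are all my
assumptions, not existing Roller code:

```java
// Hypothetical sketch: SpamDecision and its methods are illustrative names.
// Scores are assumed to be integers from 0 to 1000, where 1000 means "1".
final class SpamDecision {
    enum Action { PUBLISH, MODERATE, REJECT }

    private SpamDecision() {}

    /** Weighted average of validator scores, rounded to the nearest integer. */
    static int weightedAverage(int[] scores, int[] weights) {
        if (scores.length != weights.length) {
            throw new IllegalArgumentException("scores and weights differ in length");
        }
        long weightedSum = 0;
        long totalWeight = 0;
        for (int i = 0; i < scores.length; i++) {
            if (weights[i] < 0) {
                throw new IllegalArgumentException("negative weight at index " + i);
            }
            weightedSum += (long) scores[i] * weights[i];
            totalWeight += weights[i];
        }
        if (totalWeight == 0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        // Adding half the divisor rounds to nearest instead of truncating.
        return (int) ((weightedSum + totalWeight / 2) / totalWeight);
    }

    /**
     * Maps a combined score to an action. The two thresholds are parameters
     * so that each weblog could tune them; a score equal to a threshold goes
     * to the stricter bucket (an assumption, the boundaries were left open).
     */
    static Action classify(int score, int moderateAt, int rejectAt) {
        if (moderateAt > rejectAt) {
            throw new IllegalArgumentException("moderateAt must not exceed rejectAt");
        }
        if (score >= rejectAt) {
            return Action.REJECT;
        }
        if (score >= moderateAt) {
            return Action.MODERATE;
        }
        return Action.PUBLISH;
    }
}
```

A blogger with little time for moderation could raise moderateAt toward
rejectAt, which shrinks the "Moderate" band; calling classify(score, 250, 750)
reproduces the bands listed earlier.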
[A funny aside: A blogger (I think it was Jeff Jarvis) was accused of
"censoring" comments about "socialism". He finally realized that the
word "socialism" contained the word "Cialis" which is a Viagra-like
drug, and was triggering his filter.]
Let's see if the mailing list lets /this/ message through...
-- Sean
Dave wrote:
> Currently, we've got a couple of different ways to control comment
> spam in Roller.
>
> * Three levels of blacklist: comments that match blacklist are
> marked as spam
> o Built in blacklist: based on old unsupported MT blacklist
> o Site wide blacklist: global admin manages this blacklist
> o Website blacklist: each weblog can define a blacklist
>
> * Comment moderation: when enabled, comments must be approved by
> blog owner
>
> * CommentAuthenticator: determines if user is allowed to comment
> o You can plug in your own by implementing the comment
> authenticator interface
> o Default authenticator does nothing
> o Math Authenticator presents math question, verifies answer
> o CAPTCHA authenticator is possible too, but we don't ship one
>
> * Comment throttle: IP addresses that send rapid-fire comments are
> banned
>
> There are problems with each of those methods and even when combined
> they're not enough to control spam. We've discussed other ideas for
> comment spam control like forcing long comments into moderation,
> rejecting comments with too many links and rejecting comments judged
> by Akismet to be spam. Those are all good ideas, but if we start
> adding special rules ad hoc, we'll end up with a mess.
>
> What we need is a way for Roller site administrators to define a chain
> of comment validators so that we and others can add comment spam
> processing rules, which are then treated in a uniform way in the
> Roller comment servlet.
>
> Read the rest here:
>
> http://rollerweblogger.org/wiki/Wiki.jsp?page=Proposal_CommentValidators
>
> Please respond with comments here on the list.
>
> - Dave