#8794: Profanity filter suffers from the Scunthorpe problem
---------------------------------------------------+------------------------
Reporter: Daniel Pope <[EMAIL PROTECTED]> | Owner: nobody
Status: new | Milestone:
Component: django.contrib.comments | Version: SVN
Keywords: | Stage: Unreviewed
Has_patch: 0 |
---------------------------------------------------+------------------------
The implementation of the profanity filter suffers from the
[http://en.wikipedia.org/wiki/Scunthorpe_Problem Scunthorpe Problem]; i.e.
it considers the town of Scunthorpe, amongst other innocuous words, to be
profane.
Profanity filtering is A Hard Problem, and naïve solutions like this one
cause frustrating problems for end users.
Checking the current profanities list for false positives against a couple
of word lists I had to hand also yields:
{{{
gobbledegook
snigger
Brushite
Cushite
Niggerhead
Peshito
Peshitto
Shittah
Shittah tree
Shittim
Shittim wood
Shittle
Shittlecock
Shittleness
}}}
Obviously proper names are not in my dictionary, but they cause frequent
and often more annoying problems.
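For anyone wanting to repeat the check, here is a minimal sketch of how such false positives can be found. It assumes the filter does a case-insensitive substring test, which is what the results above suggest. The function name and the sample words are hypothetical and stand in for the real profanities list and dictionary files.

```python
def substring_false_positives(banned_words, candidates):
    """Return the candidates that a substring-based filter would flag
    even though they are not themselves banned words."""
    banned = [word.lower() for word in banned_words]
    flagged = []
    for candidate in candidates:
        lowered = candidate.lower()
        if lowered in banned:
            # An exact match is a true positive, not a false one.
            continue
        if any(word in lowered for word in banned):
            flagged.append(candidate)
    return flagged


# Hypothetical, deliberately mild stand-ins for the real lists.
sample_banned = ["hell"]
sample_words = ["Shell", "hello", "hell", "world"]
print(substring_false_positives(sample_banned, sample_words))
```

Running the same function over a full dictionary file, one word per line, would give a list like the one above.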
I suggest disabling the filter by default so that the scope of the problem
is limited. At the very least, the filter must be restricted to whole-word
matches, e.g. {{{re.search(r'\b' + re.escape(word) + r'\b', comment)}}},
where {{{comment}}} stands for the submitted text. Users who need stricter
profanity filters should be responsible for enabling them, and for
potentially annoying their own users. Django should not be doing it for them.
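The whole-word restriction suggested above can be sketched as follows. This is an illustration under stated assumptions, not the filter's actual code. The function name `contains_profanity` and the sample word are hypothetical; only the standard `re` module is used.

```python
import re


def contains_profanity(text, banned_words):
    """Return True only if a banned word appears in text as a whole word."""
    if not banned_words:
        # An empty alternation would match at every word boundary.
        return False
    # Both \b tokens must be in raw strings: in a normal string literal,
    # '\b' is a backspace character, not a word-boundary assertion.
    # re.escape guards against regex metacharacters in the word list.
    alternatives = "|".join(re.escape(word) for word in banned_words)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    # search() scans the whole text; match() would only test the start.
    return pattern.search(text) is not None


# Hypothetical, deliberately mild stand-in for the real list.
sample_banned = ["hell"]
print(contains_profanity("Shell scripts say hello", sample_banned))
print(contains_profanity("What the hell?", sample_banned))
```

Word boundaries stop the filter from flagging innocuous words that merely contain a banned word. They do nothing for proper names or other words that are spelled identically to a banned word, so the argument for disabling the filter by default still stands.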
--
Ticket URL: <http://code.djangoproject.com/ticket/8794>
Django Code <http://code.djangoproject.com/>
The web framework for perfectionists with deadlines