Re: Python based unacceptable language filter

2005-10-04 Thread Frithiof Andreas Jensen

David Pratt [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 Hi. Thank you for the links. I am looking for something that would
 function in a similar way to Yahoo's filter for its message boards.
 Perhaps I should have used the term profanity instead of unacceptable
 language. I am not concerned about correcting sentence structure or
 poor grammar.

 Yo melonfarmer, you should watch the non-profane version of Repo-Man for
inspiration -  ;-)

PS:

Sites like The Profanisaurus provide plenty of workarounds for most of
the filters out there.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python based unacceptable language filter

2005-10-03 Thread Frithiof Andreas Jensen

David Pratt [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 Hi.  Is anyone aware of any python based unacceptable language filter
 code to scan and detect bad language in text from uploads etc.

 Many thanks.
 David

Look up Spambayes - if you can filter on terms like "dear friend" you can
filter on the inverse too, no? It needs samples to work with.
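
As a rough illustration of that idea (this is not Spambayes' own API, just
a toy naive Bayes scorer written for this note), a classifier trained on
labelled samples might look like the sketch below. The class name
TinyBayes, the labels "bad" and "ok", and the training strings are all
made up for the example.

```python
import math
import re
from collections import Counter


def tokens(text):
    # Lower-case the text and split it into simple word tokens.
    return re.findall(r"[a-z']+", text.lower())


class TinyBayes:
    """Toy two-class naive Bayes scorer (hypothetical, not Spambayes)."""

    def __init__(self):
        self.counts = {"bad": Counter(), "ok": Counter()}
        self.docs = {"bad": 0, "ok": 0}

    def train(self, text, label):
        # label must be "bad" or "ok".
        self.counts[label].update(tokens(text))
        self.docs[label] += 1

    def score(self, text):
        # Log-odds that the text is "bad"; positive means more likely bad.
        vocab = len(set(self.counts["bad"]) | set(self.counts["ok"]))
        log_odds = math.log((self.docs["bad"] + 1) / (self.docs["ok"] + 1))
        for tok in tokens(text):
            for label, sign in (("bad", 1), ("ok", -1)):
                total = sum(self.counts[label].values())
                # Add-one smoothing so unseen tokens never give log(0).
                p = (self.counts[label][tok] + 1) / (total + vocab + 1)
                log_odds += sign * math.log(p)
        return log_odds


clf = TinyBayes()
clf.train("you stupid idiot", "bad")
clf.train("thanks for the helpful links", "ok")
```

With only two samples the scores mean very little; as noted above, this
kind of filter needs a decent pile of samples of both kinds to work with.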




Re: Python based unacceptable language filter

2005-10-03 Thread Andrew Gwozdziewycz

On Oct 2, 2005, at 9:45 PM, Nigel Rowe wrote:

 David Pratt wrote:


 Hi.  Is anyone aware of any python based unacceptable language filter
 code to scan and detect bad language in text from uploads etc.

 Many thanks.
 David


 You might be able to adapt languagetool.
 http://www.danielnaber.de/languagetool/features.html

 Later versions have been ported to Java, but the old python version of
 languagetool is at http://tkltrans.sourceforge.net/#r03

 His thesis paper is at
 http://www.danielnaber.de/languagetool/download/style_and_grammar_checker.pdf

 Mind you, given the poor language skills of many native english speakers
 (not to mention those for whom english is a second language) relying on
 automated filters to enforce 'good' language seems a trifle extreme.  This
 post for example would probably not pass.

 Cheers,
 Nigel

 PS. For the humour impaired, this g*d d*mm post was a f*cking joke, OK! :-)

 Mind you, the links are real.

 -- 
 Nigel Rowe
 A pox upon the spammers that make me write my address like..
 rho (snail) swiftdsl (stop) com (stop) au




I think he may be referring to bad words and 'filthy' language. At
least that's what I got from the question.
There are many PHP implementations on the web that could be adapted
to Python fairly easily. Most of them are probably not the ideal
solution and involve a lot of stuff like

    for n in badwords:
        texttofilter.replace(n, 'bad word deleted')

If that's all you need though, maybe it's not so bad.


---
Andrew Gwozdziewycz
[EMAIL PROTECTED]
http://ihadagreatview.org
http://plasticandroid.org




Re: Python based unacceptable language filter

2005-10-03 Thread David Pratt
Hi. Thank you for the links. I am looking for something that would  
function in a similar way to Yahoo's filter for its message boards.  
Perhaps I should have used the term profanity instead of unacceptable  
language. I am not concerned about correcting sentence structure or  
poor grammar.

I realize screening profanity can be accomplished by simply looping
over regular expressions from a database or dictionary and searching
the text for matches. I thought something like this might already
exist, perhaps in a module, since it is likely (however absurd) that
someone has developed a dictionary of profane word expressions. What is
perhaps crazier is that one has to consider including something
like this in an application, but you have to conclude the Internet is
what it is.

Regards
David

 From Yahoo:
The Profanity Filter allows you to control how you want to view  
messages with profanity in two ways. You can choose to view the  
messages with the profanity masked with italicized symbols (@$% ), or  
you can have the messages containing profanity hidden entirely.

You can also choose between a weak setting for exact word matches or a  
strong setting that will filter spelling variations.
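
To make those two settings concrete, here is a minimal sketch of how a
weak (exact word) and a strong (spelling variation) mode could be built
on Python's re module. It is a guess at the behaviour described, not
Yahoo's actual code; the LOOKALIKES table, the function names and the
sample word "heck" are all invented for the example.

```python
import re

# Hypothetical table of look-alike characters used by the strong setting.
LOOKALIKES = {"a": "a@4", "e": "e3", "i": "i1!", "o": "o0", "s": "s$5"}


def exact_pattern(word):
    # Weak setting: match the word only as a whole word.
    return r"\b" + re.escape(word.lower()) + r"\b"


def variant_pattern(word):
    # Strong setting: each letter may be a look-alike and may repeat.
    parts = []
    for ch in word.lower():
        chars = LOOKALIKES.get(ch, ch)
        parts.append("[" + re.escape(chars) + "]+")
    # Lookarounds instead of \b, because variants may start with "$" or "@".
    return r"(?<!\w)" + "".join(parts) + r"(?!\w)"


def filter_message(text, words, strong=False, hide=False, mask="@$%"):
    build = variant_pattern if strong else exact_pattern
    pattern = re.compile("|".join(build(w) for w in words), re.IGNORECASE)
    if hide:
        # Hide the whole message if anything matches.
        return None if pattern.search(text) else text
    # A function replacement keeps the mask from being read as a template.
    return pattern.sub(lambda m: mask, text)
```

For example, filter_message("what the h3ck", ["heck"], strong=True) masks
the word, while the weak setting leaves "h3ck" alone, and "checkers" is
untouched in both modes because the match must be a whole word.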

Well I know this thread is a

On Sunday, October 2, 2005, at 10:45 PM, Nigel Rowe wrote:

 David Pratt wrote:

 Hi.  Is anyone aware of any python based unacceptable language filter
 code to scan and detect bad language in text from uploads etc.

 Many thanks.
 David

 You might be able to adapt languagetool.
 http://www.danielnaber.de/languagetool/features.html

 Later versions have been ported to Java, but the old python version of
 languagetool is at http://tkltrans.sourceforge.net/#r03

 His thesis paper is at
 http://www.danielnaber.de/languagetool/download/style_and_grammar_checker.pdf

 Mind you, given the poor language skills of many native english speakers
 (not to mention those for whom english is a second language) relying on
 automated filters to enforce 'good' language seems a trifle extreme.  This
 post for example would probably not pass.


Re: Python based unacceptable language filter

2005-10-03 Thread gene tani
Good question, but y'know, I don't think I'm the only one using a
threaded mail reader.  Pls don't hijack others' threads.

David Pratt wrote:
 Hi.  Is anyone aware of any python based unacceptable language filter
 code to scan and detect bad language in text from uploads etc.
 
 Many thanks.
 David



Re: Python based unacceptable language filter

2005-10-03 Thread Erik Max Francis
Andrew Gwozdziewycz wrote:

 I think he may be referring to bad words and 'filthy' language. At
 least that's what I got from the question.
 There are many PHP implementations on the web that could be adapted
 to Python fairly easily. Most of them are probably not the ideal
 solution and involve a lot of stuff like
 
    for n in badwords:
        texttofilter.replace(n, 'bad word deleted')
 
 If that's all you need though, maybe it's not so bad.

This is a no-op, since it replaces the text, but then discards it.  You 
meant:

    for badWord in badWords:
        textToFilter = textToFilter.replace(badWord, ')!%(#)%')
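
One caveat with the plain str.replace loop is that it also hits bad words
buried inside innocent ones and is case-sensitive. A possible variant,
sketched here with a made-up word list (BAD_WORDS and mask_profanity are
names invented for this example), compiles a single case-insensitive
pattern with word boundaries:

```python
import re

# Hypothetical word list, for illustration only.
BAD_WORDS = ["darn", "heck"]

# One alternation wrapped in word boundaries, so "heck" does not match
# inside "checkers". re.escape guards against regex metacharacters.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in BAD_WORDS) + r")\b",
    re.IGNORECASE,
)


def mask_profanity(text, mask=")!%(#)%"):
    # Strings are immutable, so the filtered text is returned, not
    # modified in place. The lambda stops the mask from being parsed
    # as a replacement template.
    return PATTERN.sub(lambda m: mask, text)
```

Called as mask_profanity(textToFilter), this makes one pass over the text
instead of one pass per word.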

-- 
Erik Max Francis  [EMAIL PROTECTED]  http://www.alcyone.com/max/
San Jose, CA, USA  37 20 N 121 53 W  AIM erikmaxfrancis
   If anything is sacred, the human body is sacred.
   -- Walt Whitman


Python based unacceptable language filter

2005-10-02 Thread David Pratt
Hi.  Is anyone aware of any python based unacceptable language filter 
code to scan and detect bad language in text from uploads etc.

Many thanks.
David


Re: Python based unacceptable language filter

2005-10-02 Thread Nigel Rowe
David Pratt wrote:

 Hi.  Is anyone aware of any python based unacceptable language filter
 code to scan and detect bad language in text from uploads etc.
 
 Many thanks.
 David

You might be able to adapt languagetool. 
http://www.danielnaber.de/languagetool/features.html

Later versions have been ported to Java, but the old python version of
languagetool is at http://tkltrans.sourceforge.net/#r03

His thesis paper is at
http://www.danielnaber.de/languagetool/download/style_and_grammar_checker.pdf

Mind you, given the poor language skills of many native english speakers
(not to mention those for whom english is a second language) relying on
automated filters to enforce 'good' language seems a trifle extreme.  This
post for example would probably not pass.

Cheers,
Nigel

PS. For the humour impaired, this g*d d*mm post was a f*cking joke, OK! :-)

Mind you, the links are real.

-- 
Nigel Rowe
A pox upon the spammers that make me write my address like..
rho (snail) swiftdsl (stop) com (stop) au