I'm having problems getting rid of spam
listings, in particular porn.
I've come up with a list of words and a series of
SQL statements to check for their occurrences in urlwordsXX and so on, but there
must be a better way. Using "-" in the query won't do it either, as these people
are very crafty. Besides, you can't have a query with hundreds of
'minused' words.
Why isn't there a very simple way to eliminate
sites via a "bad word" list? Note that I'm not talking about filtering prior to
indexing. I'm talking about filtering after indexing: an adult word
filter.
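To make the request concrete, here is a minimal sketch of the kind of post-index "bad word" filter meant above. It is only an illustration under stated assumptions: the word list, the `is_flagged` and `split_sites` helpers, and the site records are all hypothetical, and nothing here is part of ASPSeek itself.

```python
import re

# Hypothetical bad-word list; a real one would be much longer.
BAD_WORDS = ["porn", "xxx"]

# One case-insensitive pattern matching any listed word as a substring,
# so "porn", "Porn" and "PORN" are all caught by the same check.
BAD_WORD_RE = re.compile("|".join(re.escape(w) for w in BAD_WORDS), re.IGNORECASE)


def is_flagged(site):
    """Return True if a bad word occurs in title, description, or keywords."""
    fields = (
        site.get("title", ""),
        site.get("description", ""),
        site.get("keywords", ""),
    )
    return any(BAD_WORD_RE.search(field) for field in fields)


def split_sites(sites):
    """Split an iterable of site dicts into (clean, flagged) lists."""
    clean, flagged = [], []
    for site in sites:
        (flagged if is_flagged(site) else clean).append(site)
    return clean, flagged


# Hypothetical records standing in for already-indexed sites.
sites = [
    {"url": "http://example.com/a", "title": "Gardening tips",
     "description": "", "keywords": "plants"},
    {"url": "http://example.com/b", "title": "Free PORN",
     "description": "", "keywords": ""},
]
clean, flagged = split_sites(sites)
```

Because the check only looks at the three fields named here, a site that carries the word elsewhere (for example in its body text) would still pass as clean.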
Also, even after I've eliminated all traces of
%porn%, %Porn%, and %PORN% from the database via a comparison query against
urlwords00 - urlwords15 (title, description, keywords), I still have thousands
of websites matching %porn%, %PORN%, and %Porn%. None of the remaining
websites have the string in their title, description, or keywords, so at least my
'cleaning' is almost working.
Where is this string occurring, then, if not in the title,
description, or keywords?
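As an aside on the three case variants: a single case-insensitive comparison can cover %porn%, %Porn%, and %PORN% at once. The sketch below assumes a hypothetical `sites` table in an in-memory SQLite database as a stand-in; it is not the ASPSeek schema, and the `find_matches` helper is made up for illustration.

```python
import sqlite3

# Hypothetical stand-in table; NOT the real ASPSeek schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sites (url TEXT, title TEXT, description TEXT, keywords TEXT)"
)
conn.executemany(
    "INSERT INTO sites VALUES (?, ?, ?, ?)",
    [
        ("http://example.com/a", "Gardening tips", "", "plants"),
        ("http://example.com/b", "Free Porn", "", ""),
        ("http://example.com/c", "Home", "", "PORN, adult"),
        ("http://example.com/d", "Links", "porn links", ""),
    ],
)


def find_matches(conn, word):
    """Return URLs whose title, description, or keywords contain `word`,
    ignoring case, using one pattern instead of one per spelling."""
    pattern = "%" + word.lower() + "%"
    rows = conn.execute(
        "SELECT url FROM sites "
        "WHERE LOWER(title) LIKE ? "
        "OR LOWER(description) LIKE ? "
        "OR LOWER(keywords) LIKE ? "
        "ORDER BY url",
        (pattern, pattern, pattern),
    ).fetchall()
    return [row[0] for row in rows]


matches = find_matches(conn, "porn")
```

Lowercasing both sides keeps the comparison independent of whether the database's LIKE is case-sensitive by default. Like the manual queries, it only sees the three columns it is pointed at.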
Note that for testing purposes all I did was create
two webspaces: one porn-free (%porn% not found in keywords, description,
or title) and the other porn only. When I search the porn-free space,
I STILL get occurrences of the above string.
jp
Santiago
