Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2013-08-03 Thread Christopher Done
Has anyone run SpamAssassin on the offending content created by the spammers?
I've been using it on hpaste and it's been very effective at cutting out
the crap.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-08-04 Thread Gwern Branwen
On Fri, Aug 3, 2012 at 10:34 PM, damodar kulkarni
kdamodar2...@gmail.com wrote:
 So, another doubt, if detecting spam is trivial, then why not just send the
 detected spam to trash directly without any human inspection?
 This may mean some trouble for the posters due to false positives; but the
 moderator's job can be reduced to some extent.

Which is pretty much what this whole thread is about: asking that the
sysadmins Do Something about this trivial yet overwhelming spam.

-- 
gwern
http://www.gwern.net


Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-08-03 Thread Gwern Branwen
On Mon, Jul 30, 2012 at 6:59 PM, Alexander Solla alex.so...@gmail.com wrote:
 We could even have a report spam button on each page, and if enough users
 click on it (for a given revision), the revision gets forwarded to a
 moderator.

This would be useless. The problem is not detecting spam, since that's
quite trivial: it's very hard to miss. The problem is that the
moderator (i.e., me) is already overworked. The spam needs to be reduced
to begin with, not detected.

-- 
gwern
http://www.gwern.net


Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-08-03 Thread damodar kulkarni
Hi Gwern,
First of all, thanks for your patience.

I am willing to do administrator tasks.

  4. ReCAPTCHA enabled for 'edits adding new, unrecognized external
 links' - which is all of the spam.


 This is already enabled.


I guess the problem may be due to ReCAPTCHA
(http://www.google.com/recaptcha/learnmore), so you could switch to a
custom-built CAPTCHA that is more difficult to crack.
You may find some open-source CAPTCHA systems better than ReCAPTCHA, e.g.
http://jcaptcha.sourceforge.net/

To foil relay attacks on the CAPTCHA, you could try early timeouts and/or
increasing the length of the CAPTCHA text.
This may mean more trouble and nuisance for legitimate users, but I
guess Haskellers will be willing to pay this small price for a
better website experience. :)

Relay attacks: remember that there are human solvers employed in countries
like India and China, so any human-solvable CAPTCHA will fail to work as
desired.

http://en.wikipedia.org/wiki/CAPTCHA#Human_solvers


The problem is not detecting spam, since that's
 quite trivial: it's very hard to miss.


Thanks for providing more info.

So, another question: if detecting spam is trivial, why not just send the
detected spam to the trash directly, without any human inspection?
This may cause some trouble for posters due to false positives, but
the moderator's workload would be reduced to some extent.

I hope this is useful. If not, please forgive me for giving you more to
read.

Regards,
-Damodar


Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-08-02 Thread wren ng thornton

On 7/30/12 5:35 PM, Henk-Jan van Tuyl wrote:

- Block creation of usernames
o ending with two or more digits
o with more than one x or q
o starting with buy
o longer than 20 characters
o with more than 4 consonants in a row


As others've mentioned, many of these constraints impose undue burden
on users with linguistic heritage outside of western Europe. Creating a
decent filter for recognizing legitimate names across the majority of
languages is quite difficult.


Though there's no reason this has to be a strong blacklisting of 
usernames. If there's a willing volunteer (as seems to have been 
implied), then something like this could serve as a filter requiring 
manual override. All usernames are available... but some take longer to 
activate. Of course, there's always the power-to-weight issue for this 
kind of solution.
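
To make the "filter requiring manual override" idea concrete, here is a minimal Python sketch. It assumes the five quoted rules can be approximated with ASCII-only regular expressions; the names `SUSPICIOUS_PATTERNS` and `registration_status` and the two status strings are made up for illustration and are not part of any wiki software. A matching name is never rejected, only parked until a volunteer approves it.

```python
import re

# Hypothetical, ASCII-only encodings of the five username rules quoted above.
SUSPICIOUS_PATTERNS = [
    re.compile(r"\d{2,}$"),                    # ends with two or more digits
    re.compile(r"[xq].*[xq]", re.IGNORECASE),  # more than one x/q (read here as: two or more in total)
    re.compile(r"^buy", re.IGNORECASE),        # starts with "buy"
    re.compile(r"^.{21,}$", re.DOTALL),        # longer than 20 characters
    # more than 4 consonants in a row (y is counted as a vowel in this sketch)
    re.compile(r"[b-df-hj-np-tv-xz]{5,}", re.IGNORECASE),
]


def registration_status(username):
    """Return "pending_review" if any rule fires, else "active".

    Nothing is rejected outright: a flagged name just waits for a human.
    """
    if any(pattern.search(username) for pattern in SUSPICIOUS_PATTERNS):
        return "pending_review"
    return "active"


print(registration_status("wren"))           # active
print(registration_status("buyhandbags99"))  # pending_review
print(registration_status("xiaoqing"))       # pending_review
```

The last call shows the cost: a perfectly ordinary pinyin-style name lands in the review queue, so the scheme only works if someone actually empties that queue.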


--
Live well,
~wren


Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-08-02 Thread Alexander Solla
On Thu, Aug 2, 2012 at 4:46 PM, wren ng thornton w...@freegeek.org wrote:

 On 7/30/12 5:35 PM, Henk-Jan van Tuyl wrote:

 - Block creation of usernames
 o ending with two or more digits
 o with more than one x or q
 o starting with buy
 o longer than 20 characters
 o with more than 4 consonants in a row


 As others've mentioned, many of these constraints impose undue burden on
 users with linguistic heritage outside of western Europe. Creating a decent
 filter for recognizing legitimate names across the majority of languages is
 quite difficult.

 Though there's no reason this has to be a strong blacklisting of
 usernames. If there's a willing volunteer (as seems to have been implied),
 then something like this could serve as a filter requiring manual override.
 All usernames are available... but some take longer to activate. Of course,
 there's always the power-to-weight issue for this kind of solution.


Yeah, I volunteered.  I'd like to see some kind of random round-robin
system to dispatch edits for approval to a group of volunteers (i.e., so that
I only have to scan 10 or so edits a day for spam -- I don't feel inclined to
read for correctness).  It wouldn't be so bad if there were 10-20 volunteers.
I suppose a lot fewer could do it if it were just approving user requests (but
I also think that would be less effective at stopping spam).
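
One way to read "random round-robin" is: shuffle the volunteer list once, then deal pending edits out in turn. The Python sketch below does just that. The function name `assign_edits`, its parameters, and the volunteer names are hypothetical, and a real system would also need to persist the queues and handle absent volunteers.

```python
import random
from itertools import cycle


def assign_edits(edit_ids, volunteers, seed=None):
    """Deal edits out to volunteers in turn, starting from a shuffled order.

    Returns a dict mapping each volunteer to the list of edit ids to check.
    """
    order = list(volunteers)
    if not order:
        raise ValueError("need at least one volunteer")
    random.Random(seed).shuffle(order)       # randomise who goes first
    queues = {volunteer: [] for volunteer in order}
    for edit_id, volunteer in zip(edit_ids, cycle(order)):
        queues[volunteer].append(edit_id)    # round-robin: one edit each, in turn
    return queues


# 36 pending edits shared among 4 (made-up) volunteers: 9 each.
queues = assign_edits(range(36), ["ann", "bob", "cat", "dan"], seed=0)
print({volunteer: len(edits) for volunteer, edits in queues.items()})
```

Because the edits are dealt in turn, no volunteer ever gets more than one edit above any other, whatever the shuffle produces.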


Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-08-02 Thread damodar kulkarni
  We could even have a report spam button on each page, and if enough
 users click on it (for a given revision), the revision gets forwarded to a
 moderator.


I think this will be of real use, but it should be used along with a CAPTCHA,
because otherwise spammers may report anything and everything on the site as
spam.
With a CAPTCHA it will be really helpful, as it means the moderation task
is more or less crowd-sourced.

regards,
Damodar


Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-07-31 Thread Henk-Jan van Tuyl

On Tue, 31 Jul 2012 00:42:40 +0200, timothyho...@seznam.cz wrote:

On a side note, image-based CAPTCHAs can cause problems for blind
people.


Google's ReCAPTCHA can read out the text to type.

Regards,
Henk-Jan van Tuyl


--
http://Van.Tuyl.eu/
http://members.chello.nl/hjgtuyl/tourdemonad.html
Haskell programming
--


Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-07-31 Thread Henk-Jan van Tuyl
On Tue, 31 Jul 2012 00:59:28 +0200, Alexander Solla alex.so...@gmail.com  
wrote:



Does anybody have statistics about how often pages are edited/added?


In the last seven days, there were 251 new (user)pages created; there was  
no spam added to existing pages.


I also discovered spam added to pages at
http://hackage.haskell.org/trac/hackage/
A search for "rio bouygues"[0] gave 118 results and "virgin mobile" gave 124
results; there are probably more.


Regards,
Henk-Jan van Tuyl


[0]
http://hackage.haskell.org/trac/hackage/search?q=%22rio+bouygues%22&noquickjump=1&ticket=on&milestone=on&wiki=on



Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-07-30 Thread Henk-Jan van Tuyl
On Mon, 16 Jul 2012 00:03:49 +0200, Henk-Jan van Tuyl hjgt...@chello.nl  
wrote:




I am willing to do administrator tasks.


4. ReCAPTCHA enabled for 'edits adding new, unrecognized external
links' - which is all of the spam.


This is already enabled.


The HaskellWiki is still flooded with spam; we should take some measures to
reduce the stream drastically. Most spam seems to be created
(semi-)automatically; the pages do not contain links, and the usernames end
with two digits most of the time. Some cures I have thought up:


 - Verify new wiki accounts, before granting them rights,
   based on e-mails in the Haskell mailing lists
   (or subscription of a Haskell mailing list)

 - Let new users only change pages, not create new pages

 - Block creation of usernames
     o ending with two or more digits
     o with more than one x or q
     o starting with "buy"
     o longer than 20 characters
     o with more than 4 consonants in a row

 - Block creation of pages with words in a certain list
   (Coach, Vuitton, Chanel, handbags, purses, outlet, luggage,
   Nike Air Jordan)
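
As a rough illustration of the last item, the word-list check could look something like the Python sketch below. The term list is the one given above; the names `BLOCKED_TERMS` and `blocked_terms_in` are invented for the example, and whole-word, case-insensitive matching is an assumption about how such a list would be applied.

```python
import re

# The word list from the proposal above.
BLOCKED_TERMS = [
    "Coach", "Vuitton", "Chanel", "handbags", "purses",
    "outlet", "luggage", "Nike Air Jordan",
]

# One case-insensitive pattern that matches any term as a whole word or phrase.
_BLOCKED_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in BLOCKED_TERMS) + r")\b",
    re.IGNORECASE,
)


def blocked_terms_in(text):
    """Return the sorted, lower-cased blocked terms that occur in text."""
    return sorted({match.group(0).lower() for match in _BLOCKED_RE.finditer(text)})


print(blocked_terms_in("Cheap Louis Vuitton handbags outlet"))
print(blocked_terms_in("A stagecoach is not a functor"))
```

Whole-word matching keeps "stagecoach" from tripping the "Coach" entry, but a legitimate page that mentions an "outlet" or a "coach" would still be blocked, so the list trades moderator time for false positives.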


Regards,
Henk-Jan van Tuyl



Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-07-30 Thread timothyhobbs

Can we have at least 5 consonants?  There are enough people with names such
as Srbský in eastern Europe.  In fact, the Czechs can make use of as
many as 9 consonants in a row!
http://ld.johanesville.net/perlicky/03-jazykova-nej-a-jine-hricky
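
To see what the consonant rule does to a name like Srbský, here is a small Python sketch that measures the longest consonant run. It assumes accents should be stripped first and that "y" counts as a vowel; it only knows the Latin vowels, so any other letter (Cyrillic, for instance) is counted as a consonant. The function name `longest_consonant_run` is made up for the example.

```python
import unicodedata

VOWELS = set("aeiouy")  # assumption: y (and therefore ý) counts as a vowel


def longest_consonant_run(name):
    """Length of the longest run of consecutive consonant letters in name."""
    # Decompose accented letters (e.g. "ý" becomes "y" plus a combining accent)
    # and drop the combining marks so that only base letters remain.
    decomposed = unicodedata.normalize("NFD", name.lower())
    letters = [ch for ch in decomposed if not unicodedata.combining(ch)]

    longest = current = 0
    for ch in letters:
        if ch.isalpha() and ch not in VOWELS:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a vowel, digit, hyphen, etc. ends the run
    return longest


print(longest_consonant_run("Srbský"))     # 5
print(longest_consonant_run("strengths"))  # 5
```

Both examples reach 5, so a limit of "more than 4" would flag them, while allowing 5 would let them through.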




On a side note, image-based CAPTCHAs can cause problems for blind people.


Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-07-30 Thread Alexander Solla
On Mon, Jul 30, 2012 at 2:35 PM, Henk-Jan van Tuyl hjgt...@chello.nl wrote:


  - Verify new wiki accounts, before granting them rights,
based on e-mails in the Haskell mailing lists
(or subscription of a Haskell mailing list)


This is a nice idea, but I think it will end up moving spam onto the
mailing lists.  There is hardly any policy in place to keep people out of
the mailing lists.  Mailing list spam is attractive to spammers, since it
all gets mirrored to archive sites all over the place.

Not to volunteer others, but how feasible would it be to require
credentials from Haskellers.org?


  - Let new users only change pages, not create new pages


This is good for stopping the creation of walled gardens full of spam.  But
it won't stop vandalism spam, where somebody goes to a page that isn't
accessed much and changes it.

Does anybody have statistics about how often pages are edited/added?  If
the numbers aren't too big, I'd volunteer to moderate insofar as scanning
new edits/adds for spam.  Maybe this role should just forward articles with
spam on them to a real moderator to roll back.  We could even have a
"report spam" button on each page, and if enough users click on it (for a
given revision), the revision gets forwarded to a moderator.
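
The "report spam" button could be backed by something as small as the Python sketch below. The class name `SpamReports`, the default threshold of 3, and the user names are all assumptions for illustration; each user is counted once per revision, and a revision is forwarded only once.

```python
from collections import defaultdict


class SpamReports:
    """Count distinct reporters per revision and flag when a threshold is hit."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self._reporters = defaultdict(set)  # revision id -> users who reported it
        self._forwarded = set()             # revisions already sent to a moderator

    def report(self, revision_id, user):
        """Record a report; return True exactly once, when forwarding is due."""
        self._reporters[revision_id].add(user)  # repeat clicks by one user count once
        if revision_id in self._forwarded:
            return False
        if len(self._reporters[revision_id]) >= self.threshold:
            self._forwarded.add(revision_id)
            return True
        return False


reports = SpamReports(threshold=2)
print(reports.report(1234, "alice"))  # False: one reporter so far
print(reports.report(1234, "alice"))  # False: same user again
print(reports.report(1234, "bob"))    # True: threshold reached, forward it
```

Counting distinct users rather than clicks is the bare minimum needed to stop a single account from forwarding a page by itself.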


  - Block creation of usernames
 o ending with two or more digits
 o with more than one x or q
 o starting with buy
 o longer than 20 characters
 o with more than 4 consonants in a row


I don't see this providing any security against spam, and I'm thinking it
will take longer to implement than it will take for a spammer to fix his
scripts in response.


  - Block creation of pages with words in a certain list
(Coach, Vuitton, Chanel, handbags, purses, outlet, luggage, Nike Air
 Jordan)


Same.


Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-07-30 Thread Ricardo Wurmus
On 31 July 2012 05:35, Henk-Jan van Tuyl hjgt...@chello.nl wrote:

... with more than one x or q

This would exclude legitimate Chinese (pinyin) usernames for not much gain.


Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

2012-07-30 Thread Michael Orlitzky
On 07/30/2012 05:35 PM, Henk-Jan van Tuyl wrote:
 On Mon, 16 Jul 2012 00:03:49 +0200, Henk-Jan van Tuyl hjgt...@chello.nl  
 wrote:
 

 I am willing to do administrator tasks.

 4. ReCAPTCHA enabled for 'edits adding new, unrecognized external
 links' - which is all of the spam.

 This is already enabled.
 
 The HaskellWiki is still flooded with spam; we should take some measure to  
 reduce the stream severely. Most spam seems to be created  
 (semi-)automated; the pages do not contain links, the usernames end with  
 two digits, most of the time. Some cures I have thought up:
 

There are two (easy) things that will make a huge dent in the automated
stuff.

  1. Add a fake field, hidden through CSS, labeled something like "You
     must leave this field blank to submit the form" (for non-visual
     browsers). Put it on every page with a submit button. If it isn't
     empty, don't process the submission. You can give it a /name/ that
     sounds tempting, though.

  2. Force previews. If the bots are targeted at your wiki software and
     you modify it to preview all submissions, the bots will stop
     working.
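
The first suggestion (the hidden field) might look roughly like this on the server side. This is a framework-agnostic Python sketch, not wiki code: the form is assumed to arrive as a plain dict, and the field name `website`, the HTML snippet, and the function names are all made up for illustration.

```python
# Hypothetical field name: something a bot will want to fill in.
HONEYPOT_FIELD = "website"

# Markup to include in every form; hidden from sighted users through CSS,
# with a label that tells screen-reader users to leave it empty.
HONEYPOT_HTML = (
    '<div style="display:none">'
    '<label for="website">You must leave this field blank to submit the form</label>'
    '<input type="text" id="website" name="website" value="">'
    "</div>"
)


def is_probably_bot(form):
    """True if the hidden honeypot field came back non-empty."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())


def handle_submission(form, save):
    """Save the submission unless the honeypot was filled in."""
    if is_probably_bot(form):
        return False  # don't process it, and don't tell the bot why
    save(form)
    return True


saved = []
print(handle_submission({"text": "Fixed a typo", "website": ""}, saved.append))
print(handle_submission({"text": "Buy bags", "website": "http://spam.example"}, saved.append))
print(len(saved))
```

A bot that fills in every field it finds trips the check; a human never sees the field, and the label covers browsers that ignore the CSS.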

