Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
Has anyone run SpamAssassin on the offending content created by the spammers? I've been using it on hpaste and it's been very effective at cutting out the crap.

On 4 August 2012 19:15, Gwern Branwen gwe...@gmail.com wrote:
> On Fri, Aug 3, 2012 at 10:34 PM, damodar kulkarni kdamodar2...@gmail.com wrote:
>> So, another doubt: if detecting spam is trivial, then why not just send
>> the detected spam to trash directly, without any human inspection? This
>> may mean some trouble for the posters due to false positives, but the
>> moderator's job can be reduced to some extent.
>
> Which is pretty much what this whole thread is about: asking that the
> sysadmins Do Something about this trivial yet overwhelming spam.
>
> --
> gwern
> http://www.gwern.net

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
On Fri, Aug 3, 2012 at 10:34 PM, damodar kulkarni kdamodar2...@gmail.com wrote:
> So, another doubt: if detecting spam is trivial, then why not just send
> the detected spam to trash directly, without any human inspection? This
> may mean some trouble for the posters due to false positives, but the
> moderator's job can be reduced to some extent.

Which is pretty much what this whole thread is about: asking that the sysadmins Do Something about this trivial yet overwhelming spam.

--
gwern
http://www.gwern.net
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
On Mon, Jul 30, 2012 at 6:59 PM, Alexander Solla alex.so...@gmail.com wrote:
> We could even have a "report spam" button on each page, and if enough
> users click on it (for a given revision), the revision gets forwarded to
> a moderator.

This would be useless. The problem is not detecting spam, since that's quite trivial: it's very hard to miss. The problem is that the moderator (i.e., me) is already overworked. The spam needs to be reduced to begin with, not detected.

--
gwern
http://www.gwern.net
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
Hi Gwern,

First of all, thanks for your patience.

> I am willing to do administrator tasks.
> 4. ReCAPTCHA enabled for 'edits adding new, unrecognized external links' - which is all of the spam.
> This is already enabled.

I guess the problem may be due to ReCAPTCHA (http://www.google.com/recaptcha/learnmore), so you could switch to a custom-built CAPTCHA that is more difficult to crack. You may find some open source CAPTCHA systems better than ReCAPTCHA:
http://jcaptcha.sourceforge.net/

To thwart relay attacks on the CAPTCHA, you could try early timeouts and/or a longer CAPTCHA text. This may mean more trouble and nuisance for legitimate users, but I guess Haskellers will be willing to pay this small price for a better website experience. :)

Relay attacks: remember that there are human solvers employed in countries like India and China, so any human-solvable CAPTCHA will fail to work as desired.
http://en.wikipedia.org/wiki/CAPTCHA#Human_solvers

> The problem is not detecting spam, since that's quite trivial: it's very hard to miss.

Thanks for providing more info. So, another doubt: if detecting spam is trivial, then why not just send the detected spam to trash directly, without any human inspection? This may mean some trouble for the posters due to false positives, but the moderator's job can be reduced to some extent.

I hope this is useful. If not, please forgive me for causing more reading trouble for you.

Regards,
-Damodar

On Fri, Aug 3, 2012 at 7:29 PM, Gwern Branwen gwe...@gmail.com wrote:
> On Mon, Jul 30, 2012 at 6:59 PM, Alexander Solla alex.so...@gmail.com wrote:
>> We could even have a "report spam" button on each page, and if enough
>> users click on it (for a given revision), the revision gets forwarded
>> to a moderator.
>
> This would be useless. The problem is not detecting spam, since that's
> quite trivial: it's very hard to miss. The problem is that the moderator
> (i.e., me) is already overworked. The spam needs to be reduced to begin
> with, not detected.
>
> --
> gwern
> http://www.gwern.net
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
On 7/30/12 5:35 PM, Henk-Jan van Tuyl wrote:
> - Block creation of usernames
>   o ending with two or more digits
>   o with more than one x or q
>   o starting with buy
>   o longer than 20 characters
>   o with more than 4 consonants in a row

As others've mentioned, many of these constraints impose undue burden on users with linguistic heritage outside of western Europe. Creating a decent filter for recognizing legitimate names across the majority of languages is quite difficult.

Though there's no reason this has to be a strong blacklisting of usernames. If there's a willing volunteer (as seems to have been implied), then something like this could serve as a filter requiring manual override. All usernames are available... but some take longer to activate. Of course, there's always the power-to-weight issue for this kind of solution.

--
Live well,
~wren
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
On Thu, Aug 2, 2012 at 4:46 PM, wren ng thornton w...@freegeek.org wrote:
> On 7/30/12 5:35 PM, Henk-Jan van Tuyl wrote:
>> - Block creation of usernames
>>   o ending with two or more digits
>>   o with more than one x or q
>>   o starting with buy
>>   o longer than 20 characters
>>   o with more than 4 consonants in a row
>
> As others've mentioned, many of these constraints impose undue burden on
> users with linguistic heritage outside of western Europe. Creating a
> decent filter for recognizing legitimate names across the majority of
> languages is quite difficult.
>
> Though there's no reason this has to be a strong blacklisting of
> usernames. If there's a willing volunteer (as seems to have been
> implied), then something like this could serve as a filter requiring
> manual override. All usernames are available... but some take longer to
> activate. Of course, there's always the power-to-weight issue for this
> kind of solution.

Yeah, I volunteered. I'd like to see some kind of random round-robin system to dispatch edits awaiting approval to a group of volunteers (i.e., if I only had to scan 10 or so edits for spam a day -- I don't feel inclined to read for correctness). It wouldn't be so bad if there were 10-20 volunteers. I suppose far fewer could do it if it was just approving user requests (but I also think that would be less effective at stopping spam).
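The "random round-robin" dispatch suggested above could be sketched roughly as follows. This is only an illustration in Python, not anything the wiki software provides: the function name `assign_edits`, the volunteer names, and the `seed` parameter are all made up for the example.

```python
import random
from collections import defaultdict
from itertools import cycle


def assign_edits(edit_ids, volunteers, seed=None):
    """Deal pending edits to volunteers round-robin, in a shuffled order.

    Hypothetical helper: edit_ids and volunteers are plain lists; a real
    wiki would pull them from its own database.
    """
    if not volunteers:
        raise ValueError("at least one volunteer is required")
    rng = random.Random(seed)
    order = list(volunteers)
    rng.shuffle(order)  # random order, so the same person is not always first
    queues = defaultdict(list)
    # zip stops when the edits run out; cycle repeats the volunteer order
    for edit_id, volunteer in zip(edit_ids, cycle(order)):
        queues[volunteer].append(edit_id)
    return dict(queues)  # volunteers who received nothing are simply absent
```

With 36 pending edits and 12 volunteers, each volunteer would get exactly 3 edits to scan; the shuffle only changes who gets which ones.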
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
> We could even have a "report spam" button on each page, and if enough
> users click on it (for a given revision), the revision gets forwarded to
> a moderator.

I think this will be of real use, but it should be used along with a CAPTCHA; otherwise spammers may report spam for everything and anything on the site. With a CAPTCHA it will be really helpful, as it means the moderation task is more or less crowd-sourced.

regards,
Damodar

On Fri, Aug 3, 2012 at 7:36 AM, Alexander Solla alex.so...@gmail.com wrote:
> On Thu, Aug 2, 2012 at 4:46 PM, wren ng thornton w...@freegeek.org wrote:
>> On 7/30/12 5:35 PM, Henk-Jan van Tuyl wrote:
>>> - Block creation of usernames
>>>   o ending with two or more digits
>>>   o with more than one x or q
>>>   o starting with buy
>>>   o longer than 20 characters
>>>   o with more than 4 consonants in a row
>>
>> As others've mentioned, many of these constraints impose undue burden
>> on users with linguistic heritage outside of western Europe. Creating a
>> decent filter for recognizing legitimate names across the majority of
>> languages is quite difficult.
>>
>> Though there's no reason this has to be a strong blacklisting of
>> usernames. If there's a willing volunteer (as seems to have been
>> implied), then something like this could serve as a filter requiring
>> manual override. All usernames are available... but some take longer to
>> activate. Of course, there's always the power-to-weight issue for this
>> kind of solution.
>
> Yeah, I volunteered. I'd like to see some kind of random round-robin
> system to dispatch approval edits to a group of volunteers (i.e., if I
> only had to scan 10 or so edits for spam a day -- I don't feel inclined
> to read for correctness). It wouldn't be so bad if there was 10-20
> volunteers. I suppose a lot less could do it if it was just approving
> user requests (but, I also think that would be less effective at
> stopping spam)
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
On Tue, 31 Jul 2012 00:42:40 +0200, timothyho...@seznam.cz wrote:
> On a side note, image-based CAPTCHAs can cause problems for blind people.

Google's ReCAPTCHA can pronounce the text to type.

Regards,
Henk-Jan van Tuyl

--
http://Van.Tuyl.eu/
http://members.chello.nl/hjgtuyl/tourdemonad.html
Haskell programming
--
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
On Tue, 31 Jul 2012 00:59:28 +0200, Alexander Solla alex.so...@gmail.com wrote:
> Does anybody have statistics about how often pages are edited/added?

In the last seven days, there were 251 new (user)pages created; there was no spam added to existing pages.

I also discovered spam added to pages at
http://hackage.haskell.org/trac/hackage/
A search for "rio bouygues"[0] gave 118 results, "virgin mobile" gave 124 results; there are probably more.

Regards,
Henk-Jan van Tuyl

[0] http://hackage.haskell.org/trac/hackage/search?q=%22rio+bouygues%22&noquickjump=1&ticket=on&milestone=on&wiki=on

--
http://Van.Tuyl.eu/
http://members.chello.nl/hjgtuyl/tourdemonad.html
Haskell programming
--
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
On Mon, 16 Jul 2012 00:03:49 +0200, Henk-Jan van Tuyl hjgt...@chello.nl wrote:
> I am willing to do administrator tasks.
> 4. ReCAPTCHA enabled for 'edits adding new, unrecognized external links' - which is all of the spam.
> This is already enabled.

The HaskellWiki is still flooded with spam; we should take some measure to reduce the stream severely. Most spam seems to be created (semi-)automatically; the pages do not contain links, and the usernames end with two digits most of the time.

Some cures I have thought up:
- Verify new wiki accounts, before granting them rights, based on e-mails in the Haskell mailing lists (or subscription to a Haskell mailing list)
- Let new users only change pages, not create new pages
- Block creation of usernames
  o ending with two or more digits
  o with more than one x or q
  o starting with buy
  o longer than 20 characters
  o with more than 4 consonants in a row
- Block creation of pages with words in a certain list (Coach, Vuitton, Chanel, handbags, purses, outlet, luggage, Nike Air Jordan)

Regards,
Henk-Jan van Tuyl

--
http://Van.Tuyl.eu/
http://members.chello.nl/hjgtuyl/tourdemonad.html
Haskell programming
--
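For illustration, the username and word-list rules proposed above might look something like this in Python. This is a rough sketch under several assumptions, none of which come from the wiki software: the function names are invented, "more than one x or q" is read as a combined count, only ASCII letters are considered, and y is counted as a consonant. The functions return the triggered rules rather than a yes/no verdict, so the result could feed either outright blocking or a manual check.

```python
import re

# Word list taken from the proposal; matching is case-insensitive.
SPAM_WORDS = ("coach", "vuitton", "chanel", "handbags", "purses",
              "outlet", "luggage", "nike air jordan")


def username_flags(name):
    """Return the list of proposed rules that a username triggers."""
    flags = []
    lowered = name.lower()
    if re.search(r"\d{2,}$", name):
        flags.append("ends with two or more digits")
    # Assumption: x and q are counted together.
    if sum(lowered.count(c) for c in "xq") > 1:
        flags.append("more than one x or q")
    if lowered.startswith("buy"):
        flags.append("starts with 'buy'")
    if len(name) > 20:
        flags.append("longer than 20 characters")
    # "More than 4 in a row" means a run of 5 or more ASCII consonants.
    if re.search(r"[b-df-hj-np-tv-z]{5,}", lowered):
        flags.append("more than 4 consonants in a row")
    return flags


def page_flags(text):
    """Return the listed spam words that occur as whole words in the text."""
    lowered = text.lower()
    return [w for w in SPAM_WORDS
            if re.search(r"\b" + re.escape(w) + r"\b", lowered)]
```

For example, `username_flags("buyhandbags77")` triggers the trailing-digits and "buy" rules, while an ordinary short name such as `"alice"` triggers none.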
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
Can we have at least 5 consonants? There are enough people with names such as Srbský in eastern Europe. In fact, the Czechs can make use of as many as 9 consonants in a row!
http://ld.johanesville.net/perlicky/03-jazykova-nej-a-jine-hricky

On a side note, image-based CAPTCHAs can cause problems for blind people.

-- Original message --
From: Henk-Jan van Tuyl hjgt...@chello.nl
Date: 30 July 2012
Subject: Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki

> On Mon, 16 Jul 2012 00:03:49 +0200, Henk-Jan van Tuyl hjgt...@chello.nl wrote:
>> I am willing to do administrator tasks.
>> 4. ReCAPTCHA enabled for 'edits adding new, unrecognized external links' - which is all of the spam.
>> This is already enabled.
>
> The HaskellWiki is still flooded with spam; we should take some measure
> to reduce the stream severely. Most spam seems to be created
> (semi-)automated; the pages do not contain links, the usernames end with
> two digits, most of the time.
>
> Some cures I have thought up:
> - Verify new wiki accounts, before granting them rights, based on e-mails in the Haskell mailing lists (or subscription of a Haskell mailing list)
> - Let new users only change pages, not create new pages
> - Block creation of usernames
>   o ending with two or more digits
>   o with more than one x or q
>   o starting with buy
>   o longer than 20 characters
>   o with more than 4 consonants in a row
> - Block creation of pages with words in a certain list (Coach, Vuitton, Chanel, handbags, purses, outlet, luggage, Nike Air Jordan)
>
> Regards,
> Henk-Jan van Tuyl
>
> --
> http://Van.Tuyl.eu/
> http://members.chello.nl/hjgtuyl/tourdemonad.html
> Haskell programming
> --
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
On Mon, Jul 30, 2012 at 2:35 PM, Henk-Jan van Tuyl hjgt...@chello.nl wrote:
> - Verify new wiki accounts, before granting them rights, based on
>   e-mails in the Haskell mailing lists (or subscription of a Haskell
>   mailing list)

This is a nice idea, but I think it will end up moving spam onto the mailing lists. There is hardly any policy in place to keep people out of the mailing lists. Mailing list spam is attractive to spammers, since it all gets mirrored to archive sites all over the place.

Not to volunteer others, but how feasible would it be to require credentials from Haskellers.org?

> - Let new users only change pages, not create new pages

This is good for stopping the creation of walled gardens full of spam. But it won't stop vandalism spam, where somebody goes to a page that isn't accessed much and changes it.

Does anybody have statistics about how often pages are edited/added? If the numbers aren't too big, I'd volunteer to moderate insofar as scanning new edits/adds for spam. Maybe this role should just forward articles with spam on them to a real moderator to roll back. We could even have a "report spam" button on each page, and if enough users click on it (for a given revision), the revision gets forwarded to a moderator.

> - Block creation of usernames
>   o ending with two or more digits
>   o with more than one x or q
>   o starting with buy
>   o longer than 20 characters
>   o with more than 4 consonants in a row

I don't see this providing any security against spam, and I'm thinking it will take longer to implement than it will take for a spammer to fix his scripts in response.

> - Block creation of pages with words in a certain list (Coach, Vuitton,
>   Chanel, handbags, purses, outlet, luggage, Nike Air Jordan)

Same.
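A rough Python sketch of the "report spam" idea above, assuming reports are counted per revision and each user counts once. The class name `SpamReports`, the default threshold of 3, and the in-memory storage are all invented for the example; a real wiki would persist this and decide the threshold itself.

```python
from collections import defaultdict


class SpamReports:
    """Collect spam reports per revision and signal when to forward one."""

    def __init__(self, threshold=3):
        self.threshold = threshold            # hypothetical default
        self._reporters = defaultdict(set)    # revision id -> reporting users
        self._forwarded = set()               # revisions already forwarded

    def report(self, revision_id, user):
        """Record a report; return True exactly once per revision, at the
        moment enough distinct users have reported it."""
        self._reporters[revision_id].add(user)  # repeat clicks count once
        if revision_id in self._forwarded:
            return False
        if len(self._reporters[revision_id]) >= self.threshold:
            self._forwarded.add(revision_id)
            return True
        return False
```

Counting distinct users rather than clicks is a design choice made here so that a single account cannot push a revision to the moderator by clicking repeatedly.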
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
On 31 July 2012 05:35, Henk-Jan van Tuyl hjgt...@chello.nl wrote:
> ... with more than one x or q

This would exclude legitimate Chinese (pinyin) usernames for not much gain.
Re: [Haskell-cafe] [Haskell] Spam on the Haskell wiki
On 07/30/2012 05:35 PM, Henk-Jan van Tuyl wrote:
> On Mon, 16 Jul 2012 00:03:49 +0200, Henk-Jan van Tuyl hjgt...@chello.nl wrote:
>> I am willing to do administrator tasks.
>> 4. ReCAPTCHA enabled for 'edits adding new, unrecognized external links' - which is all of the spam.
>> This is already enabled.
>
> The HaskellWiki is still flooded with spam; we should take some measure
> to reduce the stream severely. Most spam seems to be created
> (semi-)automated; the pages do not contain links, the usernames end with
> two digits, most of the time.
>
> Some cures I have thought up:

There are two (easy) things that will make a huge dent in the automated stuff.

1. Add a fake field, hidden through CSS, labeled something like "You must leave this field blank to submit the form" (for non-visual browsers). Put it on every page with a submit button. If it isn't empty, don't process the submission. You can give it a /name/ that sounds tempting, though.

2. Force previews. If the bots are targeted at your wiki software and you modify it to preview all submissions, the bots will stop working.
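Both measures could be sketched on the server side roughly like this. It is a minimal Python illustration, not how any particular wiki engine does it: the field names `website`, `text` and `preview_token`, the secret, and the use of an HMAC to tie a save to a previously shown preview are all assumptions made for the example.

```python
import hashlib
import hmac

HONEYPOT_FIELD = "website"   # hypothetical "tempting" name for the hidden field
SECRET = b"change-me"        # placeholder; a real site would keep this private


def preview_token(text):
    """Token shown on the preview page; it is bound to the exact text."""
    return hmac.new(SECRET, text.encode("utf-8"), hashlib.sha256).hexdigest()


def handle_submission(form):
    """Return 'rejected', 'preview' or 'saved' for a submitted form dict."""
    # 1. Honeypot: humans never see the field, so any content means a bot.
    if form.get(HONEYPOT_FIELD, "").strip():
        return "rejected"
    # 2. Forced preview: save only if the form carries the token that the
    #    preview page issued for this exact text.
    text = form.get("text", "")
    token = form.get("preview_token", "")
    expected = preview_token(text)
    if not hmac.compare_digest(token.encode("utf-8"), expected.encode("ascii")):
        return "preview"
    return "saved"
```

A first submission without a token comes back as a preview; resubmitting the same text with the token from that preview is saved, while a filled-in honeypot field is dropped regardless of anything else.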