Re: [libreoffice-website] Ask site: Canning our SPAM problem

2013-01-31 Thread Florian Effenberger

Hello,

Thanks a lot for the insight on this and the great summary! I won't jump 
into the current discussion, since others here know Askbot much better 
than I do. I've already told Alex on the phone that for the wiki, 
QuestyCaptcha did the trick (multiple choice questions), while every other 
captcha failed to work, for whatever reason. ;-)


Robinson Tryon wrote on 2013-01-31 01:59:

> (1) As a hosted, commercial solution, Akismet is running some
> proprietary software stack. So no Free/Libre/Open Source Software love
> here.
>
> (2) Akismet is free for (some) personal use, but costs something more
> in all other cases[3]. That might be $50/month, $100/month, or they
> might just be nice and give us some kind of discount. It's pretty
> nebulous here.


Akismet is a legal problem. I did not investigate deeper, but from what 
I remember, it means transferring data from the EU into the US. Doing so 
requires a very carefully crafted privacy policy, further research, 
and precautions, and might even be illegal for us to do, making us 
subject to a cease and desist letter.


Don't shoot the messenger - I'm not saying this is good, but that's what 
I found out...


Here's a short summary (in German) on this: 
http://blog.wpde.org/2011/04/20/akismet-und-datenschutz-einwilligung-per-opt-in-notwendig.html


Florian

--
Unsubscribe instructions: E-mail to website+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/website/
All messages sent to this list will be publicly archived and cannot be deleted



Re: [libreoffice-website] Ask site: Canning our SPAM problem

2013-01-31 Thread Robinson Tryon
On Thu, Jan 31, 2013 at 2:54 AM, Jonathan Aquilina
eagles051...@gmail.com wrote:
> No need to re-roll anything. Anyone with a Google account can sign up
> for it, and if there is a plugin, all you need to do is provide the
> reCAPTCHA keys.

Provide the keys to which plugin?


> It's very simple to do and integrate. I am running 12 WordPress sites and
> they are all using it.

Right, but that integration is so simple because there's a plugin for
WordPress to use reCAPTCHA. I haven't found anything like
that for Askbot yet. That's why I suggested that we could consider
rolling something ourselves.

This project looks promising: https://github.com/praekelt/django-recaptcha
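
Whichever plugin ends up being used, the server-side half of a reCAPTCHA check is small. The following is only a rough sketch, not Askbot or django-recaptcha code: it assumes the `siteverify` endpoint and the `secret`/`response`/`remoteip` parameters that Google documents for reCAPTCHA today, and the function names are made up for illustration. It builds the request and interprets the reply without touching the network.

```python
import json
from urllib.parse import urlencode

# Assumption: endpoint as documented by Google for reCAPTCHA today;
# check the current docs before relying on it.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret_key, captcha_response, remote_ip=None):
    """Return (url, body_bytes) for the POST that checks a solved captcha."""
    params = {"secret": secret_key, "response": captcha_response}
    if remote_ip:
        params["remoteip"] = remote_ip
    return VERIFY_URL, urlencode(params).encode("ascii")


def captcha_passed(response_body):
    """Interpret the JSON body returned by the verification endpoint."""
    try:
        payload = json.loads(response_body)
    except ValueError:
        return False  # treat anything unparsable as a failed check
    return isinstance(payload, dict) and payload.get("success") is True
```

The actual POST could then be sent with `urllib.request.urlopen(url, data=body)`, and the signup view would refuse the registration whenever `captcha_passed` returns False.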

--R




Re: [libreoffice-website] Ask site: Canning our SPAM problem

2013-01-31 Thread Jonathan Aquilina
I understand where you're coming from, but django-recaptcha introduces a whole
new level of webserver complexity, as Django is a Python-based web
framework and would involve a steep learning curve to implement. You
would also need to run Apache's WSGI module.


-- 
Jonathan Aquilina




Re: [libreoffice-website] Ask site: Canning our SPAM problem

2013-01-31 Thread Robinson Tryon
On Thu, Jan 31, 2013 at 6:45 AM, Jonathan Aquilina
eagles051...@gmail.com wrote:
> I understand where you're coming from, but django-recaptcha introduces a whole
> new level of webserver complexity, as Django is a Python-based web framework,

I believe that Askbot is written in Python, on top of Django.

https://en.wikipedia.org/wiki/Askbot

That's why I suggested using a RECAPTCHA tool written in Python,
specifically designed for ease of use with a Django application.

> and would involve a steep learning curve to implement. You would also need
> to run Apache's WSGI module.

I believe that's already a dependency for Askbot. From the deployment docs:

http://askbot.org/doc/deployment.html

Installation under Apache/mod_wsgi

Apache/mod_wsgi combination is the only type of deployment described
in this document at the moment. mod_wsgi is currently the most
resource efficient apache handler for the Python web applications.


Cheers,
-- R




Re: [libreoffice-website] Ask site: Canning our SPAM problem

2013-01-31 Thread Alexander Werner
Hi Jonathan, Robinson,

On 31.01.2013 at 20:13, Jonathan Aquilina wrote:
> I had not realized that Ask was Python-based. I am honestly surprised there
> isn't something out there for Django or even Ask in terms of reCAPTCHA
> already available. If I had the time, which I don't right now at all, I would
> gladly code up something.


Askbot does have basic reCAPTCHA support, but it is only used for local 
signup. As local signup is disabled at the moment (the version we're using doesn't 
support double opt-in, which is required by German law), we have to rely 
on OpenID providers. Most of the spammers use Yahoo and Google accounts, so they 
have already been able to bypass those providers' captchas.

I'm currently working on merging in the changes up to Askbot 0.7.48 from two 
days ago, which needs some work: changing images, adapting the theme again, 
etc. As soon as this is done, I'll implement the function Robinson suggested: 
defining a karma threshold. Users with, for example, less than 10 karma are 
moderated; users with more are not. I think this is the most native way of 
handling this problem.
Otherwise: If someone finds a compatible alternative to Akismet that is based 
in Europe or available to deploy on our own infrastructure, that would also be 
great.
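
To make the karma threshold concrete, here is a minimal sketch of the routing decision. It is not Askbot code: the `Post` class and `route_post` function are hypothetical, the setting name is borrowed from Robinson's proposal, and 10 is just the example value mentioned above. Whether a user with exactly 10 karma is moderated is an assumption made here (they are not).

```python
from dataclasses import dataclass

# Hypothetical setting; 10 is only the example value from the message.
MIN_KARMA_TO_SKIP_MODERATION = 10


@dataclass
class Post:
    author: str
    author_karma: int
    text: str


def route_post(post, threshold=MIN_KARMA_TO_SKIP_MODERATION):
    """Return "moderate" for low-karma authors, "publish" otherwise.

    Assumption: karma equal to the threshold is enough to skip moderation.
    """
    return "moderate" if post.author_karma < threshold else "publish"
```

For example, `route_post(Post("newbie", 1, "hi"))` gives "moderate", while an author with 250 karma is published directly.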

Cu,
Alex

FYI: We are using nginx+uwsgi to deploy Askbot on our infra. This has proven 
much less complicated than mod_wsgi+apache when it comes to using multiple 
virtualenvs in the same vhost, and it scales well (the recently occurring 
problems were caused by multithreading issues and had nothing to do with our 
deployment setup).


--
Alexander Werner a...@documentfoundation.org
Admin Team of The Document Foundation
The Document Foundation, Zimmerstr. 69, 10117 Berlin, Germany
Rechtsfähige Stiftung des bürgerlichen Rechts
Legal details: http://www.documentfoundation.org/imprint








[libreoffice-website] Ask site: Canning our SPAM problem

2013-01-30 Thread Robinson Tryon
Hi everyone,

We've been dealing with a bunch of SPAM on the Ask site over the last
couple of weeks. We recently enabled moderation on user content in an
attempt to stem the vandalism of the front page, and I've spent the
last couple of days investigating some possibilities for the site. As
you might expect, I haven't found any silver bullets, but I do have a
few suggestions that can help us manage the problem.

Based on my research on the Askbot.org site, a number of SEO-driven
spammers have taken to posting on Askbot sites. It's not clear what
percentage of these postings are run by scripts, but it's my
understanding that some of the spammers are using some form of
automation. On the AskLibo site, we're seeing not only SPAM in
questions, but we're seeing SPAM in _follow-up_ answers as well. These
guys are organized, and probably aren't just going to go away.

The Askbot software includes a few different mechanisms for dealing
with spam and malicious users:

- Content Deletion
- User Bans/Blocks
- Content Moderation
- Registration Limitations
- Akismet SPAM protection/detection

Content Deletion and User Bans/Blocks are pretty straightforward.
Mods, Admins, and some users with high karma may delete content from
the site. Users with less karma may flag content. All users spamming
the site are banned (it's been pretty cut-n-dry up to this point).

Content Moderation entails all user content going into a queue that is
reviewed by the mods and then accepted for the site or deleted. This
is what we have enabled right now.
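
As a rough illustration of that workflow (not how Askbot implements it internally), a moderation queue can be modelled as a first-in, first-out list of pending posts that a moderator either approves or deletes. The class and method names below are invented for this sketch.

```python
from collections import deque


class ModerationQueue:
    """Holds submitted posts until a moderator approves or deletes them."""

    def __init__(self):
        self._pending = deque()   # posts waiting for review, oldest first
        self.published = []       # posts a moderator has approved

    def submit(self, post):
        """Every new post goes to the queue instead of straight to the site."""
        self._pending.append(post)

    def pending_count(self):
        return len(self._pending)

    def review_next(self, approve):
        """Pop the oldest pending post; publish it only if approve is True."""
        post = self._pending.popleft()
        if approve:
            self.published.append(post)
        return post
```

Nothing reaches `published` without a moderator decision, which is why spam disappears from the front page and also why the queue creates extra work for the mods.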

Since we've enabled Content Moderation, the spam on the site has
basically disappeared [1]. We've approved a lot of content by
legitimate users, and we've banned and deleted a number of
spammers/spam content.

From the moderator side, Manfred (@manj_k) and I both agree that
enabling moderation has given us more work, since we want to make sure
that content does not sit for long in the moderation queue. That being
said, before moderation was enabled, SPAM would sometimes sit on the
site for hours at a time, and I've heard multiple LO
users/contributors refer to the site as a "SPAM pit" or "totally
messed up". We've also run into some hiccups with the moderation
interface which prevented us from actually being able to approve any
questions.

From the user side, we've only gotten feedback from a few Ask users.
Reception was mixed, with an overall feeling that moderation is an
acceptable short-term solution, but needs a better long-term fix. One
of the users suggested... (a perfect segue)

...Registration Limitations

Other sites have used registration limitations to reduce/eliminate
spam on their Askbot site. One mechanism suggested to us was to
require our users to have an email address from a paid provider.
That is, no gmail, yahoo, hotmail, etc... accounts would be allowed
(at least not during registration).

The suggestion is clever and powerful, and could probably serve as an
excellent filter on spam, but my concern with such a proposal is that
many legitimate users rely on a free provider for their email, and
such a hard rule might discourage many of our meekest users from
participating on the site and getting help from us.
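
For reference, such a registration filter would boil down to a domain check along these lines. This is a sketch only: the function name and the three-entry domain list are placeholders, and a real deployment would need a much longer, maintained list.

```python
# Illustrative list only; the providers named in the text, nothing more.
FREE_MAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}


def is_free_mail(address, blocked=FREE_MAIL_DOMAINS):
    """True if the address's domain is on the free-provider list."""
    _, sep, domain = address.strip().rpartition("@")
    if not sep or not domain:
        raise ValueError("not an email address: %r" % address)
    return domain.lower() in blocked
```

A signup form would reject the registration when `is_free_mail` returns True, which is exactly the point where legitimate users on free providers would be turned away.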

Next up in our quiver of features is Akismet SPAM
protection/detection. I don't know too much about this product, but
it's basically an external SPAM-filtering service. People rave about
it on this WordPress page[2], and I've seen it in action on a
WordPress-hosted blog. To quote Ron Popeil, you "set it and forget
it." There are two possible issues with Akismet: (1) it ain't Free,
and (2) it ain't free.

(1) As a hosted, commercial solution, Akismet is running some
proprietary software stack. So no Free/Libre/Open Source Software love
here.

(2) Akismet is free for (some) personal use, but costs something more
in all other cases[3]. That might be $50/month, $100/month, or they
might just be nice and give us some kind of discount. It's pretty
nebulous here.

I'm not sure whether (1) or (2) would preclude Akismet being used in
the LO infrastructure, but I figured I'd include the software/service
in the list and let someone else tell me the rules :-)
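
To give an idea of what using the service involves, here is a hedged sketch of a spam check request. The endpoint shape and field names are as I recall them from Akismet's public API documentation (a `comment-check` call answered with the text "true" or "false") and should be verified against the current docs; the function names are invented. The code only builds the request, so it also shows which user data would leave our servers.

```python
from urllib.parse import urlencode


def build_akismet_check(api_key, site_url, user_ip, user_agent, author, content):
    """Return (url, body) for a comment-check request; nothing is sent here."""
    # Assumption: per-key hostname form from Akismet's REST documentation.
    url = "https://%s.rest.akismet.com/1.1/comment-check" % api_key
    fields = {
        "blog": site_url,          # the site the post was made on
        "user_ip": user_ip,        # poster's IP address
        "user_agent": user_agent,  # poster's browser string
        "comment_author": author,
        "comment_content": content,
    }
    return url, urlencode(fields).encode("ascii")


def is_spam(response_body):
    """The service is assumed to answer with the literal text "true" for spam."""
    return response_body.strip().lower() == "true"
```

Note that the poster's IP address, browser string, name, and post text are all part of the request body.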

---

That mostly sums up what's present in the Askbot software *right now*.
There are some other options that have been proposed for future
development in Askbot including IP-based tools (Blocking by IP, User
identification by IP, etc..), karma-based moderation, and group-based
moderation. Some of these ideas are discussed on the Askbot site for
Askbot[4].
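
As an example of what "Blocking by IP" could look like, here is a small sketch using Python's standard `ipaddress` module. It is not an existing Askbot feature; the function name is invented and the blocked ranges are documentation-only example addresses.

```python
import ipaddress

# Example ranges taken from the address blocks reserved for documentation.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),    # a whole range
    ipaddress.ip_network("198.51.100.42/32"),  # a single address
]


def is_blocked(ip_string, networks=BLOCKED_NETWORKS):
    """True if the address falls inside any blocked network."""
    address = ipaddress.ip_address(ip_string)
    return any(address in network for network in networks)
```

A request handler would call `is_blocked` with the client address before accepting a post or a registration.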

I've asked a couple of questions on the Askbot.org site[5] and have
received very prompt replies from the creator of the software, Evgeny
Fadeev. One proposal I made was to allow mods to tweak the application
of Content Moderation so that it only applies to a certain subgroup of
users, for example:

 * All users with karma < MIN_KARMA_TO_SKIP_MODERATION
 * All users who are not in a group flagged to SKIP_MODERATION, or
 * All users whose account is younger than 
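
The three example criteria in this list can each be written as a small predicate. This is a sketch under assumptions, not Askbot code: the `User` class, the function names, `SKIP_MODERATION_GROUPS`, and all three values are placeholders, and the account age limit in particular is invented because it is cut off in the archived message.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Placeholder values: the proposal gives no numbers, and the age limit
# is missing from the archived text.
MIN_KARMA_TO_SKIP_MODERATION = 10
SKIP_MODERATION_GROUPS = frozenset({"trusted"})
MIN_ACCOUNT_AGE = timedelta(days=7)


@dataclass
class User:
    karma: int
    joined: datetime
    groups: frozenset = frozenset()


def low_karma(user):
    """Criterion 1: karma below the threshold."""
    return user.karma < MIN_KARMA_TO_SKIP_MODERATION


def outside_skip_groups(user):
    """Criterion 2: not a member of any group flagged to skip moderation."""
    return not (user.groups & SKIP_MODERATION_GROUPS)


def new_account(user, now):
    """Criterion 3: account younger than the (hypothetical) minimum age."""
    return now - user.joined < MIN_ACCOUNT_AGE
```

Moderators would pick whichever predicate fits and send only the matching users' posts to the moderation queue.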

Re: [libreoffice-website] Ask site: Canning our SPAM problem

2013-01-30 Thread Jonathan Aquilina
I think reCAPTCHA should be enabled on all sites. It would help to stem
spam on the site as well as block spam users at registration. It's owned by
Google and free to use.



Re: [libreoffice-website] Ask site: Canning our SPAM problem

2013-01-30 Thread Robinson Tryon
On Thu, Jan 31, 2013 at 1:02 AM, Jonathan Aquilina
eagles051...@gmail.com wrote:
> I think reCAPTCHA should be enabled on all sites. It would help to stem
> spam on the site as well as block spam users at registration. It's owned by
> Google and free to use.

Sounds like a good plan to me. One of the questions I posed at the
Askbot site was about CAPTCHA integration:
http://askbot.org/en/question/9874/how-can-i-integrate-captcha-into-account-creationregistration/

If we were to roll our own (RE)CAPTCHA solution for our Ask site, I'm
pretty sure that upstream would be receptive to our patches.

--R




Re: [libreoffice-website] Ask site: Canning our SPAM problem

2013-01-30 Thread Jonathan Aquilina
No need to re-roll anything. Anyone with a Google account can sign up
for it, and if there is a plugin, all you need to do is provide the
reCAPTCHA keys.

It's very simple to do and integrate. I am running 12 WordPress sites and
they are all using it.


-- 
Jonathan Aquilina
