Re: [OSM-dev] Chinese spam diaries, an analysis

2014-12-04 Thread Andreas Labres
On 03.12.14 17:14, Andy Allan wrote:
 Thanks for the analysis, I hope it provides developers with ideas for
 combatting it via the automated spam filters that we already have[1].

I'd suggest extending/refining the automated filter somewhat. Say:

* a novice is not allowed to post at all
* a novice who did some changesets is allowed to post, say, once per day
* an intermediate is allowed to post, say, once per hour
* for an expert (either subscribed for years or lots of changesets) the
posting limit is waived

One could even consider allowing experts to delete other users' posts (because
of spam). Of course, a log would have to be maintained. That way no special
people (moderators etc.) are needed!

And of course the parameters need to be optimized:

* how long is a user a novice?
* are 10 changesets enough to allow him/her to post?
* when does the intermediate level start? 2 years? 100 changesets?
* what are the achievements to reach the expert level? 4 years? 1000 
changesets?

Those parameters could be tweaked on the fly, I'd say.
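
As a rough illustration, the tiered limits above could be sketched like this.
It is only a sketch under assumptions: the thresholds are the question-mark
values from the list (10/100/1000 changesets, 2/4 years), the names
(User, user_level, may_post, "active_novice") are made up for the example, and
"age OR changesets" is assumed for the intermediate level just as it is stated
for the expert level. None of this is taken from the existing spam filter.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical thresholds; the proposal leaves the real values open.
NOVICE_MIN_CHANGESETS = 10
INTERMEDIATE_MIN_CHANGESETS = 100
INTERMEDIATE_MIN_AGE = timedelta(days=2 * 365)
EXPERT_MIN_CHANGESETS = 1000
EXPERT_MIN_AGE = timedelta(days=4 * 365)


@dataclass
class User:
    created_at: datetime
    changesets: int
    last_post_at: Optional[datetime] = None


def user_level(user: User, now: datetime) -> str:
    """Map account age and changeset count to a documented level."""
    age = now - user.created_at
    if age >= EXPERT_MIN_AGE or user.changesets >= EXPERT_MIN_CHANGESETS:
        return "expert"
    if age >= INTERMEDIATE_MIN_AGE or user.changesets >= INTERMEDIATE_MIN_CHANGESETS:
        return "intermediate"
    if user.changesets >= NOVICE_MIN_CHANGESETS:
        return "active_novice"
    return "novice"


# Minimum time between posts. None means "may not post", zero means "no limit".
POST_INTERVAL = {
    "novice": None,
    "active_novice": timedelta(days=1),
    "intermediate": timedelta(hours=1),
    "expert": timedelta(0),
}


def may_post(user: User, now: datetime) -> bool:
    """Return True if the user's level allows a new post right now."""
    interval = POST_INTERVAL[user_level(user, now)]
    if interval is None:
        return False
    if user.last_post_at is None:
        return True
    return now - user.last_post_at >= interval
```

Keeping the thresholds and intervals in plain constants is what would make them
easy to tweak on the fly.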

/al

___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Chinese spam diaries, an analysis

2014-12-04 Thread Tom Hughes

On 04/12/14 11:17, Andreas Labres wrote:

On 03.12.14 17:14, Andy Allan wrote:

Thanks for the analysis, I hope it provides developers with ideas for
combatting it via the automated spam filters that we already have[1].


I'd suggest extending/refining the automated filter somewhat. Say:

* a novice is not allowed to post at all
* a novice who did some changesets is allowed to post, say, once per day
* an intermediate is allowed to post, say, once per hour
* for an expert (either subscribed for years or lots of changesets) the
posting limit is waived


So in other words, most of the things we already factor into our spam 
scoring... We're just not quite as rigid.


In particular you can still post (within reason) without having made any 
edits - it is actually surprisingly common for non-spammers to do that.


Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/



Re: [OSM-dev] Chinese spam diaries, an analysis

2014-12-04 Thread Tom Hughes

On 03/12/14 16:14, Andy Allan wrote:


However, spam is an arms race, and I think we might need a different
long-term approach. I know in the past using 3rd-party spam filtering
services was too expensive (and not really very OSM-ish either).


The main such system is Akismet, and I'd love to use it, but (a) it costs 
money and (b) to make it most effective we would have to send it things 
like email addresses and IP addresses, which I figure people may object to.



Perhaps we need a new set of human content moderators on the site, say
40-80 people with a variety of languages between them. We can consider
grey-listing all accounts - i.e. the first few posts of every account
are held for review automatically by default, and direct posting is
enabled once we're more certain the account isn't a spammer.


Once we have a review queue and moderator system then obviously it 
becomes trivial to do things like holding posts from new users for 
moderation - we need the basic infrastructure first though.
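
To make the idea concrete, here is a minimal sketch of such a hold-for-review
queue. It is not the website's code: Account, Post, ReviewQueue and the
TRUSTED_AFTER_APPROVED threshold are all hypothetical names and values chosen
for the example, and a real version would also need moderator permissions and
a log of decisions.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical: approved posts required before an account may post directly.
TRUSTED_AFTER_APPROVED = 3


@dataclass
class Account:
    name: str
    approved_posts: int = 0


@dataclass
class Post:
    author: Account
    body: str
    visible: bool = False


@dataclass
class ReviewQueue:
    pending: deque = field(default_factory=deque)

    def submit(self, post: Post) -> bool:
        """Publish directly for trusted accounts, otherwise hold for review."""
        if post.author.approved_posts >= TRUSTED_AFTER_APPROVED:
            post.visible = True
        else:
            self.pending.append(post)
        return post.visible

    def review_next(self, approve: bool) -> Post:
        """A moderator decides on the oldest held post."""
        post = self.pending.popleft()
        if approve:
            post.visible = True
            post.author.approved_posts += 1
        return post
```

With this shape, holding posts from new users is just the default branch of
submit(); the hard part remains finding people to work through the queue.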


Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/



Re: [OSM-dev] Chinese spam diaries, an analysis

2014-12-04 Thread Andreas Labres
On 04.12.14 12:33, Tom Hughes wrote:
 So in other words, most of the things we already factor into our spam scoring...
 We're just not quite as rigid.

A (hidden) spam score is bad (IMO). Nobody sees it, almost nobody can test it.

A documented user level with documented rules would make much more sense and
(IMO) would much more likely be accepted.

 In particular you can still post (within reason) without having made any edits
 - it is actually surprisingly common for non-spammers to do that. 

OSM is not a blog site. OSM is about making the data better. Once you have
somehow figured out a little bit how OSM works, you could blog about it. IMO.

/al



Re: [OSM-dev] Chinese spam diaries, an analysis

2014-12-04 Thread Tom Hughes

On 04/12/14 12:06, Andreas Labres wrote:

On 04.12.14 12:33, Tom Hughes wrote:

So in other words, most of the things we already factor into our spam scoring...
We're just not quite as rigid.


A (hidden) spam score is bad (IMO). Nobody sees it, almost nobody can test it.


Nothing is hidden:

https://github.com/openstreetmap/openstreetmap-website/blob/master/app/models/user.rb#L210

Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/
