It would help to add a comment column explaining why a changeset's or object's
content is flagged as spam (SEO marketing descriptions/info, etc.).
Pierre
On Monday, 5 March 2018 at 13:05:41 EST, Jason Remillard
<[email protected]> wrote:
Hi Dave,
The detector needs to be "trained" on what a spam changeset looks like versus
what a normal changeset looks like. Training really means programming the
detector by example.
Once we have a good set of example changesets, the detector will find spam
changesets on its own going forward.
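
To make "programming the detector by example" concrete, here is a minimal sketch of a text classifier trained on labeled changeset comments. It is not the code from the repository: the naive Bayes approach, the `NaiveBayes` class, the `tokenize` helper, and the example comments are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayes:
    """Tiny multinomial naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token counts
        self.doc_counts = Counter()  # label -> number of training examples
        self.vocab = set()           # every token seen during training

    def train(self, examples):
        """Learn token statistics from (text, label) pairs."""
        for text, label in examples:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(text):
                counts[tok] += 1
                self.vocab.add(tok)

    def predict(self, text):
        """Return the label with the highest log-probability for the text."""
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokenize(text):
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical changeset comments, invented for this sketch.
EXAMPLES = [
    ("Best cheap SEO marketing services, call now for a free quote", "spam"),
    ("Buy discount watches online, visit our website today", "spam"),
    ("Top rated plumbing company, best prices, call us now", "spam"),
    ("Added missing footpath and fixed road alignment", "normal"),
    ("Updated opening hours for the bakery", "normal"),
    ("Traced buildings from aerial imagery", "normal"),
]

detector = NaiveBayes()
detector.train(EXAMPLES)
print(detector.predict("Call now for the best marketing services"))
print(detector.predict("Fixed road alignment near the bakery"))
```

Nothing in the class says what spam looks like. The only place that knowledge enters is the list of labeled examples, which is why a curated, diverse list matters more than the code.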
Rather than having me or Fredrick decide what is or is not SPAM, getting a
diverse set of changesets from many people will ensure that the algorithm is not
biased relative to where the consensus is in the project. That is why I posted
this to talk, not dev. People that map are needed for this task.
Finally, this is just a software component. It will still need to be integrated
into end-user tools. By doing the specialized machine learning code
first, I am hoping to get some collaborators that are interested in integrating
this into tools that everybody can use. But without the curated changeset list,
it is going nowhere. Long term, hopefully it will get integrated into several
tools...
Jason
On Mon, Mar 5, 2018 at 12:42 PM, Dave F <[email protected]> wrote:
Struggling to understand this
If users are expected to send you changeset ids, how does it "detect spam"?
In what way are users informed of spammy changesets?
DaveF
On 05/03/2018 14:06, Jason Remillard wrote:
Hi,
This weekend I put together a SPAM detector for OSM changesets.
https://github.com/jremillard/osm-changeset-classification
You don't need to be a developer to contribute. Send over any spammy
changesets you come across via a GitHub issue, a pull request, or even an email
to me. I just need the changeset id.
The code is currently hitting 99+% accuracy at telling apart 1500 random normal
edits and 1500 sketchy changesets that Fredrick shared with talk-us last week.
This is with zero tuning, so it looks like it will work well.
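
For reference, an accuracy figure like this is the fraction of changesets the detector labels correctly. The sketch below shows one way to compute it together with a breakdown of the mistakes. The `evaluate` helper and the toy labels are hypothetical, and this message does not say whether the 99+% was measured on changesets held out from training.

```python
from collections import Counter


def evaluate(true_labels, predicted_labels):
    """Return (accuracy, confusion), where confusion counts (true, predicted) pairs."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("need two non-empty label lists of equal length")
    confusion = Counter(zip(true_labels, predicted_labels))
    correct = sum(n for (t, p), n in confusion.items() if t == p)
    return correct / len(true_labels), confusion


# Hypothetical toy data: a balanced set, like the 1500/1500 split, but tiny.
truth = ["spam"] * 4 + ["normal"] * 4
predicted = ["spam", "spam", "spam", "normal",
             "normal", "normal", "normal", "normal"]

accuracy, confusion = evaluate(truth, predicted)
print(f"accuracy: {accuracy:.3f}")
print("spam missed:", confusion[("spam", "normal")])
print("normal edits wrongly flagged:", confusion[("normal", "spam")])
```

On a balanced set, always guessing one label scores 50%, so that is the baseline to compare against. The two error counts are worth tracking separately, since flagging a legitimate mapper's edit and missing a spam changeset have different costs.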
Jason
_______________________________________________
talk mailing list
[email protected]
https://lists.openstreetmap.org/listinfo/talk