Hi Animesh,

You can check out the vandalism page on the OSM Wiki, which gives a pretty 
good overview of OSM vandalism and the challenge of detecting it [0].


I think a binary classification won't be straightforward as the basis for a 
wide-sweeping 'vandal detection' tool because the problem of vandalism in OSM 
is multifaceted: you can change many objects in one changeset, and each object 
itself has multiple dimensions (the spatial ones, such as shape and detail, 
and the data property ones). In addition, the line between poor-quality edits 
and vandalism is sometimes very thin: what looks like vandalism may come from 
an uninformed editor rather than from malice.



Thus, for a binary classification, it would be useful to focus on one type of 
vandalism. Perhaps it could be detecting doodles (in which case you could 
search for geometry that isn't normally shaped: small angles, very high 
detail, and so on). Or it could be finding times when people delete a lot of 
data. I started a form that aims to collect "bad" edits in general [1], but I 
haven't really advertised it and thus don't have data that could tell you 
which type is most common.
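
As a rough illustration of those two ideas, here is a minimal Python sketch. 
It is only a starting point, not an established method: the function names, 
the 30-degree and 30% thresholds, and the assumption that way coordinates are 
already projected to a planar (x, y) system are all my own placeholders and 
would need tuning against real edits.

```python
import math


def vertex_angles(coords):
    """Return the angle in degrees at each inner vertex of a way.

    coords is a list of (x, y) pairs, assumed to be in a projected
    (planar) coordinate system rather than raw lon/lat.
    """
    angles = []
    for (ax, ay), (bx, by), (cx, cy) in zip(coords, coords[1:], coords[2:]):
        v1 = (ax - bx, ay - by)  # vector from the vertex back to the previous node
        v2 = (cx - bx, cy - by)  # vector from the vertex on to the next node
        n1 = math.hypot(*v1)
        n2 = math.hypot(*v2)
        if n1 == 0 or n2 == 0:
            continue  # skip duplicate consecutive nodes
        cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
        cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
        angles.append(math.degrees(math.acos(cos_t)))
    return angles


def looks_like_doodle(coords, sharp_deg=30.0, max_sharp_fraction=0.3):
    """Flag a way when a large share of its vertices are very sharp.

    Both thresholds are arbitrary assumptions for illustration.
    """
    angles = vertex_angles(coords)
    if not angles:
        return False
    sharp = sum(1 for a in angles if a < sharp_deg)
    return sharp / len(angles) > max_sharp_fraction


def deletion_ratio(created, modified, deleted):
    """Share of a changeset's object changes that are deletions."""
    total = created + modified + deleted
    return deleted / total if total else 0.0
```

A straight road gives angles near 180 degrees and is not flagged, while a 
tight zigzag is. A high deletion_ratio is only a hint for manual review, since 
legitimate cleanups also delete a lot of data.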



You may also check out some of the projects that have implemented parts of the 
algorithms listed on the wiki page for further inspiration [2,3,4].



Best,

Ethan aka FTA


[0]: http://wiki.openstreetmap.org/wiki/Vandalism

[1]: https://docs.google.com/forms/d/e/1FAIpQLSf4bVukO5OUXviSujW1gUtM1NTroTz3lPsXy7EcKxIp8ZzX5g/viewform

[2]: http://www.mdpi.com/2220-9964/1/3/315

[3]: https://github.com/willemarcel/osmcha-django

[4]: https://github.com/ethan-nelson/osm_hall_monitor

_______________________________________________
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev
