On 3 April 2012 at 20:02, Paul Norman <[email protected]> wrote:

> The problem with detecting when changesets are closed is that there is no
> way to determine exactly when they are closed short of an API query. You
> can fake it by assuming changesets are closed an hour after the last change
> to them and 24 hours after the first change to them.
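
For illustration, the fallback described in the quoted paragraph could be sketched in Python roughly as follows. This is only a sketch under assumptions: the function and constant names are made up here, and reading the two limits as "whichever comes first" is an interpretation, not something stated in the thread.

```python
from datetime import datetime, timedelta, timezone

# Assumed limits, taken from the quoted paragraph.
IDLE_TIMEOUT = timedelta(hours=1)    # one hour after the last change
MAX_LIFETIME = timedelta(hours=24)   # 24 hours after the first change


def estimated_close_time(first_change, last_change):
    """Earliest moment at which the changeset can be assumed closed.

    Assumption: whichever of the two limits is reached first applies.
    """
    return min(last_change + IDLE_TIMEOUT, first_change + MAX_LIFETIME)


def is_probably_closed(first_change, last_change, now):
    """Guess the state from diff timestamps alone, without an API query."""
    return now >= estimated_close_time(first_change, last_change)
```

The result is only a guess. A changeset can also be closed explicitly by the editor well before either limit, which this estimate cannot see.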

Open: (http://www.openstreetmap.org/api/0.6/changeset/11187430)

<osm version="0.6" generator="OpenStreetMap server">
  <changeset id="11187430" user="regedi" uid="645826"
      created_at="2012-04-05T10:28:21Z" open="true"
      min_lat="50.0106489" min_lon="36.3515771"
      max_lat="50.0112144" max_lon="36.3586195">
    <tag k="created_by" v="Potlatch 2"/>
    <tag k="build" v="2.3-375-g9f05171"/>
    <tag k="version" v="2.3"/>
  </changeset>
</osm>

Closed: (http://www.openstreetmap.org/api/0.6/changeset/11167430)

<osm version="0.6" generator="OpenStreetMap server">
  <changeset id="11167430" user="bergfrei" uid="327035"
      created_at="2012-03-31T15:11:30Z" closed_at="2012-03-31T15:16:55Z"
      open="false"
      min_lat="47.9912789" min_lon="9.7206276"
      max_lat="48.0492344" max_lon="9.8521079">
    <tag k="comment" v="Hochdorf Ausgleich Luftbildversatz"/>
    <tag k="created_by" v="JOSM/1.5 (5047 de)"/>
  </changeset>
</osm>

Or have I missed something?

> It is better to detect problems when they occur, not up to 24 hours after
> they've occurred.

That's correct. A good practice would be to code it as abstractly as
possible and so only parse modify/delete/create sets. The origin (minutely
diff, hourly diff or changeset) will be ignored. I will try to take this
into account in my proposal.

Thanks for all of your ideas! It's time to finish my proposal :)

Regards,
Morris

> From: kabum [mailto:[email protected]]
> Sent: Tuesday, April 03, 2012 2:20 AM
> To: Derick Rethans
> Cc: OpenStreetMap dev list
> Subject: Re: [OSM-dev] Google Summer of Code
>
> Hi,
>
> On 2 April 2012 at 22:20, Paul Norman <[email protected]> wrote:
>
> A tool that operates on the changeset level is
> https://github.com/pnorman/osm-weirdness
>
> It detects changesets that have a high probability of being an import or
> mechanical edit. The detection is pretty crude but it does find a fair
> number of undocumented imports, mechanical edits, and other weirdness.
> If you point it at an old state.txt file it will start in the past and
> work up to the present.
>
> I'll have a look at your script later today.
>
> When working with the minutely diffs there are some limitations:
>
> - Limited knowledge of changesets. In practice, if you start your
>   detection an hour in the past you can have a list of all open
>   changesets, but it is not possible to know the tags of the changesets.
>
> - No knowledge of the previous state of objects. You know where deleted
>   objects were, but you can't tell how far an object is moved or what its
>   tags were before. To tell this you need to query a service with a full
>   history DB, and handling full history files is difficult.
>
> - No knowledge of way geometry if using existing nodes. Iandees'
>   https://github.com/pnorman/osm-weirdness/tree/way_check solves this by
>   fetching nodes in a way that aren't also in the changeset from jxapi and
>   it can then detect bad geometry (e.g. ways that trace over themselves)
>
> If you were to code a vandalism detection tool I think it should work on
> the minutely replication diffs (
> http://wiki.openstreetmap.org/wiki/Planet.osm/diffs)
>
> I thought about analysing the data after the changeset is closed, but
> these diffs also sound good. I will look into this approach :) Thanks!
>
> On 3 April 2012 at 09:38, Derick Rethans <[email protected]> wrote:
>
> On Mon, 2 Apr 2012, kabum wrote:
>
> > Result:
> > - each changeset has a total rating -> use a threshold value to divide
> >   them into suspicious and not suspicious
>
> Instead of just using static thresholds, I think that something like SVM
> (http://en.wikipedia.org/wiki/Support_vector_machine) might be highly
> beneficial here; and it's another cool technology to play with.
> There is a cool library for this
> (http://www.csie.ntu.edu.tw/~cjlin/libsvm/) and I know there is at least
> an extension to use it from PHP:
> http://phpir.com/support-vector-machines-in-php
>
> Thanks for this method ... seems to be very suitable for our use case.
>
> I already have some years of experience with PHP, but I wouldn't prefer
> it for this part of the project. I thought about Python (libsvm has
> native Python bindings ;))
>
> > Some questions came up within this preparation:
> > - Is there a preferred language? Does this have to be specified within
> >   the proposal? (language skill has to be rated, so I would decide this
> >   during the project phase)
>
> Not really any preferred language. What did you have in mind? For the
> front end I was thinking PHP, but the engine, I wouldn't know. I think
> something highly performant (so C or C++) might be beneficial.
>
> My thoughts were that it's easy to set up and can easily be called from
> a terminal or included in other Python scripts (i.e. web frontend).
>
> If C++ is necessary because of its speed, then I think I could master
> this. In the past semester I participated in a software engineering
> practical training at university (in a team of five fellow students),
> where we made extensive use of C++ (https://github.com/brainafk/Empire).
>
> > - I also would like to discuss used libraries and framework within the
> >   project phase, or should I decide this also in my proposal?
> > - Should the frontend integrate in the current website (ruby on rails
> >   project) or should this just be an optional feature?
>
> I think it can easily live as its own website.
>
> Ok :)
>
> > - How detailed should the proposal be?
> >   Is it enough to formulate this draft?
>
> That's a tricky one, the more information you provide the better I
> think, as it shows you have thought about it :-)
>
> I think it is growing a lot through this discussion, and I will try to be
> as detailed as possible. :)
>
> Thanks for the response :)
>
> Regards,
> Morris
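
As a small illustration of the API query discussed at the top of this message, the open/closed state can be read straight from a changeset response. The sketch below assumes the response looks like the two samples quoted above (trimmed here to the attributes that matter). The helper name parse_changeset_state is hypothetical, and fetching the document over HTTP is left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Trimmed copies of the two responses quoted earlier in this message.
SAMPLE_OPEN = """<osm version="0.6" generator="OpenStreetMap server">
  <changeset id="11187430" user="regedi" uid="645826"
      created_at="2012-04-05T10:28:21Z" open="true"/>
</osm>"""

SAMPLE_CLOSED = """<osm version="0.6" generator="OpenStreetMap server">
  <changeset id="11167430" user="bergfrei" uid="327035"
      created_at="2012-03-31T15:11:30Z" closed_at="2012-03-31T15:16:55Z"
      open="false"/>
</osm>"""


def parse_changeset_state(xml_text):
    """Return id, open flag and closing time (None while still open)."""
    changeset = ET.fromstring(xml_text).find("changeset")
    if changeset is None:
        raise ValueError("no <changeset> element in response")
    closed_at = changeset.get("closed_at")
    if closed_at is not None:
        # Timestamps in the samples are UTC in the form 2012-03-31T15:16:55Z.
        closed_at = datetime.strptime(
            closed_at.strip(), "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
    return {
        "id": int(changeset.get("id")),
        "open": changeset.get("open") == "true",
        "closed_at": closed_at,
    }
```

Calling parse_changeset_state(SAMPLE_CLOSED) gives a dictionary whose "open" entry is False and whose "closed_at" entry holds the closing time, while the open sample yields "closed_at" set to None.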

_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev

