On Wed, 2012-10-17 at 00:28 +0100, Tom Hughes wrote:
> On 17/10/12 00:04, Alex Barth wrote:
> > - Are there technical reasons why changesets should tend to be
> >   large? Are they expensive on some level?
>
> I believe it's entirely because we've got so many people doing
> mechanical or semi-mechanical edits.
>
> That includes bots but also things like people using xapi or overpass to
> download all objects matching some set of tags, then change those tags
> and reupload.

the historical answer to this is that when changesets were added to the OSM API there were two different intentions for their use, which got conflated: first, that changesets were structures for grouping edits sharing common attributes. and second, that changesets were VCS-style 'commits' which would be uploaded in a single request and applied atomically.

effectively, the first use-case was for users, and tried to make changesets as open-ended as possible. from this, we get tags on changesets for comments, editor, bot-ness, etc. and the ability to keep uploading into an open changeset.

the second use-case was a technical thing - the sheer number of API calls to individual elements, even from normal-sized editing sessions, could cause problems. and, for small calls, HTTP headers and round-trip latencies would dominate the cost of an upload. further, editors had to cope with the situation where an upload failed half-way through, and re-try the failed calls. from this, we get a single changeset/#id/upload call which applies atomically.

at the time, this seemed like a good way to satisfy both use-cases. and, while it does what it set out to, i think we should consider splitting these in the next API version: explicitly reifying uploads, on which bboxes / coverage sets and change counts can be stored. changesets can then simply be collections of uploads.

getting to the point: this might to some extent mitigate the "large changesets" issue, as it would allow bboxes to be collected at a smaller granularity. however, it wouldn't be a full solution and we'd probably still need something like OWL to break down the geographic footprint of changesets further.

cheers,

matt

_______________________________________________
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev
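
to make the proposed split a bit more concrete, here is a rough sketch in python of what "changesets as collections of uploads" might look like as a data model. none of this exists in the current API: the names `BBox`, `Upload` and `Changeset`, and all of their fields, are assumptions made up for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class BBox:
    """axis-aligned bounding box in degrees (hypothetical type)."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def union(self, other: "BBox") -> "BBox":
        # smallest box covering both boxes
        return BBox(
            min(self.min_lon, other.min_lon),
            min(self.min_lat, other.min_lat),
            max(self.max_lon, other.max_lon),
            max(self.max_lat, other.max_lat),
        )


@dataclass(frozen=True)
class Upload:
    """one atomically applied upload request (hypothetical record)."""
    bbox: BBox
    change_count: int


@dataclass
class Changeset:
    """user-facing grouping: tags plus an open-ended list of uploads."""
    tags: Dict[str, str] = field(default_factory=dict)
    uploads: List[Upload] = field(default_factory=list)

    def add_upload(self, upload: Upload) -> None:
        self.uploads.append(upload)

    def change_count(self) -> int:
        return sum(u.change_count for u in self.uploads)

    def bbox(self) -> Optional[BBox]:
        # coarse footprint: the union of all per-upload boxes
        if not self.uploads:
            return None
        result = self.uploads[0].bbox
        for upload in self.uploads[1:]:
            result = result.union(upload.bbox)
        return result


if __name__ == "__main__":
    cs = Changeset(tags={"comment": "retag shops", "bot": "yes"})
    cs.add_upload(Upload(BBox(-0.2, 51.4, 0.1, 51.6), 120))   # around london
    cs.add_upload(Upload(BBox(13.3, 52.4, 13.5, 52.6), 80))   # around berlin
    print(cs.bbox())                      # one box spanning both cities
    print([u.bbox for u in cs.uploads])   # two much smaller boxes
```

the point of the sketch is only that the per-upload boxes stay small even when the changeset-level union is huge. as noted above, a single upload touching both cities would still produce one large box, which is where something like OWL would still be needed.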