> [@newer=2011-08-01]
> restricts the data to only those data last edited after the given
> date. This is only possible in combination with another conditional.
> Why wasn't something like [@timestamp>2011-08-01] used?
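(For readers who haven't used it: [@newer=..] only works piggybacked onto another conditional, e.g. a tag predicate in the XAPI-style bracket syntax. The tag and date below are purely illustrative, not a recommendation:)

```
node[amenity=post_box][@newer=2011-08-01T09:15:00Z]
```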
Well, at first, because there is no [@older=..] or [@timestamp<..]. By the way, to correct myself: it should be something like [@newer=2011-08-01T09:15:00Z], because the simplistic parser needs a full date. The [@older=..] in turn doesn't exist because it doesn't make sense in the current state of affairs; see below. [@newer=..] shall have a humble look and feel, because it is a very humble solution.

To spark the discussion, imagine the following data items:

- Element A hasn't changed since January.
- Element B has been re-uploaded without changes yesterday.
- Element C has been deleted yesterday.
- Element D has substantially changed its meaning yesterday such that it is now out of scope. (To make this clearer: think e.g. you are searching for bridges, and D is a way that has been split such that one half is no longer a bridge.)
- Element E has been created yesterday morning but then disappeared yesterday evening, due to a bad edit.

Now, what would you expect the [@newer=24-hours-ago] operator to find? A good implementation should surely produce C and D, and maybe E and/or B. Actually, Overpass API yields B and only B, which is unsatisfactory.

The reason for this: at the moment, Overpass API represents at any moment the data as it would be visible in a fictional Planet.osm, patched from the last Planet.osm by the diff files applied so far. And a Planet.osm would also only show B. This is closely related to discussions of the type "How do I find deleted elements?"

What are the other options? (Please mind: all of them are thoughts, not even vapourware.)

Overpass API could become a full-blown history server. This would allow it to give sane answers to [@timestamp<..] and [@timestamp>..], to offer a diff option, to produce search results from a certain time in the past, and a lot more amazing things. But this has at least four downsides:

1. This partly breaks with the OSM data model.
For example, a way can change its geometry without literally being changed itself: just move the underlying nodes. The OSM element doesn't reflect this, but the database must recognize it for consistent data delivery. The way might, for example, have entered or left the bounding box you searched for. The issue popped up at some time in a discussion about the undo features of Potlatch, but I don't have a link to that.

Another thing is that such a server would mix CC-BY-SA and ODbL data at a certain time in the future, which is an unnecessary legal hassle. Most likely, I would be slow enough on development to roll out the software after the license change :) In any case, this may produce a flame war on details, which is exactly what I don't want to get the project into, and I'm not diplomat enough to avoid it.

2. Hourly, daily, and weekly diffs are incompatible, and even the Planet.osm and minute updates need diligent analysis. Note that in all the diffs, multiple changes will be collapsed into a single diff. Thus, an element like E above might never appear on the server. I'm not sure whether the minute updates are guaranteed to contain all changes, but changes reverted in less than a minute might be acceptable to lose. The full history Planet.osm could replace an ordinary Planet.osm, but mind that it is an order of magnitude bigger.

3. This all could multiply the hardware requirements. I'm simply not sure what the current server can handle. For this reason, I started with the Planet.osm metadata, which already doubled the data amount from roughly 35 GB to roughly 65 GB. With history data, I rather expect 100 GB to 150 GB. The impact on the query times can probably be kept under control (if we keep the historic data apart from the current data), but the data updates in that case will become much slower.

4. And it will need a lot of programming effort. The necessary documentation alone, to make clear all of the decisions required in points 1 and 2, takes weeks.
Implementation and testing will be the same or even more effort, depending on how many tricks are necessary to keep the system responsive. While it is challenging, I don't see the massive demand in comparison to other features that would be postponed in that case.

A second option would be to produce some kind of feed to which you subscribe to get changes. This can be realized quite easily, because the way and relation updaters already receive a kind of internal feed to update their geometries from their members. So you would subscribe with an arbitrary query, e.g. a bounding box or a certain tag or a combination of both, and get all changes concerning that query roughly every few hours, without any assertion of completeness. But I don't expect that many users would be interested in such a service [responses by e-mail may convince me of the contrary :)]. It will at some point in the future be implemented to improve area update speed, but that is rather at the end of this year on the road map.

The third option would be to regularly freeze the data. This is technically easy (just freeze the updates at a certain point and copy the database), but out of scope with regard to the hard disk sizes on the overpass-api.de server.

Other ideas on how to give [@newer=..] proper semantics, comments on the above ideas, and personal opinions on the usefulness are welcome.

Cheers,

Roland
_______________________________________________
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk
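[Editorial postscript: the diff-collapsing problem from downside 2 above (an element like E never reaching the server) can be illustrated with a toy model. This is a hypothetical sketch, not Overpass code; it only assumes that a published diff keeps the net effect per element within its window.]

```python
# Toy model of diff collapsing (illustrative only, not Overpass code):
# within one diff window, only the net change per element survives, so a
# create followed by a delete inside the same window (element E) leaves
# no trace in the published diff at all.

def collapse(edits):
    """Collapse a list of (element_id, action) edits from one diff
    window into the net change per element."""
    seen = {}   # element_id -> (first_action, last_action)
    order = []  # first-seen order of the element ids
    for elem, action in edits:
        if elem not in seen:
            seen[elem] = (action, action)
            order.append(elem)
        else:
            seen[elem] = (seen[elem][0], action)
    net = {}
    for elem in order:
        first, last = seen[elem]
        if first == "create" and last == "delete":
            continue  # created and deleted in one window: never published
        net[elem] = last
    return net

# Yesterday's edits, matching the examples in the mail:
edits = [
    ("B", "modify"),  # re-uploaded without changes
    ("C", "delete"),  # deleted
    ("D", "modify"),  # split such that the meaning changed
    ("E", "create"),  # created in the morning ...
    ("E", "delete"),  # ... and removed again in the evening
]
print(collapse(edits))  # E is absent from the collapsed diff
```

Note that in this toy model C's deletion does still appear in the collapsed diff; it is the snapshot view of a patched Planet.osm, as described earlier in the mail, that additionally hides C and D from [@newer=..].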