Re: Order in JOSM files
On Tue, Aug 08, 2017 at 11:32:45AM +0200, Dirk Stöcker wrote:
> On Tue, 8 Aug 2017, Jochen Topf wrote:
>
> > I would urge you to keep the order because that makes many things much
> > more efficient. For instance checking whether a file contains an object
> > twice is trivial when there is a known order but very expensive without.
> > My code for assembling multipolygons for instance needs to make sure
> > that IDs aren't in a file twice and it uses this very efficient way
> > instead of creating much more complex data structures which would need
> > more RAM and make everything slower.
>
> In general we will not make any changes only to do changes. So the chance
> that element order stays fixed as it is now is high. I'm not aware that it
> changed in the past.
>
> BUT: You asked if you can rely on this. And the answer to this is that this
> cannot be guaranteed. :-)
>
> But you simply can ignore this and hope that JOSM continues to produce nice
> files and chances are high your hope will be fulfilled.

Okay, good enough. :-) At least now that you are aware of this issue, it
will not change just by accident.

I thought some more about this and what it comes down to is this: An OSM
file can either be totally unordered, so the generator doesn't provide
any "guarantees" to its ordering. Or it can be ordered in some way. How
exactly it is ordered doesn't matter that much probably, what matters is
that there is some kind of consistency others can rely on.

Although nobody ever guaranteed it, OSM files are almost always sorted
nodes, ways, relations and each object type by ID, so this is what
people rely on and this is the order that "sort" commands (like the one
from osmium or osmosis) will create.

I want to extend this to: If you are using negative IDs in your OSM file
you should order them negative IDs first, then positive IDs, both
ordered by absolute values of those IDs. That's the format I will
optimize my software for and that's the format "osmium sort" will create
in the future. Then everything fits together with the most important
generator of files with negative IDs, JOSM.

Jochen
--
Jochen Topf  joc...@remote.org  https://www.jochentopf.com/  +49-351-31778688
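The ordering proposed in the message above (nodes, ways, relations; within each type negative IDs first, then positive IDs, each group by absolute value) can be written as a sort key. This is only a minimal sketch: `OsmObject`, `TYPE_RANK`, `order_key` and `sort_objects` are hypothetical names made up for illustration and are not part of Osmium or JOSM. Treating an ID of 0 as "positive" is also an assumption, since the thread does not mention it.

```python
from typing import Iterable, List, NamedTuple, Tuple

# Type rank follows the usual OSM file layout: nodes, then ways, then relations.
TYPE_RANK = {"node": 0, "way": 1, "relation": 2}


class OsmObject(NamedTuple):
    """Hypothetical stand-in for an OSM object; only type and ID matter here."""
    type: str
    id: int


def order_key(obj: OsmObject) -> Tuple[int, bool, int]:
    # False sorts before True, so negative IDs come before positive ones.
    # Within each sign group, objects are ordered by absolute value.
    return (TYPE_RANK[obj.type], obj.id >= 0, abs(obj.id))


def sort_objects(objects: Iterable[OsmObject]) -> List[OsmObject]:
    """Return the objects in the proposed order."""
    return sorted(objects, key=order_key)


if __name__ == "__main__":
    mixed = [
        OsmObject("way", 5),
        OsmObject("node", 3),
        OsmObject("node", -2),
        OsmObject("node", -1),
        OsmObject("node", 1),
        OsmObject("way", -7),
    ]
    for obj in sort_objects(mixed):
        print(obj.type, obj.id)
```

Note that the negative IDs come out as -1, -2, ... (ascending absolute value), which is not the same as plain numeric order.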
Re: Order in JOSM files
On Tue, 8 Aug 2017, Jochen Topf wrote:

> I would urge you to keep the order because that makes many things much
> more efficient. For instance checking whether a file contains an object
> twice is trivial when there is a known order but very expensive without.
> My code for assembling multipolygons for instance needs to make sure
> that IDs aren't in a file twice and it uses this very efficient way
> instead of creating much more complex data structures which would need
> more RAM and make everything slower.

In general we will not make any changes only to do changes. So the chance
that element order stays fixed as it is now is high. I'm not aware that it
changed in the past.

BUT: You asked if you can rely on this. And the answer to this is that this
cannot be guaranteed. :-)

But you simply can ignore this and hope that JOSM continues to produce nice
files and chances are high your hope will be fulfilled.

Ciao
--
http://www.dstoecker.eu/ (PGP key available)
Re: Order in JOSM files
Hello,

I rely on the ordering too. This is a prerequisite to provide readable
patches to boundaries.osm, see:

https://josm.openstreetmap.de/ticket/14833
https://josm.openstreetmap.de/ticket/15036

The other prerequisite is to keep stable ids. I have a working patch in
#14833 but have not submitted it yet, as I must test it extensively
before changing this crucial part of JOSM.

2017-08-08 11:07 GMT+02:00 Jochen Topf:
> On Tue, Aug 08, 2017 at 10:51:36AM +0200, Simon Poole wrote:
> > And another data point: the implementation in Vespucci does not sort by
> > id (not only in theory, the output is really not ordered, which doesn't
> > cause issues with JOSM).
> >
> > Or put differently: if that becomes a requirement, it would be a good
> > idea to versionize the format (which naturally wouldn't actually solve
> > your issue).
>
> For me the files created by JOSM are important, not what JOSM reads.
> Having JOSM create something that is stricter than what it can read is
> perfectly backwards compatible. But I'd encourage you to also use the
> same kind of ordering in Vespucci, see my other mail for reasons.
>
> Not sure versioning would help us here. What would help is some kind of
> flag in the header of the file that tells us if the file is ordered and
> in what way. Then any software can directly see whether it can work with
> the file and, possibly, which algorithm to use.
>
> Jochen
> --
> Jochen Topf  joc...@remote.org  https://www.jochentopf.com/
> +49-351-31778688
Re: Order in JOSM files
On Tue, Aug 08, 2017 at 10:51:36AM +0200, Simon Poole wrote:
> And another data point: the implementation in Vespucci does not sort by
> id (not only in theory, the output is really not ordered, which doesn't
> cause issues with JOSM).
>
> Or put differently: if that becomes a requirement, it would be a good
> idea to versionize the format (which naturally wouldn't actually solve
> your issue).

For me the files created by JOSM are important, not what JOSM reads.
Having JOSM create something that is stricter than what it can read is
perfectly backwards compatible. But I'd encourage you to also use the
same kind of ordering in Vespucci, see my other mail for reasons.

Not sure versioning would help us here. What would help is some kind of
flag in the header of the file that tells us if the file is ordered and
in what way. Then any software can directly see whether it can work with
the file and, possibly, which algorithm to use.

Jochen
--
Jochen Topf  joc...@remote.org  https://www.jochentopf.com/  +49-351-31778688
Re: Order in JOSM files
On Tue, Aug 08, 2017 at 10:43:22AM +0200, Dirk Stöcker wrote:
> On Tue, 8 Aug 2017, Jochen Topf wrote:
>
> > When JOSM saves OSM files it uses a particular order: First nodes, then
> > ways, then relations as usual. For each object type it first writes out
> > objects with negative IDs (ie objects that are not uploaded yet), then
> > objects with positive IDs, both are ordered by absolute value.
> >
> > Is this something I can rely on or is this just something that happened
> > accidentally with my version of JOSM when I tried this?
> >
> > The reason I am asking: I sometimes get requests for Osmium features
> > from people who want to do something with files saved from JOSM, like
> > renumber them to have only small positive IDs, or convert them into
> > other formats. Osmium can read JOSM files and handle negative IDs, so
> > these things mostly work, but in some cases having a known order helps
> > (or is even necessary for correct functioning). I am currently working
> > on some things there but if JOSM would not keep to this order in the
> > future they would break again.
>
> I would not rely on the order of the individual elements. There are ideas to
> rework the data storage to prevent changing IDs for the new objects (allows
> better diffs). That may have other side effects as well. I would expect the
> only thing you can rely on is the nodes, ways, relation order.

I would urge you to keep the order because that makes many things much
more efficient. For instance checking whether a file contains an object
twice is trivial when there is a known order but very expensive without.
My code for assembling multipolygons for instance needs to make sure
that IDs aren't in a file twice and it uses this very efficient way
instead of creating much more complex data structures which would need
more RAM and make everything slower.

OSM planet files and the usual extracts as provided by Geofabrik and
others always have objects ordered by ID so that's what a lot of
programs rely on anyway. This is not going to change.

JOSM files are special because of the negative IDs used. Having a
consistent order for the negative IDs, too, would make it easier for
users here, because they can just use such a file and don't have to sort
it first. Some devs use JOSM to generate tests for their software, for
instance, and being able to directly use the JOSM files makes things
easier for them, too.

Jochen
--
Jochen Topf  joc...@remote.org  https://www.jochentopf.com/  +49-351-31778688
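To illustrate why a known order makes the duplicate check cheap: in an ordered stream any repeated object must sit directly next to its twin, so one pass that remembers only the previous key is enough. The following is a rough sketch, not the code used in Osmium; `find_duplicates` and `order_key` are hypothetical names, and objects are reduced to `(type, id)` tuples for brevity.

```python
from typing import Iterable, Iterator, Optional, Tuple

TYPE_RANK = {"node": 0, "way": 1, "relation": 2}

Key = Tuple[int, bool, int]


def order_key(obj_type: str, obj_id: int) -> Key:
    # Nodes, ways, relations; negative IDs first; then by absolute value.
    return (TYPE_RANK[obj_type], obj_id >= 0, abs(obj_id))


def find_duplicates(objects: Iterable[Tuple[str, int]]) -> Iterator[Tuple[str, int]]:
    """Yield every object that repeats its predecessor in an ordered stream.

    Needs constant memory: only the previous key is kept. Raises ValueError
    if the input turns out not to be ordered, because then the check
    would silently miss duplicates.
    """
    previous: Optional[Key] = None
    for obj_type, obj_id in objects:
        key = order_key(obj_type, obj_id)
        if previous is not None:
            if key == previous:
                yield (obj_type, obj_id)
            elif key < previous:
                raise ValueError(f"input is not ordered at {obj_type} {obj_id}")
        previous = key


if __name__ == "__main__":
    stream = [("node", -1), ("node", -2), ("node", 1), ("node", 1), ("way", 3)]
    print(list(find_duplicates(stream)))
```

Without a guaranteed order, the same check would need something like a set of all IDs seen so far, which grows with the size of the file.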
Re: Order in JOSM files
And another data point: the implementation in Vespucci does not sort by
id (not only in theory, the output is really not ordered, which doesn't
cause issues with JOSM).

Or put differently: if that becomes a requirement, it would be a good
idea to versionize the format (which naturally wouldn't actually solve
your issue).

Simon

On 08.08.2017 at 10:24, Jochen Topf wrote:
> Hi!
>
> When JOSM saves OSM files it uses a particular order: First nodes, then
> ways, then relations as usual. For each object type it first writes out
> objects with negative IDs (ie objects that are not uploaded yet), then
> objects with positive IDs, both are ordered by absolute value.
>
> Is this something I can rely on or is this just something that happened
> accidentally with my version of JOSM when I tried this?
>
> The reason I am asking: I sometimes get requests for Osmium features
> from people who want to do something with files saved from JOSM, like
> renumber them to have only small positive IDs, or convert them into
> other formats. Osmium can read JOSM files and handle negative IDs, so
> these things mostly work, but in some cases having a known order helps
> (or is even necessary for correct functioning). I am currently working
> on some things there but if JOSM would not keep to this order in the
> future they would break again.
>
> Jochen
Re: Order in JOSM files
On Tue, 8 Aug 2017, Jochen Topf wrote:

> When JOSM saves OSM files it uses a particular order: First nodes, then
> ways, then relations as usual. For each object type it first writes out
> objects with negative IDs (ie objects that are not uploaded yet), then
> objects with positive IDs, both are ordered by absolute value.
>
> Is this something I can rely on or is this just something that happened
> accidentally with my version of JOSM when I tried this?
>
> The reason I am asking: I sometimes get requests for Osmium features
> from people who want to do something with files saved from JOSM, like
> renumber them to have only small positive IDs, or convert them into
> other formats. Osmium can read JOSM files and handle negative IDs, so
> these things mostly work, but in some cases having a known order helps
> (or is even necessary for correct functioning). I am currently working
> on some things there but if JOSM would not keep to this order in the
> future they would break again.

I would not rely on the order of the individual elements. There are ideas to
rework the data storage to prevent changing IDs for the new objects (allows
better diffs). That may have other side effects as well. I would expect the
only thing you can rely on is the nodes, ways, relation order.

Ciao
--
http://www.dstoecker.eu/ (PGP key available)
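Since the order is observed behaviour rather than a guarantee, a consumer could verify it when reading a file and fall back to sorting if the check fails. Below is a small sketch under that assumption, using only Python's standard `xml.etree.ElementTree`; `follows_josm_order` is a hypothetical helper name, and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

TYPE_RANK = {"node": 0, "way": 1, "relation": 2}


def follows_josm_order(osm_xml: str) -> bool:
    """Return True if the top-level objects in an OSM XML document are
    ordered nodes, ways, relations, with negative IDs before positive IDs
    and each group sorted by absolute value."""
    root = ET.fromstring(osm_xml)
    previous = None
    for element in root:
        if element.tag not in TYPE_RANK:
            continue  # ignore other top-level elements such as <bounds>
        obj_id = int(element.get("id"))
        key = (TYPE_RANK[element.tag], obj_id >= 0, abs(obj_id))
        if previous is not None and key < previous:
            return False
        previous = key
    return True


ORDERED = """<osm version="0.6">
  <node id="-1" lat="0" lon="0"/>
  <node id="-2" lat="0" lon="1"/>
  <node id="7" lat="1" lon="1"/>
  <way id="-3"><nd ref="-1"/><nd ref="-2"/></way>
</osm>"""

UNORDERED = """<osm version="0.6">
  <node id="7" lat="1" lon="1"/>
  <node id="-1" lat="0" lon="0"/>
</osm>"""

if __name__ == "__main__":
    print(follows_josm_order(ORDERED))
    print(follows_josm_order(UNORDERED))
```

For large files a streaming parser would be the better choice; parsing the whole document at once just keeps the sketch short.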