On Tue, Aug 08, 2017 at 10:43:22AM +0200, Dirk Stöcker wrote:
> On Tue, 8 Aug 2017, Jochen Topf wrote:
> > When JOSM saves OSM files it uses a particular order: First nodes, then
> > ways, then relations as usual. For each object type it first writes out
> > objects with negative IDs (i.e. objects that are not uploaded yet), then
> > objects with positive IDs; both groups are ordered by absolute value.
> > Is this something I can rely on or is this just something that happened
> > accidentally with my version of JOSM when I tried this?
> > The reason I am asking: I sometimes get requests for Osmium features
> > from people who want to do something with files saved from JOSM, like
> > renumber them to have only small positive IDs, or convert them into
> > other formats. Osmium can read JOSM files and handle negative IDs, so
> > these things mostly work, but in some cases having a known order helps
> > (or is even necessary for correct functioning). I am currently working
> > on some things there, but if JOSM did not keep to this order in the
> > future they would break again.
> I would not rely on the order of the individual elements. There are ideas to
> rework the data storage to prevent changing IDs for the new objects (allows
> better diffs). That may have other side effects as well. I would expect the
> only thing you can rely on is the nodes, ways, relations order.
I would urge you to keep the order because that makes many things much
more efficient. For instance, checking whether a file contains an object
twice is trivial when there is a known order, but very expensive without
one. My code for assembling multipolygons, for instance, needs to make
sure that no ID appears in a file twice, and it uses this very efficient
check instead of building much more complex data structures, which would
need more RAM and make everything slower.
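
To illustrate the difference, here is a minimal Python sketch. It is not
the actual Osmium code; the (type, id) tuples and both function names are
made up for this example. With any consistent order, duplicates end up
next to each other, so one remembered object is enough. Without an order,
every object seen so far has to be kept in memory.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical minimal representation of an OSM object: (type, id).
OsmObject = Tuple[str, int]


def find_duplicates_sorted(objects: Iterable[OsmObject]) -> Iterator[OsmObject]:
    """Yield every object that is equal to its predecessor.

    Assumes the input follows some consistent total order, so that equal
    (type, id) pairs are always adjacent. Needs constant extra memory.
    """
    previous: Optional[OsmObject] = None
    for obj in objects:
        if obj == previous:
            yield obj
        previous = obj


def find_duplicates_unsorted(objects: Iterable[OsmObject]) -> Iterator[OsmObject]:
    """Yield every object seen before, without assuming any order.

    Has to remember all objects seen so far, so memory grows with the file.
    """
    seen = set()
    for obj in objects:
        if obj in seen:
            yield obj
        else:
            seen.add(obj)
```

Both functions report the same duplicates on ordered input, but only the
second one works on unordered input, and it pays for that with a set that
holds every ID in the file.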
OSM planet files and the usual extracts as provided by Geofabrik and
others always have objects ordered by ID so that's what a lot of
programs rely on anyway. This is not going to change. JOSM files are
special because of the negative IDs used. Having a consistent order for
the negative IDs, too, would make it easier for users here, because they
can just use such a file and don't have to sort it first. Some devs use
JOSM to generate tests for their software, for instance, and being able
to directly use the JOSM files makes things easier for them, too.
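
For reference, the order I observed in the JOSM output (nodes, ways,
relations; within each type negative IDs first, then positive IDs, each
group by absolute value) can be written as a sort key. This is only a
sketch of what I saw, not JOSM's implementation. The names TYPE_RANK,
josm_order_key and is_in_josm_order are invented for the example, and
treating an ID of 0 as belonging to the non-positive group is my own
assumption.

```python
from typing import Iterable, Tuple

# Hypothetical minimal representation of an OSM object: (type, id).
OsmObject = Tuple[str, int]

# Nodes before ways before relations.
TYPE_RANK = {"node": 0, "way": 1, "relation": 2}


def josm_order_key(obj: OsmObject) -> Tuple[int, bool, int]:
    """Sort key: type rank, then negative before positive, then |id|."""
    obj_type, obj_id = obj
    # False sorts before True, so non-positive IDs come first.
    return (TYPE_RANK[obj_type], obj_id > 0, abs(obj_id))


def is_in_josm_order(objects: Iterable[OsmObject]) -> bool:
    """Check in a single pass that the keys never decrease."""
    previous_key = None
    for obj in objects:
        key = josm_order_key(obj)
        if previous_key is not None and key < previous_key:
            return False
        previous_key = key
    return True
```

A file that does not follow this order could be fixed up with
sorted(objects, key=josm_order_key), but that means loading everything
into memory first, which is exactly the step a guaranteed order would
save users.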
Jochen Topf joc...@remote.org https://www.jochentopf.com/ +49-351-31778688