Hi,

On 04.08.2012 07:01, Martijn van Exel wrote:
I produced this small osmchange file using a recently compiled
osmconvert: https://gist.github.com/3254628
the osmosis --rxc task chokes on it though: https://gist.github.com/3254632

Ideas?
yes, the cause of this is pretty simple. Osmosis requires lat and lon attributes on nodes, and these attributes are not set in the change file on the deleted nodes.

While the cause is pretty simple, the fix for this (_if_ we want to fix this) isn't going to be simple. The abscence of the lat/lon attributes on the deleted nodes is conceptually OK since it is going to be ignored anyway. However, the <node> parser (NodeElementProcessor to be precise) does not know about the context of the <node> element - and the osmChange <delete> is the only case where lat/lon is actually optional.

Also, the Osmosis data model (the Node class) has an implicit assumption that the nodes always have lat/lon - the fields are typed as double and can't thus be set to null, and there is no such thing like Node.isPositionDefined().

There also was a recent related API change which makes lat/lon optional for already deleted (visible="false") nodes, so now seems a good point to talk about that too.

So what can be done about this:

1. Make the lat/lon attributes optional and change the data model accordingly. 2. Make the lat/lon attributes optional everywhere and don't change the data model. 3. Teach the NodeElementProcessor about context it is called from and make it generate some bogus and/or sentinel values for the Node classes in that case (0, Double.NaN, Double.MAX_VALUE, whatever).
4. Ignore the whole issue and just say that lat/lon is required.

My personal and also very humble opinion about those:

1.: For this, we would need to change all the downstream tasks to check if the position is actually there. Given that the correct absence of lat/lon is a quite rare case, this seems rather costly. 2. and 3.: These are actually almost the same thing. If only the parser thinks lat/lon are optional and the downstream tasks think it's always there, the parser needs to put some value in the node for Node.getLatitude() to return.

The only difference between 2 and 3 is the amount of validation done in the parser (and therefore the assumptions the downstream tasks can have about the data). I think it's better validate in the parser and fail fast. Yes, bringing context to NodeElementProcessor is a bit ugly - but hey, it's deep down in the implementation classes and nobody sees it. If the input is actually broken (normal OSM XML with lon/lat not set), it's better to fail in the parser than pretend that it's OK leading to undefined behavior downstream.

4.: Well, we _could_ ignore osmconvert producing those files (no offense intended, Markus) and argue that it's a bug in osmconvert and it should be fixed on that side and so on. But I don't think it is wise to ignore the API... So this is not an option, despite having a very attractive implementation cost of zero :)

To sum it up, I tend to prefer 3.

Other thoughts on this point, anyone? :)

Greetings from Stuttgart
Igor

_______________________________________________
osmosis-dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/osmosis-dev

Reply via email to