Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-28 Thread Richard Fairhurst

Stefan de Konink wrote:
 Now how does this thing actually go?
 --- add tags to existing way
 --- add nodes to existing way
 --- create way in mysql

Right, now we're getting somewhere, now you're talking in specifics. What
I'm still struggling with is correlating what you're describing here with
anything that actually happens. Could you point to lines in the source which
correspond to these actual stages, please?

Richard
-- 
View this message in context: 
http://www.nabble.com/Re%3A--OSM-talk--Handling-of-towns-with-different-or-alternative%09names-tp21697397p21702729.html
Sent from the OpenStreetMap - Dev mailing list archive at Nabble.com.


___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-28 Thread Dave Stubbs
2009/1/28 Tom Hughes t...@compton.nu:
 Simon Ward wrote:
 On Wed, Jan 28, 2009 at 12:30:01AM +, Tom Hughes wrote:
 In practice keys are unique because although the API has never enforced
 uniqueness pretty much every client does because all the clients use a
 hash table of some sort to store tags.

 Hash table, or associative array/hash/dictionary?  Hash tables have
 mechanisms to deal with collisions.  I suppose I should just look at the
 code…

 Yes, OK, I was being a little imprecise. I was referring to the kind of
 associative array/hash/dictionary that is exposed to the user in many
 languages and which commonly does not allow duplicates.



And in Java is called Hashtable :-)

(although everybody uses HashMap these days)

Dave



Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-28 Thread Stefan de Konink
Richard Fairhurst wrote:
 Stefan de Konink wrote:
 Now how does this thing actually go?
 --- add tags to existing way
 --- add nodes to existing way
 --- create way in mysql
 
 Right, now we're getting somewhere, now you're talking in specifics. What
 I'm still struggling with is correlating what you're describing here with
 anything that actually happens. Could you point to lines in the source which
 correspond to these actual stages, please?

I cannot, because this is my 'backtrace' of the event: the scenario that is
most plausible in my head and that fits the events perfectly. It also shows
why the diffs would be ignored.

If you can agree that this tag/nds out of order thing could be the main
reason for these strange inserts, I am happy to help you search for the
source that reflects this.

For some reason I think that the actual API call to update a way (hence:
new timestamp/user) *is* done first from a code perspective, but is not
finished when the request is done for current tables vs history tables.
If you say 'but I create an XML document and PUT it to /way/id like anything
else', then that is a very interesting situation and you may be right that we
are looking at the wrong editor.


Stefan



Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-28 Thread Richard Fairhurst

Stefan de Konink wrote:
 If you can agree that this tag/nds out of order thing could be the 
 main reason for these strange inserts, I am happy to help you 
 search for the source that reflects this.

I have an open mind as to what it might be.

 For some reason I think that the actual API call to update a way 
 (hence: new timestamp/user) *is* done first from a code 
 perspective, but is not finished when the request is done for 
 current tables vs history tables. If you say 'but i create an xml 
 and PUT it too' /way/id like anything else, then that is very 
 interesting situation and you maybe right we are looking for the 
 wrong editor.

For its transport format, Potlatch uses AMF rather than XML. Speed is vital
in an online editor, like Potlatch (or, indeed, the original applet). When
using Flash, AMF both saves bandwidth and serialisation/deserialisation
time.
http://www.jamesward.com/blog/2007/04/30/ajax-and-flex-data-loading-benchmarks/
will give you some benchmarks. (Plus, of course, it doesn't suffer from the
memory leaks that we seem to have with Ruby's XML handling.)

amf_controller.rb then takes these AMF messages, and calls exactly the same
Rails methods as the XML API does.

I would strongly recommend you look at the code that saves ways:
  
http://trac.openstreetmap.org/browser/sites/rails_port/app/controllers/amf_controller.rb#L330

FWIW, my experience is that data inconsistencies of this sort happen mostly
when the server is under very heavy load. If a process is killed halfway
through a write operation, then obviously you're going to get some sort of
inconsistency. You can order the operations so that this is less likely (cf
http://trac.openstreetmap.org/changeset/13184/), but as has been said here
extensively, this kind of stuff is always going to happen without
transactions.
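
The point about a write being interrupted halfway can be illustrated with a toy model. This is only a sketch: ToyStore, save_unsafe and save_atomic are made-up names, not anything from the rails_port, and the snapshot-and-restore step merely stands in for a real database transaction (which would also cope with the process being killed outright, something an in-process rescue cannot do).

```ruby
# Toy model: a way's tags and node list live in two separate "tables".
class ToyStore
  attr_reader :tags, :nodes

  def initialize
    @tags  = {}
    @nodes = {}
  end

  # Two independent writes. If the process dies between them,
  # the tags are new but the node list is still the old one.
  def save_unsafe(id, tags, nodes, fail_midway: false)
    @tags[id] = tags
    raise "killed halfway through the write" if fail_midway
    @nodes[id] = nodes
  end

  # All-or-nothing: put both tables back as they were if anything fails.
  def save_atomic(id, tags, nodes, fail_midway: false)
    snapshot = [@tags.dup, @nodes.dup]
    save_unsafe(id, tags, nodes, fail_midway: fail_midway)
  rescue StandardError
    @tags, @nodes = snapshot
    raise
  end
end

store = ToyStore.new
store.save_unsafe(1, { "highway" => "residential" }, [10, 11])

begin
  store.save_unsafe(1, { "highway" => "primary" }, [10, 11, 12], fail_midway: true)
rescue RuntimeError
  # The tags now say "primary", but the node list is still [10, 11].
end
```

Reordering the two writes only changes which half is stale after a failure; it takes the all-or-nothing variant to keep both halves in step.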

Richard


Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-28 Thread Richard Fairhurst
Stefan de Konink wrote:

 1) I don't know anything about ruby so don't laugh

I promise. :)

 2) where are the save_with_history things defined?

http://trac.openstreetmap.org/browser/sites/rails_port/app/models/way.rb#L161

Same as what the XML API uses.

Richard




Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-28 Thread Stefan de Konink
Richard Fairhurst wrote:
 Stefan de Konink wrote:
 
 1) I don't know anything about ruby so don't laugh
 
 I promise. :)
 
 2) where are the save_with_history things defined?
 
 http://trac.openstreetmap.org/browser/sites/rails_port/app/models/way.rb#L161
 
 Same as what the XML API uses.

164 Way.transaction do
165   self.timestamp = t
166   self.save!
167 end


Does that block?


Stefan



Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-28 Thread Martin Koppenhoefer
2009/1/28 Richard Fairhurst rich...@systemed.net


 FWIW, my experience is that data inconsistencies of this sort happen mostly
 when the server is under very heavy load. If a process is killed halfway
 through a write operation, then obviously you're going to get some sort of
 inconsistency. You can order the operations so that this is less likely (cf
 http://trac.openstreetmap.org/changeset/13184/), but as has been said here
 extensively, this kind of stuff is always going to happen without
 transactions.


But this is a big problem, as the server is quite often under heavy load.
Isn't it possible to transfer to, let's say, a temporary place, and when the
transaction is completed, tell the server to copy from temp to real? Is
there any feedback the server gives to the client on whether a save operation
was successful?

Martin


Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-28 Thread Stefan de Konink
Martin Koppenhoefer wrote:
 2009/1/28 Richard Fairhurst rich...@systemed.net
 
 FWIW, my experience is that data inconsistencies of this sort happen mostly
 when the server is under very heavy load. If a process is killed halfway
 through a write operation, then obviously you're going to get some sort of
 inconsistency. You can order the operations so that this is less likely (cf
 http://trac.openstreetmap.org/changeset/13184/), but as has been said here
 extensively, this kind of stuff is always going to happen without
 transactions.
 
 But this is a big problem, as the server is quite often under heavy load.
 Isn't it possible to transfer to, let's say, a temporary place, and when the
 transaction is completed, tell the server to copy from temp to real? Is
 there any feedback the server gives to the client on whether a save operation
 was successful?

The problem happens on modification, so I would say yes, this is possible.
But for creation it is never possible (since an id has to be returned).


Stefan



Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-28 Thread Dave Stubbs
2009/1/28 Richard Fairhurst rich...@systemed.net:

 Stefan de Konink wrote:
 If you can agree that this tag/nds out of order thing could be the
 main reason for these strange inserts, I am happy to help you
 search for the source that reflects this.

 I have an open mind as to what it might be.

 For some reason I think that the actual API call to update a way
 (hence: new timestamp/user) *is* done first from a code
 perspective, but is not finished when the request is done for
 current tables vs history tables. If you say 'but i create an xml
 and PUT it too' /way/id like anything else, then that is very
 interesting situation and you maybe right we are looking for the
 wrong editor.

 For its transport format, Potlatch uses AMF rather than XML. Speed is vital
 in an online editor, like Potlatch (or, indeed, the original applet). When
 using Flash, AMF both saves bandwidth and serialisation/deserialisation
 time.
 http://www.jamesward.com/blog/2007/04/30/ajax-and-flex-data-loading-benchmarks/
 will give you some benchmarks. (Plus, of course, it doesn't suffer from the
 memory leaks that we seem to have with Ruby's XML handling.)

 amf_controller.rb then takes these AMF messages, and calls exactly the same
 Rails methods as the XML API does.

 I would strongly recommend you look at the code that saves ways:

 http://trac.openstreetmap.org/browser/sites/rails_port/app/controllers/amf_controller.rb#L330

 FWIW, my experience is that data inconsistencies of this sort happen mostly
 when the server is under very heavy load. If a process is killed halfway
 through a write operation, then obviously you're going to get some sort of
 inconsistency. You can order the operations so that this is less likely (cf
 http://trac.openstreetmap.org/changeset/13184/), but as has been said here
 extensively, this kind of stuff is always going to happen without
 transactions.



I've just done some research on this.

Basically, the short version is that OldWay.save_with_dependencies!,
which does the save of the history, has some hacks to get round rails
being a bit shit if you don't have a unique column as your index. This
means that you can't save two or more versions of a way with the same
timestamp (which only has 1 second resolution); if you do, it results
in duplicate tags on the first version and an empty entry for the
second version. The second update will throw an error as the way_node
primary key constraint is violated (I've tested with Potlatch on a dev
env and the error icon does pop up), but by that point the tags are
already written.
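
A simplified in-memory model may make the failure easier to follow. It is not the real OldWay code: the ToyHistory class, its table layout, and the "find the row again by way id and timestamp" step are all assumptions chosen to reproduce the symptoms described above, nothing more.

```ruby
# Simplified model of the history tables. Names and layout are assumptions.
class ToyHistory
  attr_reader :versions, :tags, :way_nodes

  def initialize
    @versions  = []  # rows of [way_id, version, timestamp]
    @tags      = []  # rows of [way_id, version, key, value], no unique constraint
    @way_nodes = {}  # primary key [way_id, version, sequence] => node id
  end

  def save(way_id, timestamp, tags, node_ids)
    version = @versions.count { |row| row[0] == way_id } + 1
    @versions << [way_id, version, timestamp]

    # Stand-in for the workaround: with no unique id column, look the new
    # row up again by (way_id, timestamp). With one-second resolution this
    # can match an older version instead.
    found = @versions.find { |row| row[0] == way_id && row[2] == timestamp }[1]

    # Tags go in first, and nothing stops duplicates.
    tags.each { |key, value| @tags << [way_id, found, key, value] }

    # Nodes hit the primary key, so the second save fails here.
    node_ids.each_with_index do |node_id, sequence|
      pk = [way_id, found, sequence]
      raise "duplicate primary key #{pk.inspect}" if @way_nodes.key?(pk)
      @way_nodes[pk] = node_id
    end
  end

  def tags_for(way_id, version)
    @tags.select { |row| row[0] == way_id && row[1] == version }
  end
end

history = ToyHistory.new
history.save(1, 1_233_150_000, { "highway" => "residential" }, [10, 11])
begin
  # Same second, same timestamp: the tags land on version 1 again, then the
  # node insert raises, leaving version 2 as an empty entry.
  history.save(1, 1_233_150_000, { "highway" => "residential" }, [10, 11, 12])
rescue RuntimeError => e
  warn e.message
end
```

In this model, two saves with different timestamps go through cleanly, which fits the observation that the problem is rare outside heavy load.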

This all matches the history output -- however the current way output
shouldn't be affected afaict.
Stefan: before you correct these, is the data as returned by
http://www.openstreetmap.org/api/0.5/way/ correct? I know the
/history is broken, and therefore the osmosis generated diffs will be
broken too.

You can achieve the same results with the standard API (I have tested
and shown this to be true).
I think it's just that it's very very rare as other editors don't have
the same approach/frequency to saving ways. At least, that's my
interpretation, and explains why these things are more likely to
happen during heavy load as update requests can get queued behind map
calls, and then all get processed in the same second with the same
timestamp. And it also gets Potlatch/amf_controller off the hook, at
least directly.


To mitigate: save up writes in potlatch... try not to send any updates
with others already pending (that should happen already anyway)

To fix: OldWay needs an alternative hack -- I can't think of one right
now that doesn't involve adding another column which would use quite a
lot of space for very little reason.


Dave



Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-28 Thread Stefan de Konink
Dave Stubbs wrote:
 Stefan: before you correct these, is the data as returned by
 http://www.openstreetmap.org/api/0.5/way/ correct? I know the
 /history is broken, and therefore the osmosis generated diffs will be
 broken too.

The data in /way/ returns the 'correct' way, including the 
'duplicate' k/v pairs.


Stefan



Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-28 Thread Richard Fairhurst

Dave Stubbs wrote:
 I've just done some research on this.

*applauds*

Really interesting. 

 To mitigate: save up writes in potlatch... try not to send any 
 updates with others already pending (that should happen 
 already anyway)

Indeed it should: Potlatch expressly won't upload a way until any pending
uploads of that way return.

I guess this is a plausible scenario:

1. User draws new way A, deselects
2. [Potlatch sends way A to upload]
3. User draws new way B, branching off a new node inserted into way A
4. [Potlatch queues new version of way A, because it can't upload it until 2
returns]
5. [Potlatch queues way B, because it doesn't know the 'branching' node ID
until 4 returns]
6. [Big blocking operation finishes on server; way A written to db,
returning success code to Potlatch]
7. [Potlatch immediately despatches new upload of way A from 4]
8. [New upload executes in same second as 6, triggering bug in Rails code as
described]
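
The queueing behaviour in steps 2 to 7 can be sketched roughly as follows. Potlatch itself is a Flash application, so this Ruby is purely illustrative: UploadQueue, upload and on_return are hypothetical names, and the model only covers the "one upload in flight per way" rule, not the dependency of way B on the new node ID.

```ruby
# Hypothetical client-side queue: at most one upload in flight per way.
class UploadQueue
  attr_reader :sent

  def initialize
    @in_flight = {}                             # way_id => true while waiting for the server
    @queued    = Hash.new { |h, k| h[k] = [] }  # way_id => payloads held back
    @sent      = []                             # [way_id, payload] in dispatch order
  end

  def upload(way_id, payload)
    if @in_flight[way_id]
      @queued[way_id] << payload   # step 4: hold until the earlier upload returns
    else
      dispatch(way_id, payload)    # step 2: nothing pending, send now
    end
  end

  # Called when the server answers for way_id (step 6).
  def on_return(way_id)
    @in_flight.delete(way_id)
    next_payload = @queued[way_id].shift
    dispatch(way_id, next_payload) if next_payload  # step 7: sent straight away
  end

  private

  def dispatch(way_id, payload)
    @in_flight[way_id] = true
    @sent << [way_id, payload]
  end
end

queue = UploadQueue.new
queue.upload("A", "first version")
queue.upload("A", "with branching node")  # held back, "A" is still in flight
queue.on_return("A")                      # second upload leaves immediately
```

Because the held-back upload is dispatched in the same instant the first one returns, both writes can easily fall within the same second on the server.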

For the benefit of others - consensus on IRC seems to be that the
requirement for version numbers in 0.6 may mean we can remove the code that
causes the bug.

cheers
Richard


Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-27 Thread Simon Ward
[Moved to dev; followups to dev]

On Tue, Jan 27, 2009 at 10:58:25PM +, Ævar Arnfjörð Bjarmason wrote:
  I think multiple keys with the same name should be allowed for a
  node/way/relation.  AFAIK it's only the editors that don't currently let
  you do this.
 
 Yes, the API and data format supports it, but only for another 2
 months or so until we switch to 0.6 where it won't be allowed.

Oh, d’oh!

 And -- to do some drive-by bikeshedding --

My turn…

 I think that leaves us with an unoptimal situation where editors
 either have to shove things into the same key delimited by some token
 like ; as is currently recommended but AFAIK not supported by any
 renderer (or any tool?), or to put what's logically the same data
 under different keys.

Is using different keys (when they’re likely to be unknown to the
renderers) or multiple tags with the same key name any better supported
in the renderers?

 Although the DB argument of having keys be primary keys is certainly
 understandable.

Ok, pulling up the wiki page on API 0.6[1] I see that the plan is “to
create an unique index on the combination of object id and tag key”.
Presumably this is mainly for performance.  An index doesn’t have to be
unique though—is this a limitation of MySQL or is there some other
reason to enforce uniqueness?

In a reasonable index implementation the impact of it being non‐unique
should be negligible, especially if the collision case is rare as is the
case for duplicate key names in OSM.

[1]: http://wiki.openstreetmap.org/wiki/API_0.6#Related_database_improvements
-- 
A complex system that works is invariably found to have evolved from a
simple system that works.—John Gall




Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-27 Thread Tom Hughes
Simon Ward wrote:

 Ok, pulling up the wiki page on API 0.6[1] I see that the plan is “to
 create an unique index on the combination of object id and tag key”.
 Presumably this is mainly for performance.  An index doesn’t have to be
 unique though—is this a limitation of MySQL or is there some other
 reason to enforce uniqueness?

The main reason is that rails works much better if all objects have 
unique primary keys.

In practice keys are unique because although the API has never enforced 
uniqueness pretty much every client does because all the clients use a 
hash table of some sort to store tags.
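
As a small illustration of how a client ends up with unique keys without anyone enforcing it, here is a sketch in Ruby. The load_tags helper and the sample tag values are invented for the example; no particular editor is claimed to work exactly this way.

```ruby
# Store key/value pairs the way a typical client might: in a hash keyed
# by the tag key. load_tags is a hypothetical helper, not an OSM API.
def load_tags(pairs)
  tags = {}
  pairs.each { |key, value| tags[key] = value }
  tags
end

# Tags as they could arrive from an API that does not enforce unique keys.
pairs = [["name", "Den Haag"], ["name", "The Hague"], ["place", "city"]]

tags = load_tags(pairs)
# Only one "name" survives (the last one seen), so writing the object back
# would silently drop the other value.
```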

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-27 Thread Ævar Arnfjörð Bjarmason
On Wed, Jan 28, 2009 at 12:18 AM, Stefan de Konink ste...@konink.de wrote:
 Simon Ward wrote:
 I think that leaves us with an unoptimal situation where editors
 either have to shove things into the same key delimited by some token
 like ; as is currently recommended but AFAIK not supported by any
 renderer (or any tool?), or to put what's logically the same data
 under different keys.

 Is using different keys (when they're likely to be unknown to the
 renderers) or multiple tags with the same key name any better supported
 in the renderers?

 The most logical way of tagging is still using the xml namespace idea.
 name:en,name:nl etc. This is used for the different Dutch layers and is
 clean. If someone wants to render his own map he will prioritize the
 name:whatever over name, and creates his fancy map.

Perhaps for this specific use case, but we're still left with the
problem where you genuinely want to stick two distinct things into one
tag, like when multiple values of Key:shop or Key:amenity are
applicable, the current solutions to that are:

1. Create two nodes/ways/* covering the same area and tag them differently
2. Use multiple keys where one key would logically make sense
3. Put different values in one tag delimited by ';', which is recommended
but supported by no-one, so users usually fall back on 1 or 2. Even if it
were supported, users of the data would end up parsing user-generated
values out of XML/DB fields, which isn't a good way to structure XML or a
DB from a query perspective.
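
To make the cost of option 3 concrete, here is a minimal sketch of what every consumer of the data would have to do. The helper names split_values and tag_has_value? are made up for the example, and the whitespace handling is an assumption, since there is no agreed rule for it.

```ruby
# Split a ';'-delimited tag value into its parts (hypothetical helper).
def split_values(value)
  value.split(";").map(&:strip).reject(&:empty?)
end

# A lookup has to split first; a plain equality test no longer works.
def tag_has_value?(tags, key, wanted)
  split_values(tags.fetch(key, "")).include?(wanted)
end

tags = { "shop" => "bakery; butcher" }

tag_has_value?(tags, "shop", "butcher")  # finds the second value
tags["shop"] == "butcher"                # a naive comparison misses it
```

The same applies on the database side: a query for one value can no longer be a simple equality match on the value column.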

But just to be clear there are probably good reasons for the
current/future data model unknown to me, but it's still a crappy
situation as far as the end user editor or parser of the data is
concerned.



Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-27 Thread Simon Ward
On Wed, Jan 28, 2009 at 12:30:01AM +, Tom Hughes wrote:
 In practice keys are unique because although the API has never enforced 
 uniqueness pretty much every client does because all the clients use a 
 hash table of some sort to store tags.

Hash table, or associative array/hash/dictionary?  Hash tables have
mechanisms to deal with collisions.  I suppose I should just look at the
code…

Simon
-- 
A complex system that works is invariably found to have evolved from a
simple system that works.—John Gall




Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-27 Thread Stefan de Konink
Tom Hughes wrote:
 In practice keys are unique because although the API has never enforced 
 uniqueness pretty much every client does because all the clients use a 
 hash table of some sort to store tags.

Except for editors like Potlatch that use intermediate layers to access 
the API and screw up the uniqueness. But all that will be solved soon :)


Stefan



Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-27 Thread Frederik Ramm
Hi,

Simon Ward wrote:
 Hash table, or associative array/hash/dictionary?  Hash tables have
 mechanisms to deal with collisions.  

A collision in a hash table is two keys sharing the same hash value, not 
two keys being identical.
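
The distinction can be shown with a deliberately bad hash function. In this sketch CollidingKey is an invented class whose hash is constant, so every key collides; Ruby's Hash then falls back on eql? to tell the keys apart.

```ruby
# A key type whose hash value is always the same, so all keys collide.
class CollidingKey
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def hash
    42
  end

  def eql?(other)
    other.is_a?(CollidingKey) && name == other.name
  end
end

tags = {}
tags[CollidingKey.new("name")]    = "Den Haag"
tags[CollidingKey.new("highway")] = "primary"    # same hash, different key: both kept
tags[CollidingKey.new("name")]    = "The Hague"  # identical key: value is replaced
```

The collision between "name" and "highway" is handled inside the table and loses nothing; it is only the identical key that overwrites the earlier value.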

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33



Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-27 Thread Richard Fairhurst

Stefan de Konink wrote:
 Except for editors like Potlatch that use intermediate layers to 
 access the API and screw up the uniqueness.

In what way does Potlatch screw up the uniqueness, please?

Richard


Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-27 Thread Stefan de Konink
Richard Fairhurst wrote:
 Stefan de Konink wrote:
 Except for editors like Potlatch that use intermediate layers to 
 access the API and screw up the uniqueness.
 
 In what way does Potlatch screw up the uniqueness, please?

http://api.openstreetmap.org/api/0.5/way/24644162/history


Stefan



Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-27 Thread Richard Fairhurst

Stefan de Konink wrote:
Richard Fairhurst wrote:
 In what way does Potlatch screw up the uniqueness, please?
 http://api.openstreetmap.org/api/0.5/way/24644162/history

does not even begin to be an answer. It could be caused by the bleeding
phase of the moon for all that tells me. I was hoping, perhaps
optimistically, you might actually back up your accusation with some
_research_.

Richard


Re: [OSM-dev] [OSM-talk] Handling of towns with different or alternative names

2009-01-27 Thread Stefan de Konink
Richard Fairhurst wrote:
 Stefan de Konink wrote:
 Richard Fairhurst wrote:
 In what way does Potlatch screw up the uniqueness, please?
 http://api.openstreetmap.org/api/0.5/way/24644162/history
 
 does not even begin to be an answer. It could be caused by the bleeding
 phase of the moon for all that tells me. I was hoping, perhaps
 optimistically, you might actually back up your accusation with some
 _research_.

Hey, I am just fixing the mess I see when I import a planet: everything
that breaks has duplicate nodes in the planet and is tagged as created by
Potlatch. And if you look in the history you can see even better what goes
wrong.


Update way process:

--- create way in mysql
--- add nodes to way
--- add tags to way


Now how does this thing actually go?

--- add tags to existing way
--- add nodes to existing way
--- create way in mysql

This results in the 'old' way having extra tags (there *IS* deduplication
on node members), while the new way has no body. And then a badass export
writes only the way with a body to the planet, I guess because of some
interesting diff logic that ignores empty ways?


What could possibly have caused this? My bet:

that use intermediate layers to access the API



Stefan
