Today my minutely replication started failing with a unique constraint violation error from postgres. Upon further investigation I found that there were *already* two copies of a way in my database. An incoming change was trying to modify the way which caused postgres to notice the duplication and error out. Basically a "hey wait there are two of them. Which one do you want me to modify?" Here is the osmosis output:
Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "pk_ways" Detail: Key (id)=(26926573) already exists. It was erroring on this way: http://www.openstreetmap.org/browse/way/26926573/history So a few questions immediately come to mind. 1) How did a duplicate record get into the database? There is definitely a primary key constraint on the id column. In this particular case it looks like it happened during the initial planet import. I did this from the January 2nd pbf file. The two rows are identical in every way and the way was last touched (before today's edit) in 2009. All constraints are disabled during the \copy operation so I can see a duplicate way being able to get in. Although this implies that there are either two copies of the way in the planet file or a bug in osmosis. I would have thought the primary key constraint would have been checked when it was recreated after the \copy operation though. Apparently not. 2) How do I fix this? I believe deleting one of the rows would fix this but I can't actually delete only one since *every* column is the same. I think it was suggested on #osm-dev that I create a copy of one in temp table, delete both and then reinsert the copy. This is probably what I will try. 3) Are there any others? Turns out: yes, there are 4 duplicated ways in my database. This may not come through with good formatting but here they are: id | version | user_id | tstamp ----------+---------+---------+--------------------- 26245218 | 12 | 163673 | 2011-02-06 06:54:10 26245218 | 13 | 290680 | 2013-01-28 02:37:56 26709186 | 4 | 64721 | 2008-09-02 04:39:21 26709186 | 4 | 64721 | 2008-09-02 04:39:21 26709284 | 4 | 70621 | 2008-10-26 14:06:03 26709284 | 5 | 64721 | 2013-01-28 02:38:30 26926573 | 4 | 118011 | 2009-12-27 07:13:28 26926573 | 4 | 118011 | 2009-12-27 07:13:28 A couple of interesting things here. - Two of them have identical duplicates (26709186 and 26926573). These can both be explained by an error in the planet file or import process. - The other two however are not the same and both of them must have been created during diff application because it happened 2 days ago - within 10 seconds of each other. It is possible that there were duplicates of these ways as well and for some reason they just didn't hit this error during diff application and one of the records was successfully updated. Soo... wtf? Does Does anyone have ideas about how postgres' primary key check could be circumvented? Is my theory about the \copy getting around it during import feasible? But what about the ones created during diff processing? Looking at my system monitoring I don't see anything unusual going on 2 days ago. I've been having problems with X on this machine but that won't affect postgres and osmosis is running inside of screen. Soo... yeah. Anything? :) Toby _______________________________________________ osmosis-dev mailing list [email protected] http://lists.openstreetmap.org/listinfo/osmosis-dev
