It seems that I still have the January 2nd planet file sitting around. I just tried to regenerate the load files from it. I ran out of drive space before it finished but it did make it up to way ID 123 million or so. All of the problem ways have IDs in the 26 million range so unless there is some duplication at the very end of the planet file, it seems unlikely that this is where the problem is. Also, Paul checked his database and didn't find any duplicates. While he didn't user the same planet file, these ways are old enough that he would be affected too if there really was a problem in the planet file itself. I will try making a file without the linestrings which should be small enough to fit on my remaining space, just to be sure.
But this leaves postgres and/or the diff application process. It seems like there is definitely a problem in postgres because it never should have let these records be created in the first place. But I'm trying to come up with some scenario that, even ignoring the primary key constraint, would duplicate a way that haven't been touched in years as is the case with way 26709186. I've got nothing. It seems like postgres just randomly decided that I needed more ways in my life. I did spend a while talking with someone who has some serious postgres knowledge in #postgresql tonight and because of some of the details of how the index is structured he thinks there is definitely some bug and told me to send an email to their bugs mailing list. I guess I'll see what happens there. Toby On Wed, Jan 30, 2013 at 7:47 PM, Toby Murray <[email protected]> wrote: > Today my minutely replication started failing with a unique constraint > violation error from postgres. Upon further investigation I found that > there were *already* two copies of a way in my database. An incoming > change was trying to modify the way which caused postgres to notice > the duplication and error out. Basically a "hey wait there are two of > them. Which one do you want me to modify?" Here is the osmosis output: > > Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key > value violates unique constraint "pk_ways" > Detail: Key (id)=(26926573) already exists. > > It was erroring on this way: > http://www.openstreetmap.org/browse/way/26926573/history > > So a few questions immediately come to mind. > > 1) How did a duplicate record get into the database? There is > definitely a primary key constraint on the id column. In this > particular case it looks like it happened during the initial planet > import. I did this from the January 2nd pbf file. The two rows are > identical in every way and the way was last touched (before today's > edit) in 2009. All constraints are disabled during the \copy operation > so I can see a duplicate way being able to get in. Although this > implies that there are either two copies of the way in the planet file > or a bug in osmosis. I would have thought the primary key constraint > would have been checked when it was recreated after the \copy > operation though. Apparently not. > > 2) How do I fix this? I believe deleting one of the rows would fix > this but I can't actually delete only one since *every* column is the > same. I think it was suggested on #osm-dev that I create a copy of one > in temp table, delete both and then reinsert the copy. This is > probably what I will try. > > 3) Are there any others? Turns out: yes, there are 4 duplicated ways > in my database. This may not come through with good formatting but > here they are: > id | version | user_id | tstamp > ----------+---------+---------+--------------------- > 26245218 | 12 | 163673 | 2011-02-06 06:54:10 > 26245218 | 13 | 290680 | 2013-01-28 02:37:56 > 26709186 | 4 | 64721 | 2008-09-02 04:39:21 > 26709186 | 4 | 64721 | 2008-09-02 04:39:21 > 26709284 | 4 | 70621 | 2008-10-26 14:06:03 > 26709284 | 5 | 64721 | 2013-01-28 02:38:30 > 26926573 | 4 | 118011 | 2009-12-27 07:13:28 > 26926573 | 4 | 118011 | 2009-12-27 07:13:28 > > A couple of interesting things here. > - Two of them have identical duplicates (26709186 and 26926573). These > can both be explained by an error in the planet file or import > process. > - The other two however are not the same and both of them must have > been created during diff application because it happened 2 days ago - > within 10 seconds of each other. It is possible that there were > duplicates of these ways as well and for some reason they just didn't > hit this error during diff application and one of the records was > successfully updated. > > Soo... wtf? Does Does anyone have ideas about how postgres' primary > key check could be circumvented? Is my theory about the \copy getting > around it during import feasible? But what about the ones created > during diff processing? Looking at my system monitoring I don't see > anything unusual going on 2 days ago. I've been having problems with X > on this machine but that won't affect postgres and osmosis is running > inside of screen. Soo... yeah. Anything? :) > > Toby _______________________________________________ osmosis-dev mailing list [email protected] http://lists.openstreetmap.org/listinfo/osmosis-dev
