Frederik Ramm wrote:
> Hi,
>
> Brett Henderson wrote:
>> Unfortunately the minute diffs appear to be regularly missing data.
>> In the last 8 hours at least 3 changesets have been missed. The ones
>> I've noticed are 1076325, 1076998, 1077469. These have been detected
>> by comparing the normal minute diffs against another minute diff
>> process running half an hour later.
>
> Can you elaborate a bit? I don't quite understand what you mean by
> changesets that have been missed. What exactly are you doing, and in
> what way do the results look wrong?

Okay, that probably wasn't clear. Osmosis doesn't even look at changesets at the moment. So when I talk about changesets I'm not specifically referring to any of the data in the changeset table. The only thing I look at is the changeset_id column on entities which results in the "changeset" attribute.

What I've noticed is that when I run minute diffs 5 minutes behind the API, data is missed compared to minute diffs running 30 minutes behind the API. Each time data is missing I've noticed that it belongs to a single large changeset. Presumably this is because the large changesets sometimes take longer than 5 minutes to process (seems awfully slow but that's what I'm seeing) and therefore the database transaction is not committed until 5 minutes after the start of processing. This 5 minute delay means that by the time data is committed and becomes visible to osmosis querying the history table, osmosis has moved past the time window containing that data and the changes are missed.

>
> - Are you sure that we're all on the same page regarding the meaning
> of changeset columns in the database, especially that the "closed_at"
> date is only fixed once it is in the past - as long as "closed_at" is
> in the future, it can still move forward or backward in time. (I'm not
> even sure I am right on this one but I trust I'll be told by someone
> if not ;-)

I'm not reading any of the changeset table data so the behaviour of the closed_at field doesn't affect osmosis. The changeset table is effectively useless to osmosis processing because changesets aren't atomic. At some point I'd like to replicate its contents but I will have to trigger that off the timestamps on the entities within it (meaning the changeset metadata may be replicated several times) to get accurate results.

I hope that answers your questions.

Brett
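
The missed-data window described in this message can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how Osmosis is actually implemented: the `Row` type, the `extract` function, and the minute-granularity integer clock are all hypothetical, and it assumes each row is stamped when processing starts but only becomes visible to readers once its transaction commits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    entity_id: int
    timestamp: int     # minute the row is stamped with (assumed: start of processing)
    committed_at: int  # minute the enclosing transaction commits and becomes visible


def extract(rows, lag, start, end, interval=1):
    """Simulate a diff process that runs every `interval` minutes,
    each time reading the one-interval window that ends `lag` minutes ago."""
    seen = set()
    for now in range(start, end, interval):
        window_end = now - lag
        window_start = window_end - interval
        for row in rows:
            # A reader can only see rows whose transaction has already committed.
            visible = row.committed_at <= now
            if visible and window_start <= row.timestamp < window_end:
                seen.add(row.entity_id)
    return seen


rows = [
    Row(1, timestamp=10, committed_at=10),  # small upload, commits immediately
    Row(2, timestamp=12, committed_at=19),  # large upload, commits 7 minutes later
]

near = extract(rows, lag=5, start=0, end=60)   # 5 minutes behind
far = extract(rows, lag=30, start=0, end=60)   # 30 minutes behind

print("5 min lag sees:", sorted(near))
print("30 min lag sees:", sorted(far))
print("missed by the 5 min process:", sorted(far - near))
```

In this sketch the 5-minute process reads the window containing minute 12 at minute 18, one minute before row 2 commits, and never revisits that window. The 30-minute process reads the same window at minute 43, long after the commit, so it picks the row up.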

