Once again: Thanks for all your work on this! After taking a stab at it myself I certainly have a new appreciation for what you've done.
> The "history" diffs are in the process of being generated and are well
> through 2008 as we speak. These are effectively daily diffs but aren't
> getting deleted on a rolling window basis. This is effectively creating a
> full history dump of the database. This has been in the wings for a while,
> but only possible now that there is some more disk space available. These
> are still timestamp based extracts due to transaction id queries being
> useless for historical queries. As a result of the use of timestamps, these
> will be run with a large delay to avoid missing data. I'll probably set
> this delay to 1 day to be safe, but perhaps a couple of hours would be
> enough.

The first few years' worth of history diffs were created using the "old" Osmosis version. Is it possible that they are missing a few transactions too (as a result of the "one off" bug)?

> Moving away from a file-based
> distribution approach has serious implications for reliability in the face
> of server and network outages, cacheability, bandwidth consumption, and
> server resource usage. As a result, the existing approach is likely to
> represent the state of the art in the near to medium future. We need to
> stabilise the existing features before attempting new ones :-)

I thought about replication over PubSubHubbub, which should take care of bandwidth, cacheability, server resource usage (with fat pings) and a few other problems. But I've done no work on it yet, or even thought it through. It just seemed like a fitting concept for the type (or _one_ of the types) of replication we need.

The MusicBrainz project is facing much the same problem as we are, and they're using a very similar solution (http://musicbrainz.org/doc/Replication_Mechanics).

Cheers,
Lars

_______________________________________________
osmosis-dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/osmosis-dev
