So, [EMAIL PROTECTED] uses a database to store blank tiles. This database is now the primary bottleneck in tileset processing. Its contents look like this:
mysql> select z,layer,count(*) from tiles_blank group by z,layer;
+----+-------+----------+
| z  | layer | count(*) |
+----+-------+----------+
|  0 |     1 |        1 |
|  2 |     1 |        3 |
|  3 |     1 |        9 |
|  4 |     1 |       38 |
|  5 |     1 |      156 |
|  6 |     1 |      503 |
|  7 |     1 |     1340 |
|  8 |     1 |      819 |
|  9 |     1 |     3453 |
| 10 |     1 |    10040 |
| 11 |     1 |    29195 |
| 12 |     1 |   153465 |
| 12 |     3 |   192977 |
| 12 |     5 |        2 |
| 13 |     1 |   202329 |
| 13 |     3 |   206492 |
| 14 |     1 |   625637 |
| 14 |     3 |   616067 |
| 15 |     1 |  2017676 |
| 15 |     3 |  1810050 |
| 16 |     1 |  6378318 |
| 16 |     3 |  5060867 |
| 17 |     1 | 18109393 |
+----+-------+----------+
23 rows in set (5 min 0.66 sec)

In total, there are 35 million rows. Since low-content tilesets may result in many hundreds of SELECTs / "REPLACE INTO"s, this is the primary slow part of tileset processing. Over the several months that the server has been running on Hypercube, the number of tilesets that the server can process per hour has dropped significantly due to the growth of this table.

Additionally, you can see from the munin graphs:

  http://munin.openstreetmap.org/openstreetmap/tah.openstreetmap.html

that when the size per tileset is smaller, the processing rate is much slower:

  http://munin.openstreetmap.org/openstreetmap/tah.openstreetmap-tah_bytes.html

You can see here a huge difference between the processing of mostly full tilesets -- the big peaks are generally when processing tilesets requested via the changed tiles script -- and the processing of mostly empty tilesets from the low-priority queue.

There are a couple of problems here:

1. Tilesets are processed in the order they are uploaded -- so when the queue is full of low-priority requests, everything is much slower to get through the processing queue, meaning that even though the fuller tilesets are there, people still have to wait a long time for them.

2. This doesn't seem sustainable: I understand we're getting through the tile queues...
but the number of blank tiles stored in the DB is only going to continue to increase, and as it increases, the rate at which SQL statements can be processed is simply going to go down.

In the short term, making the processing take priority into account -- so that higher-priority tilesets are processed before lower-priority tilesets -- seems important to me, since that's the thing that affects people the most. If they have to wait 3 hours for all the low-priority crap to clear through in order to see the tiles they just uploaded at priority 1, that's clearly a bad feedback loop.

In the long term, we need to come up with some more efficient way of storing blank tiles than the current database. I don't know what this means: perhaps it's running under some db format (InnoDB?) that we can stop using in favor of a less robust MySQL table type? Perhaps we can explore some other mechanism of storing this information? Perhaps the code really just does too many selects/inserts and can be cleaned up? I don't know the code well enough to say -- but I have straced the processes and assured myself that the slowdowns we are seeing are simply the result of the much larger blank tile db, and that we need to do something about it if we want [EMAIL PROTECTED] performance to increase.

For the record, I ran the cleanblanktiles script yesterday, so the blank tile db is as clean as that code makes it. I don't know if there are further optimizations that can be made there -- perhaps someone else can comment -- but I've done everything I know how to do.

Regards,
--
Christopher Schmidt
MetaCarta

_______________________________________________
Tilesathome mailing list
[email protected]
http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/tilesathome
