On 01/-10/-28163 08:59 PM, Michal Migurski wrote:
>
> On Mar 5, 2010, at 11:34 AM, John Smith wrote:
>
>> On 6 March 2010 01:24, Bernhard zwischenbrugger<[email protected]> wrote:
>>> Google Cache Time:
>>> Cache-Control: public, max-age=22222222 //feels like one month (I
>>> didn't calculate)
>>
>> I'd say it's a bad idea to specify a cache time, instead there is
>> other caching mechanisms to tell if a tile has changed:
>>
>>> ETag: "d096ddafba32c0da609007e224530ccd"
>>
>> This way if a tile never changes you never need to refresh.
>
>
> For what it's worth, the current tile server does specify a cache time as
> well as an ETag.
>
> % curl -sI "http://tile.openstreetmap.org/14/2627/6331.png"
> HTTP/1.1 200 OK
> Date: Sun, 07 Mar 2010 02:19:30 GMT
> Server: Apache/2.2.8 (Ubuntu)
> ETag: "93087c5713c17d9939cac9e341fdd14c"
> Content-Length: 26595
> Cache-Control: max-age=1036
> Expires: Sun, 07 Mar 2010 02:36:46 GMT
> Content-Type: image/png
>
> 1,000 sec. max age there is a little over 15 minutes, though when I repeat
> this request I get expiry times all over the place, from a few minutes to
> many hours. What currently decides on the cache expiration time?

mod_tile, the Apache module used to serve the tiles, has a fairly
sophisticated mechanism to decide the expiry times, driven by a bunch of
heuristics. With the minutely rendering we no longer have a periodic update
cycle, so there is no really good way of setting the expiry times, as one
would need to guess when in the future a tile might change. As that is
obviously not possible, we need to trade off caching time (reducing server
load and client-side latency) against up-to-dateness, so as not to lose the
benefits of the minutely updates.

The heuristics currently supported (and used) are the following.

First, it decides whether the tile is known to be "dirty", i.e. outdated.
If the tile server is overloaded, or the rendering takes longer than 3
seconds, mod_tile will serve an old tile rather than wait for the on-the-fly
rendering to finish (again a trade-off between client-side latency and
up-to-dateness). In that case, given that we know the tile will soon change,
the max-age cache parameter is set very low: 15 minutes + a 7 minute random
jitter.

If the tile served is not stale, there are another 3 heuristics: a zoom
level based heuristic, a last modified heuristic and a known planet update
cycle, if one exists.

The zoom level based heuristic allows setting the minimum max-age caching
time depending on whether the tile served is a low zoom, medium zoom or high
zoom tile. The idea behind this is that low zoom tiles (even though they are
affected by all changes) don't appear to change much. Thus it seems
reasonable to allow clients to cache these much longer, as the effect of a
stale tile from cache is probably smaller. The current setup of tile.osm.org,
I think, doesn't use this heuristic though and sets the minimum max-age
caching to 3 hours + 3 hours random jitter for all zoom levels, even though
the minutely tile expiry doesn't actually expire low zoom tiles, so they only
change if manually requested.
So I think it would be good to increase the time to cache low zoom tiles, as
in the current setup it shouldn't affect things negatively.

The last modified heuristic tries to guess how likely it is for a tile to
change. E.g. a tile in the middle of the Pacific is probably not going to
change anytime soon, so it wouldn't matter to give it a max-age of, say, a
week. A tile in central Berlin is more likely to change. So the heuristic
guesses how likely a tile is to change in the future based on how long it has
been since it last changed: max-age scales linearly with the time since last
modification, with a tunable slope parameter. As it is fairly unclear how
well this heuristic works, I believe the osm tile server still has this at
its default, i.e. turned off completely.

The last "heuristic" is the one based on planet update cycles. For those
servers that have a planet update cycle (i.e. not tile.osm.org), you don't
have to guess and can just set the expiry time to when the next update cycle
begins. This is the most efficient from a caching point of view, but doesn't
work with minutely updates.

The final max-age handed out by the server for clean tiles is then the
maximum of the 3 heuristics, capped at a week.

The random jitter is there mostly for servers with weekly update cycles, so
that not all cached tiles expire at exactly the same time and then overwhelm
the tile server.

As of a couple of hours ago, the mod_tile code also supports a tile expiry
based on the hostname header, so it would theoretically be possible to do
something like cache.tile.osm.org handing out expiry headers of e.g. a month.
But it isn't clear how one would decide whom to send to a hypothetical
cache.tile and whom to the normal tile server.
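
To make the selection logic described above a bit more concrete, here is a
rough Python sketch. It is not taken from the mod_tile source: the function
and parameter names are made up for illustration, and applying the jitter to
the zoom-based minimum is an assumption based on the "3 hours + 3 hours
random jitter" setting mentioned above.

```python
import random

MINUTE = 60
HOUR = 60 * MINUTE
WEEK = 7 * 24 * HOUR


def tile_max_age(stale, seconds_since_modified, zoom_minimum=3 * HOUR,
                 zoom_jitter=3 * HOUR, last_modified_slope=0.0,
                 seconds_to_next_planet=None, rng=random):
    """Return a max-age in seconds for one tile (illustrative only)."""
    if stale:
        # Tile is known to be outdated and will change soon: keep it short.
        return 15 * MINUTE + rng.randint(0, 7 * MINUTE)

    # Zoom-based floor; a setup could pass a larger value for low zooms.
    by_zoom = zoom_minimum + rng.randint(0, zoom_jitter)
    # Linear in the tile's age; a slope of 0 turns the heuristic off.
    by_last_modified = int(last_modified_slope * seconds_since_modified)
    # Only meaningful for servers with a fixed planet update cycle.
    by_planet = seconds_to_next_planet or 0

    # Take the most generous of the three, but never more than a week.
    return min(max(by_zoom, by_last_modified, by_planet), WEEK)


if __name__ == "__main__":
    print(tile_max_age(stale=True, seconds_since_modified=0))
    print(tile_max_age(stale=False, seconds_since_modified=30 * 24 * HOUR))
```

With the defaults, this gives 15 to 22 minutes for stale tiles and 3 to 6
hours for clean ones; a non-zero slope or a known planet update time can only
raise the clean value, up to the one-week cap.
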
It is also not clear what a separate cache.tile host would do to osmf's own
(currently still relatively limited) caching, as it would then require two
copies of each tile to be kept by the accelerator caches, doubling the
required resources. So I am not sure if or in what form this would
potentially happen, even though I do think it is a good idea from the client
perspective.

Cutting it short, the current tile.osm.org server basically hands out expiry
times of 15 - 22 minutes for stale tiles and 3 - 6 hours for clean tiles,
with a bunch more parameters that could be tuned.

>
> The Phnom Penh issue all sounds like a job for a CDN like Akamai's or a
> caching proxy (i.e. squid-cache.org) closer to Cambodia. Bernhard, these are
> not difficult to set up for yourself if you are interested, and require
> little knowledge of the actual map.

Having a CDN would definitely help and would probably indeed be the preferred
option in this specific case. But it would require osmf having hosting
facilities in various countries. Great if it were possible, but I am not sure
it is at the moment.

For the past few days there has been a trial to see how well a CDN / caching
proxy would work in our setup, with a.tile.osm.org redirecting to a simple
proxy server at a different hoster (although in London, too). It is too early
to say much yet, but the cache hit ratios do seem lower than I would have
hoped, with only about 40 - 60% of requests successfully being served by the
proxy without needing to contact the main server.
( http://munin.openstreetmap.org/openstreetmap/konqi.openstreetmap.html#Squid
for reference )

We will need to see how this all pans out, but I would guess it will depend
on resources donated to osmf to make some of this happen and to ensure that
the tile serving infrastructure can be expanded in the future.

Kai

>
> -mike.
>
> ----------------------------------------------------------------
> michal migurski- [email protected]
> 415.558.1610
>

_______________________________________________
talk mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/talk

