On Sun, Jun 01, 2008 at 10:00:22AM +0200, Gerhard Schmidt wrote:
> 1. A Client requests a tile from [EMAIL PROTECTED]
> 
> One of the Reflector servers (chosen per round robin or by some
> intelligent code in the OSM Javascript code) gets the request, looks the
> tile up in the Database and returns the location of the tile via
> redirect to the client, who connects to the indicated TileServer and
> gets the Tile there.
>
>
> 2. A [EMAIL PROTECTED] client uploads a Tile package
> 
> The [EMAIL PROTECTED] client requests an upload from a web service at one of
> the Database servers. The web service responds with an upload cookie and the
> address of one of the TileServers which is able to store more tiles.
> 
> The [EMAIL PROTECTED] client starts the upload of his tile to this server and
> goes on with the next render request.
> 
> The TileServer extracts the tiles in a new directory on the server and
> updates the location for the uploaded tiles in the Main Database.
> 
> Each TileServer runs a garbage collection at some interval to find
> tiles that are no longer referenced in the Database and deletes them.

Sending every tile request through a database sounds horrible for
speed, in my eyes.

I'd wish the redirector could easily determine the location by e.g.
hashing the tile's x and y, taking the first 2 bytes (i.e. 2^16 chunks
of data) and deciding which tileserver contains this chunk of tiles.
This mechanism could even be embedded into the map javascript code,
so there would be no need for a redirector.
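
A minimal sketch of that hashing step, in Python rather than javascript
just to keep it short. The function name tile_chunk and the way x and y
are packed into bytes are my assumptions, any fixed encoding would do:

```python
import hashlib
import struct


def tile_chunk(x: int, y: int) -> int:
    """Map tile coordinates to one of 2^16 chunks."""
    # Assumption: x and y are packed as two unsigned 32-bit big-endian
    # integers before hashing.
    digest = hashlib.md5(struct.pack(">II", x, y)).digest()
    # The first 2 bytes of the digest give a chunk number in 0..65535.
    return int.from_bytes(digest[:2], "big")
```

The same tile always lands in the same chunk, and neighbouring tiles
end up spread over different chunks.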

So in the end you split 2^16 chunks of tiles across n machines - you
now only need a map with 2^16 entries describing where to find each
chunk. The point of using a hash is better load and data distribution.
Or, even more hackish, put the map into the DNS, e.g.

th0000.ts.informationfreeway.org
th0001.ts.informationfreeway.org
...
thffff.ts.informationfreeway.org
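
Building such a name from a chunk number is a one-liner. A rough sketch,
where chunk_hostname is a made-up helper name and the domain is just the
one from the example names above:

```python
def chunk_hostname(chunk: int, domain: str = "ts.informationfreeway.org") -> str:
    """Return the DNS name under which a chunk's tileserver is published."""
    if not 0 <= chunk < 2**16:
        raise ValueError("chunk must be in 0..65535")
    # Four lowercase hex digits, zero padded: th0000 .. thffff
    return f"th{chunk:04x}.{domain}"
```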

Requesting a tile is then done by the javascript code running e.g.
an MD5 or an even cheaper hash over the tile's x and y and using the
result to construct the URL. Adding more machines would mean putting
them into the cluster, deciding which chunks of data to move, copying
the data and, once done, switching the DNS over to the new machine.
Adding more power to a single chunk could be done by adding multiple
IP addresses to its DNS entry. DNS is most likely one of the oldest
distributed data systems, and it has proven to work very reliably.
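
To illustrate the "deciding which chunks of data to move" step, here is
one possible way to do it, not the only one. The server names and both
function names are hypothetical, and the policy (take chunks from
whoever holds more than a fair share) is my assumption:

```python
from collections import Counter

NUM_CHUNKS = 2**16


def initial_map(servers):
    """Spread the chunks round-robin over the given servers."""
    return [servers[c % len(servers)] for c in range(NUM_CHUNKS)]


def add_server(chunk_map, new_server):
    """Return (new_map, moved_chunks) after giving new_server a fair share."""
    load = Counter(chunk_map)
    # Fair share once the new machine has joined the cluster.
    target = NUM_CHUNKS // (len(load) + 1)
    new_map = list(chunk_map)
    moved = []
    for chunk, owner in enumerate(chunk_map):
        if len(moved) == target:
            break
        # Only take chunks from servers holding more than a fair share.
        if load[owner] > target:
            load[owner] -= 1
            new_map[chunk] = new_server
            moved.append(chunk)
    return new_map, moved


old_map = initial_map(["ts1", "ts2", "ts3"])
new_map, moved = add_server(old_map, "ts4")
```

Only the chunks listed in moved need copying and a DNS change, all
other entries stay where they are.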

Uploading would work as currently established, except that the upload
server only pushes the tiles out to the appropriate tileserver(s) noted
in the hash map.
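
Roughly, the upload server would sort an unpacked tile package by
destination before pushing. A sketch under assumptions: plan_upload is
a made-up name, tiles are plain (x, y) tuples, chunk_map is a list of
2^16 server names, and the hash is MD5 over x and y packed as two
32-bit integers:

```python
import hashlib
import struct
from collections import defaultdict


def plan_upload(tiles, chunk_map):
    """Group uploaded tiles by the tileserver that owns their chunk."""
    by_server = defaultdict(list)
    for x, y in tiles:
        digest = hashlib.md5(struct.pack(">II", x, y)).digest()
        # First 2 bytes of the hash select one of the 2^16 chunks.
        chunk = int.from_bytes(digest[:2], "big")
        by_server[chunk_map[chunk]].append((x, y))
    return dict(by_server)
```

The actual push to each tileserver is left out here, since that part
would stay as it is today.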

From what I understood the bottleneck is currently disk I/O, or rather
metadata I/O, as the files get unpacked. This could be solved by
distributing either across different filesystems or even across
different machines.

I think fixing the upload queue problem by loading tiles up to a more
or less random tileserver and letting the frontend fix the location
problem is a long term dead end. The success of OSM will, in the long
term, shift load from updating to serving tiles, and serving gets more
expensive with this approach.

Flo
-- 
Florian Lohoff                  [EMAIL PROTECTED]             +49-171-2280134
        Those who would give up a little freedom to get a little 
          security shall soon have neither - Benjamin Franklin

_______________________________________________
Tilesathome mailing list
[email protected]
http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/tilesathome