That sort of calculation works well for pre-planned bulk data (bulk loads, syncs, etc.) because you can rate-limit them so they don't swamp other activity, and they're generally scheduled for when user demand is low, i.e. when excess capacity is high. For end users, though, you have to model for peak demand. Unless you can somehow control user behaviour (or modulate your capacity to match it), it doesn't much matter that you have 500 users at 4pm if your peak load is 2,500 users at 8am - you have to provision for the 2,500, which obviously subsumes the 500. Knowing your average daily/weekly total bandwidth consumption matters if your connection is metered, but it's not really a capacity issue per se.
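To make that distinction concrete, here's a quick Python sketch - the hourly profile and the per-user rate are completely made-up placeholder numbers, not measurements. Capacity is set by the worst hour; a metered bill only cares about the total:

    # Hypothetical hourly profile: (hour of day, concurrent users).
    hourly_users = [(8, 2500), (12, 1200), (16, 500), (20, 900)]

    PER_USER_KBPS = 160  # assumed average per-user demand (placeholder)

    # Capacity: provision for the peak hour, not the average.
    peak_users = max(users for _, users in hourly_users)
    capacity_mbps = peak_users * PER_USER_KBPS / 1000.0

    # Metered cost: only total volume matters.
    total_gb = sum(users * PER_USER_KBPS / 8.0 * 3600 / 1e6
                   for _, users in hourly_users)

    print("provision for ~%.0f Mbit/s (peak hour)" % capacity_mbps)
    print("~%.0f GB transferred over these hours (metered)" % total_gb)

With these numbers the 8am peak dictates ~400 Mbit/s of capacity even though the 4pm hour only needs a fifth of that.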
Since the tiles would already exist, it becomes a fairly generic problem of n image requests per second (n simultaneous GET requests, basically). Image size is roughly a function of format (jpeg/png8/png24) and tile dimensions (256x256, 512x512, or whatever).

"Empty" (ocean) tiles should not really be included, since people generally don't pan across the ocean zoomed in, so including them in your average tile size will skew the results downward. If you're really bandwidth-conscious you can make all the empty tiles links to a single image so it gets cached anyway.

What n is (the number of concurrent requests), well, I have no idea. If you want to translate the peak raw request rate into a rough estimate of concurrent users, figure out the average lag time between requests and multiply it out (Little's law: concurrent users ~ request rate x mean time between requests). So for example, if you're serving 10 KB tiles and the mean 'user wait time' between requests is 5 seconds, then 10,000 requests per second corresponds to roughly 50,000 simultaneous users (10,000 x 5) and roughly 800 megabits of throughput (10,000 req/s x 10 KB x 8 bits/byte). You also have to add some overhead for non-data connections, but based on our own servers the tiles dominate so significantly that you can probably just add a few percent scalar if you really care about it - especially since real tiles are generally larger than 10 KB (that was just easy to do math with).

That's my 2am napkin version; if anyone has a good idea of how many users to plan for, I can pull out my traffic engineering books and do a better analysis...

regards,
- bri

On Thu, Dec 10, 2009 at 4:41 PM, Stephen Woodbridge <wood...@swoodbridge.com> wrote:
> Richard Weait wrote:
>
>> So what is a reasonable guess for how much bandwidth we would like for
>> one of our initial storage nodes?
>>
>> How does this change if we add an initial tile server / tile cache node?
>>
>> How does this change if we add an initial catalog server?
>>
>> I know that when we talk about the bandwidth that we can consume the
>> best answer is "All of it" but I don't really want to put that in a
>> proposal.
>
> I think one way to approach this is to try to quantify some assumptions
> about usage. Building a simple usage model will probably help; then people
> can debate the model, and whatever people agree to from a model point of
> view will dictate the bandwidth.
>
> So something like this should get you started:
>
> o types of services:
>
>     Catalog queries
>     WMS images from something like OpenLayers
>     Bulk data uploads, bulk data downloads
>     other transactions
>
> o number of users for each service per time period
> o avg number of transactions per user per service
> o avg size of transactions
>
> o other exceptional transactions like site mirroring
>     - how much? how often?
>
> etc, etc.
>
> Then do the math, or put it in a spreadsheet, and others can tweak the
> numbers as they see fit. This will drive the discussion toward concrete
> assumptions that people need to discuss, understand, and agree upon, and
> it will give you an initial number to start working with.
>
> -Steve W
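As a starting point for the spreadsheet Steve suggests, here is a minimal Python sketch of that usage model. Every input is a placeholder assumption to be tweaked (the service mix, user counts, transaction rates and sizes - the tile row just reuses the napkin numbers from above), not a measurement:

    # Usage model per Steve's outline: for each service, assume
    # (concurrent users, transactions per user per second, bytes per transaction).
    services = {
        "catalog queries":   (500,   1 / 10.0, 2 * 1024),        # small responses
        "WMS/tile images":   (50000, 1 / 5.0,  10 * 1024),       # napkin numbers above
        "bulk up/downloads": (10,    1.0,      1 * 1024 * 1024), # rate-limited streams
    }

    total_bps = 0.0
    for name, (users, tx_per_user_s, bytes_per_tx) in services.items():
        bps = users * tx_per_user_s * bytes_per_tx * 8  # bits per second
        total_bps += bps
        print("%-18s %8.1f Mbit/s" % (name, bps / 1e6))

    OVERHEAD = 1.05  # a few percent scalar for headers/non-data connections
    print("total with overhead: %.0f Mbit/s" % (total_bps * OVERHEAD / 1e6))

With these placeholder inputs the tile traffic dominates (roughly 800 of about 950 Mbit/s total), which matches the observation above that everything else can be folded into a small overhead scalar.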