Hi, this week I'm going to try and implement some WCS limits for people who want to serve large amounts of raster data without the headache of a user hitting the server hard with a very large request.
Generally speaking we want to avoid that:
- the request ends up reading too much data (e.g., we don't want a user to make a request that forces the server to read 10TB of data)
- the request ends up using too much memory
- the request ends up generating too large a response (e.g., again, we don't want the server to generate a 10GB response)

Providing a general solution to the problem can be very complex:
* a tiled data source may allow streaming reads by tiles, keeping memory usage small while still reading a truckload of data
* a tiled output format can ensure that nowhere in the chain is the whole image composed in memory
* however, a non tiled input or a non tiled output will at some point force the image to be built fully in memory
* where and when is difficult to say, e.g., we might read a small amount of data from the input and then build a huge raster in memory because the user is supersampling (asking for a higher than native resolution) and the output format is not tile enabled
* the MB read during input and output are again difficult to control because of format and compression differences
* the WCS right now does not use overviews, but it might in the future

Long story short, trying to control the actual amount of data read, kept in memory and generated in output is beyond our reach. What I'm going to propose is a simplified compromise based on a worst case scenario.

We allow the administrator to set up a maximum number of MB to be read, and a maximum number of MB to be generated in output. The measure in MB is computed as an equivalent single tile, uncompressed situation (assuming everything has to be read or generated in one shot):

width * height * bands * band_size

This simplifying assumption ensures we are never going to have more than the limits read, kept in memory or generated: normally (hopefully) we'll actually have less.

The distinction between input and output is there because WCS requests normally perform some resampling and read inputs at a resolution different than the outputs, and in some setups the admin can play on that difference to relax at least the input limits. In particular, if the admin can ensure all raster sources are tiled, it's possible to relax the input limits, as tiled sources will never load the full data in memory, and the WCS processing chain ensures that, at worst, the data is recomposed into a single tile if the output format cannot deal with inner tiling. Even in a setup where all the sources are tiled and the output format list has been somehow restricted to tiling-capable formats (atm, only GeoTIFF) it would still be good to set up some (large) limits to avoid flooding the disk or the network for long amounts of time.

Let's make an example. If we set a 200MB limit on input and a 20MB limit on output, then the following will be considered valid:
* a request that makes GS read a 14481x14481 portion of an 8bit, single band raster
* a request that makes GS read a 7240x7240 portion of a RGBA (or other 4 band) image
* a request that makes GS generate a 4579x4579 8bit raster in output
* a request that makes GS generate a 2290x2290 4-byte-per-pixel raster in output (RGBA, or a single band, single precision floating point one)

This should give the administrator some control and safety without forcing us to consider all the possibilities of formats, compressions, and tiling arrangements.
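To make the check concrete, here is a minimal sketch of how the worst-case estimate could be computed and enforced. The class and method names (WcsLimitChecker, checkInput, checkOutput) are made up for illustration only and are not an existing GeoServer API; MB is taken as 1024*1024 bytes, matching the examples above.

public class WcsLimitChecker {

    private static final long MB = 1024 * 1024;

    /** Maximum equivalent uncompressed MB the server is allowed to read (0 disables the check). */
    private final long maxInputMB;

    /** Maximum equivalent uncompressed MB the server is allowed to generate (0 disables the check). */
    private final long maxOutputMB;

    public WcsLimitChecker(long maxInputMB, long maxOutputMB) {
        this.maxInputMB = maxInputMB;
        this.maxOutputMB = maxOutputMB;
    }

    /**
     * Worst case size in bytes, assuming a single uncompressed tile:
     * width * height * bands * band_size
     */
    static long worstCaseBytes(long width, long height, int bands, int bandSizeBytes) {
        return width * height * bands * bandSizeBytes;
    }

    /** Checks the raster area the request would force the server to read. */
    public void checkInput(long width, long height, int bands, int bandSizeBytes) {
        long bytes = worstCaseBytes(width, height, bands, bandSizeBytes);
        if (maxInputMB > 0 && bytes > maxInputMB * MB) {
            throw new IllegalArgumentException("This request would read " + (bytes / MB)
                    + "MB (uncompressed, single tile equivalent), exceeding the "
                    + maxInputMB + "MB input limit");
        }
    }

    /** Checks the raster the request would make the server generate in output. */
    public void checkOutput(long width, long height, int bands, int bandSizeBytes) {
        long bytes = worstCaseBytes(width, height, bands, bandSizeBytes);
        if (maxOutputMB > 0 && bytes > maxOutputMB * MB) {
            throw new IllegalArgumentException("This request would generate " + (bytes / MB)
                    + "MB of output (uncompressed, single tile equivalent), exceeding the "
                    + maxOutputMB + "MB output limit");
        }
    }
}

With limits of 200 and 20, checkInput(14481, 14481, 1, 1) and checkOutput(4579, 4579, 1, 1) from the examples above go through, while something like checkOutput(10000, 10000, 1, 1) (roughly 95MB equivalent) would be rejected before any rendering work is done.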
Using equivalent MB also summarizes the many possible combinations of width, height, bands and band sizes into a single number that can be set WCS wide, providing peace of mind to the administrator (and stability to the server).

Opinions?

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.
