I am obviously a novice when it comes to raster handling, but this sounds
reasonable. As I understand it, the approach is to play it safe by imposing
limits based on heuristics, while leaving the admin able to change or drop
them in cases where they have ensured the data is properly set up for
streaming/tiling.
On Tue, Sep 21, 2010 at 8:04 AM, Andrea Aime <[email protected]> wrote:
> Hi,
> this week I'm going to try to implement some WCS limits
> for people who want to serve large amounts of raster data
> without the headache of a user hitting the server hard with
> a very large request.
>
> Generally speaking we want to avoid the following:
> - the request ends up reading too much data (e.g., we don't
> want a user to make a request that will make the server
> read 10TB of data)
> - the request ends up using too much memory
> - the request ends up generating too large a response (e.g.,
> again, we don't want the server to generate a 10GB response)
>
> Providing a general solution to the problem can be very complex:
> * a tiled data source may allow streaming reads by tile,
> allowing for small memory usage while still reading a
> truckload of data
> * a tiled output format can ensure that nowhere in the chain
> is the whole image composed in memory
> * however, a non-tiled input or a non-tiled output will at
> some point cause the image to be built fully in memory
> * where and when is difficult to say, e.g., we might read
> a small amount of data from the input and then build a
> huge raster in memory because the user is supersampling
> (asking for a higher than native resolution) and the output
> format is not tile enabled
> * the MB read on input and written on output are again difficult
> to control because of format and compression differences
> * the WCS right now does not use overviews, but it might in
> the future
> Long story short, trying to control the actual amount of data
> read, kept in memory and generated in output is beyond our
> reach.
>
> What I'm going to propose is a simplified compromise based
> on a worst case scenario.
> We allow the administrator to set up a maximum number of MB to be
> read, and a maximum number of MB to be generated in output.
> The measure in MB is computed for an equivalent single-tile,
> uncompressed situation (assuming everything has to be
> read or generated in one shot):
> width * height * bands * band_size
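
A minimal sketch of the formula above, in Python for brevity. The function
names and the choice of 1 MB = 1024 * 1024 bytes are assumptions made for
illustration; band_size is taken to be the size in bytes of one sample of
one band.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def equivalent_size_bytes(width, height, bands, band_size):
    """Worst-case size of a raster treated as one uncompressed tile.

    band_size is the number of bytes used by a single sample of one band.
    """
    for name, value in (("width", width), ("height", height),
                        ("bands", bands), ("band_size", band_size)):
        if value < 0:
            raise ValueError(f"{name} must not be negative, got {value}")
    return width * height * bands * band_size


def equivalent_size_mb(width, height, bands, band_size):
    """Same measure, expressed in MB."""
    return equivalent_size_bytes(width, height, bands, band_size) / BYTES_PER_MB
```

For instance, equivalent_size_mb(1024, 1024, 1, 1) gives 1.0 for a
1024x1024 single band, 8bit raster.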
>
> This simplifying assumption ensures that we never read,
> keep in memory or generate more than the limits allow:
> normally (hopefully) we'll actually use less.
>
> The distinction between input and output is there because
> normally WCS requests perform some resampling and read
> inputs at a resolution different from that of the outputs, and
> in some setups the admin can play on the difference to
> relax at least the input limits.
> In particular, if the admin can ensure all raster sources
> are tiled, it's possible to relax the input limits as
> the tiled sources will never load the full data in memory,
> and the WCS processing chain ensures that at worst the
> data is recomposed in a single tile if the output format
> cannot deal with inner tiling.
> Even in a setup where all the sources are tiled and the output
> format list has been somehow modified to allow only tiling formats
> (atm, only geotiff) it would still be good to set up some
> (large) limits to avoid disk or network flooding for long periods
> of time.
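
One way the separate input and output limits could be checked is sketched
below. This is not GeoServer code: RasterExtent, LimitExceeded and
check_limits are hypothetical names, and treating a limit of 0 as
"disabled" is an assumed convention, used here to show how an admin with
fully tiled sources might relax the input side only.

```python
from dataclasses import dataclass

BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


@dataclass(frozen=True)
class RasterExtent:
    """Hypothetical description of the raster area read or generated."""
    width: int
    height: int
    bands: int
    band_size: int  # bytes per sample of one band

    def equivalent_bytes(self):
        # Single tile, uncompressed, worst case
        return self.width * self.height * self.bands * self.band_size


class LimitExceeded(Exception):
    """Raised when a request goes over a configured limit."""


def check_limits(source, output, max_input_mb, max_output_mb):
    """Reject the request if either side exceeds its limit.

    A limit of 0 (or less) disables that check (assumed convention).
    """
    checks = (("input", source, max_input_mb),
              ("output", output, max_output_mb))
    for label, extent, limit_mb in checks:
        if limit_mb <= 0:
            continue  # the admin chose not to limit this side
        if extent.equivalent_bytes() > limit_mb * BYTES_PER_MB:
            raise LimitExceeded(
                f"{label} would be {extent.equivalent_bytes()} bytes, "
                f"over the {limit_mb} MB limit")
```

With this shape, a supersampling request that reads little but generates a
huge raster is caught by the output check alone, even when the input check
is disabled.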
>
> Let's make an example. If we set a 200MB limit on input
> and a 20MB limit on output, then the following will be considered
> valid:
> * a request that makes GS read a 14481x14481 portion of an
> 8bit single band raster
> * a request that makes GS read a 7240x7240 portion of an RGBA
> (or other 4 band) image
> * a request that makes GS generate a 4579x4579 8bit raster in
> output
> * a request that makes GS generate a 2289x2289 4byte raster in
> output (RGBA, or a single band, floating point, single precision
> one).
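
The figures in this example can be rechecked with a short Python sketch.
It assumes 1 MB = 1024 * 1024 bytes and rounds the square side down;
max_square_side is a hypothetical helper name.

```python
import math

BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def max_square_side(limit_mb, bands, band_size):
    """Largest side of a square raster that fits within limit_mb.

    Uses the equivalent size width * height * bands * band_size,
    with band_size in bytes per sample.
    """
    max_pixels = (limit_mb * BYTES_PER_MB) // (bands * band_size)
    return math.isqrt(max_pixels)  # integer square root, rounded down


if __name__ == "__main__":
    print(max_square_side(200, 1, 1))  # 200MB input, 8bit single band
    print(max_square_side(200, 4, 1))  # 200MB input, 8bit RGBA
    print(max_square_side(20, 1, 1))   # 20MB output, 8bit single band
    print(max_square_side(20, 1, 4))   # 20MB output, one 4byte band
```

Under these assumptions the four cases come out at 14481, 7240, 4579 and
2289 pixels per side.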
>
> This should give the administrator some control and safety without
> forcing us to consider all the possibilities of formats, compressions,
> and tiling arrangements. Using equivalent MB also summarizes well the
> many possible setups of width, height, bands and band sizes in a single
> number that can be set WCS wide and provide peace of mind to the
> administrator (and stability to the server).
>
> Opinions?
>
> Cheers
> Andrea
>
> --
> Andrea Aime
> OpenGeo - http://opengeo.org
> Expert service straight from the developers.
>
>
> _______________________________________________
> Geoserver-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/geoserver-devel
>
--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.