Fixed the ST_Intersects not permitting out-db bands in r10590 for 2.0.2. On Tue, Oct 30, 2012 at 7:33 AM, Bborie Park <bkp...@ucdavis.edu> wrote: > Thanks for the version info. I'll have to poke around. > > Nope. No other limitations to out-db. > > -bborie > > On Mon, Oct 29, 2012 at 5:25 PM, James Hiebert <hieb...@uvic.ca> wrote: >> => select version(), postgis_full_version(), postgis_raster_lib_version(); >> >> PostgreSQL 9.1.5 on x86_64-pc-linux-gnu, compiled by x86_64-linux-gnu-gcc >> (Gentoo 4.4.6-r1 p1.0, pie-0.4.5) 4.4.6, 64-bit | POSTGIS="2.0.1 r9979" >> GEOS="3.3.3-CAPI-1.7.4" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.1, >> released 2012/05/15" LIBXML="2.8.0" LIBJSON="UNKNOWN" (core procs from >> "2.0.0 r9605" need upgrade) RASTER (raster procs from "2.0.0 r9605" need >> upgrade) | 2.0.1 r9979 >> >>> out-db rasters does have the limitation that they are read-only. >> >> Good to know; shouldn't be a problem for us as model output is fundamentally >> immutable. Any other limitations that I should be aware of? >> >> ~James >> >> On Mon, Oct 29, 2012 at 05:05:03PM -0700, Bborie Park wrote: >>> Wow. What version of PostGIS are you running? >>> >>> Great to hear that the out-db works for you. I always expected that >>> out-db would work better for rasters with large numbers of bands. >>> out-db rasters does have the limitation that they are read-only. >>> >>> -bborie >>> >>> On 10/29/2012 05:02 PM, James Hiebert wrote: >>> >> I believe ST_Intersects() works on out-of-db rasters in the 2.0 series, >>> >> possibly 2.0.1. >>> > >>> > Hmmm, for me it it fails for the (raster, integer, geometry) signature: >>> > >>> > raster_test=> SELECT rid FROM basins INNER JOIN bcsd ON >>> > ST_Intersects(rast, 1, the_geom) WHERE rid = 39; >>> > ERROR: rt_raster_intersects not implemented yet for OFFDB bands >>> > CONTEXT: PL/pgSQL function "_st_intersects" line 20 at RETURN >>> > >>> > but it appears that you're right for the (geometry, raster, integer) >>> > signature: >>> > >>> > raster_test=> SELECT rid FROM basins INNER JOIN bcsd ON >>> > ST_Intersects(the_geom, rast, 1) WHERE rid = 39; >>> > rid >>> > ----- >>> > 39 >>> > (1 row) >>> > >>> >> I wonder what your benchmark's performance would be like if the raster >>> >> is out-db. I'd expect a flat line with little change regardless the # >>> >> of bands. >>> > >>> > Ah ha! Yes, that's definitely the case. With out of db storage, each of >>> > intersects/clip queries comes back in < 200ms, regardless of num bands. >>> > That's more of the behaviour that I was expecting, too. Thanks for >>> > helping me put a finger on it! >>> > >>> > ~James >>> > >>> > On Mon, Oct 29, 2012 at 04:33:36PM -0700, Bborie Park wrote: >>> >> I believe ST_Intersects() works on out-of-db rasters in the 2.0 series, >>> >> possibly 2.0.1. >>> >> >>> >> As for performance of in-db vs out-db, in-db is slightly faster but my >>> >> benchmarks are rather old. I hope to do some testing soon to see if I >>> >> can improve out-db performance. >>> >> >>> >> Tile size is critical regardless of whether or not you're going to store >>> >> your rasters in-db or out-db. Generally, tiles should be 100x100 or >>> >> smaller. Ideal tile size depends upon the input raster's dimensions and >>> >> what tile dimension is cleanly divisible from the raster's dimension. >>> >> >>> >> I wonder what your benchmark's performance would be like if the raster >>> >> is out-db. I'd expect a flat line with little change regardless the # >>> >> of bands. >>> >> >>> >> -bborie >>> >> >>> >> On 10/29/2012 04:23 PM, James Hiebert wrote: >>> >>>> If you've got a large number of bands (100s or more), you may want to >>> >>>> consider having the rasters be out-of-db. >>> >>> >>> >>> I had considered that (better, actually, than duplicating our data, >>> >>> right?), but was finding that st_intersects wasn't yet implemented for >>> >>> out of db storage. Looking through the trunk code, though, it appears >>> >>> that maybe you've gone ahead and implemented that since 2.0.1? If so, >>> >>> great! ST_PixelAsPoints() is another good reason for me to seriously >>> >>> consider working out of trunk... >>> >>> >>> >>>> Part of the problem is that >>> >>>> anything stored in PostgreSQL (in-db) is TOASTed so needs to be >>> >>>> deserialized (and probably decompressed). So, if the serialized raster >>> >>>> is big (more bands), the deTOASTing will take longer. >>> >>> >>> >>> Thanks; good to know. >>> >>> >>> >>>> Another problem with your benchmark query is that the ST_Clip() is >>> >>>> running twice (for height and width). >>> >>> >>> >>> Ah, that changes the picture pretty dramatically (see attached plot). >>> >>> Since it improves by a lot more than a factor of two, I suspect maybe >>> >>> I'm having some desktop scaling issues or something. I'll go ahead and >>> >>> actually put this on our database server, try the trunk version, and go >>> >>> from there. This is at least somewhat encouraging :) Thanks for the >>> >>> suggestions. >>> >>> >>> >>> ~James >>> >>> >>> >>> On Mon, Oct 29, 2012 at 03:50:04PM -0700, Bborie Park wrote: >>> >>>> James, >>> >>>> >>> >>>> I use PostGIS raster for a similar purpose (model outputs) though my >>> >>>> model outputs are for a specific day (average temperature for a >>> >>>> specific >>> >>>> date). So, one raster with one band per day per variable. I could >>> >>>> combine a year's worth of bands into one raster but I decided against >>> >>>> that. >>> >>>> >>> >>>> If you've got a large number of bands (100s or more), you may want to >>> >>>> consider having the rasters be out-of-db. Part of the problem is that >>> >>>> anything stored in PostgreSQL (in-db) is TOASTed so needs to be >>> >>>> deserialized (and probably decompressed). So, if the serialized raster >>> >>>> is big (more bands), the deTOASTing will take longer. >>> >>>> >>> >>>> Another problem with your benchmark query is that the ST_Clip() is >>> >>>> running twice (for height and width). >>> >>>> >>> >>>> If you're in the evaluation stage and you're compiling PostGIS >>> >>>> yourself, >>> >>>> I'd recommend trying SVN -trunk (will become 2.1) as it has additional >>> >>>> capabilities and performance improvements. I'm already using -trunk in >>> >>>> production as I needed the new features (full disclosure: I wrote >>> >>>> almost >>> >>>> the new features in -trunk). >>> >>>> >>> >>>> -bborie >>> >>>> >>> >>>> On 10/29/2012 03:32 PM, James Hiebert wrote: >>> >>>>> Hi All, >>> >>>>> >>> >>>>> I'm considering using PostGIS rasters for storage of raster data at >>> >>>>> my organization and I'm looking for some advice (or perhaps a reality >>> >>>>> check). I work for a region climate services provider and the vast >>> >>>>> majority of our data (by volume, not necessarily complexity) are >>> >>>>> output from climate models. These are generally a n-by-m raster with >>> >>>>> one band for each timestep. There could be upwards of 36k to 72k >>> >>>>> timesteps for a typical model run. We have hundreds of model runs. >>> >>>>> >>> >>>>> So my question is, is it insane to be thinking of storing that many >>> >>>>> bands in a PostGIS raster? Or more specifically, is this _not_ a use >>> >>>>> case for which PostGIS rasters were designed? I notice that most of >>> >>>>> the examples in the docs and in "PostGIS In Action" focus only on >>> >>>>> images and I can imagine that handling multispectral satellite images >>> >>>>> as being more of the intended use case. >>> >>>>> >>> >>>>> I did a little benchmarking of a typical use case of ours ("What's >>> >>>>> the average temperature inside a some polygon, e.g. a river basin?"). >>> >>>>> I noticed that the run time for doing a ST_Clip(raster, band, >>> >>>>> geometry) and ST_Intersects(raster, band, geometry) appears to be >>> >>>>> super-linear even when doing it on just a single band. I ran the >>> >>>>> following query: >>> >>>>> SELECT rid, st_height(st_clip(rast, 1, the_geom)), >>> >>>>> st_width(st_clip(rast, the_geom)) FROM basins INNER JOIN bcsd ON >>> >>>>> ST_Intersects(rast, 1, the_geom) WHERE rid = <rid> (where basins is >>> >>>>> table of river basins with one single polygon and bcsd is a table >>> >>>>> with a raster column "rast"). >>> >>>>> for a set of rasters with increasing number of bands, and the time to >>> >>>>> run the query is shown in the attached plot. Since the raster >>> >>>>> properties are presumably shared across all the bands, it seems odd >>> >>>>> to me that run time would increase. I would expect it to be >>> >>>>> _contant_ (with constant number of pixels), but I suppose that that's >>> >>>>> my own ignorance as to how the PG type extensions work? >>> >>>>> >>> >>>>> Comments or explanations are welcome. >>> >>>>> >>> >>>>> ~James >>> > >>> >>> -- >>> Bborie Park >>> Programmer >>> Center for Vectorborne Diseases >>> UC Davis >>> 530-752-8380 >>> bkp...@ucdavis.edu >>> _______________________________________________ >>> postgis-users mailing list >>> postgis-users@postgis.refractions.net >>> http://postgis.refractions.net/mailman/listinfo/postgis-users >> >> -- >> James Hiebert >> Lead, Computational Support >> Pacific Climate Impacts Consortium >> http://www.pacificclimate.org >> Room 112, University House 1, University of Victoria >> PO Box 1700 Sta CSC, Victoria, BC V8V 2Y2 >> E-mail: hieb...@uvic.ca >> Tel: (250) 472-4521 >> Fax: (250) 472-4830 >> _______________________________________________ >> postgis-users mailing list >> postgis-users@postgis.refractions.net >> http://postgis.refractions.net/mailman/listinfo/postgis-users > > > > -- > Bborie Park > Programmer > Center for Vectorborne Diseases > UC Davis > 530-752-8380 > bkp...@ucdavis.edu
-- Bborie Park Programmer Center for Vectorborne Diseases UC Davis 530-752-8380 bkp...@ucdavis.edu _______________________________________________ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users