[OSGeo-Discuss] Benefits raster data on RDBMS

2008-11-03 Thread Gilberto Camara

Dear all

Concerning the benefits of having raster data
stored together with vector data in a spatial
database, let me first quote from an excellent
paper from the late Jim Gray
(Scientific Data Management in the Coming Decade):

  What’s wrong with files?
   Everything builds from files as a base. HDF uses files.
   Database systems use files. But, file systems have no
   metadata beyond a hierarchical directory structure and file
   names. They encourage a do-it-yourself-data-model that
   will not benefit from the growing suite of data analysis
   tools. They encourage do-it-yourself-access-methods that
   will not do parallel, associative, temporal, or spatial
   search. They also lack a high-level query language.
   Lastly, most file systems can manage millions of files, but
   by the time a file system can deal with billions of files, it
   has become a database system.

In other words, if you have substantial amounts of raster
data (as is increasingly the case in geospatial applications),
you will need to develop a significant amount of software
to manage your files. Unless... your data is handled by a
raster-enabled spatial database.

Best Regards
Gilberto

--
===
Dr.Gilberto Camara
Director General
National Institute for Space Research (INPE)
Sao Jose dos Campos, Brazil

voice: +55-12-3945-6035
fax:   +55-12-3921-6455
web:   http://www.dpi.inpe.br/gilberto
blog:  http://techne-episteme.blogspot.com/


___
Discuss mailing list
Discuss@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/discuss


Re: [OSGeo-Discuss] Benefits raster data on RDBMS

2008-11-03 Thread Christopher Schmidt
On Mon, Nov 03, 2008 at 10:13:49AM -0200, Gilberto Camara wrote:
> Dear all
>
> Concerning the benefits of having raster data
> stored together with vector data in a spatial
> database, let me first quote from an excellent
> paper from the late Jim Gray
> (Scientific Data Management in the Coming Decade):
>
>   What’s wrong with files?
>    Everything builds from files as a base. HDF uses files.
>    Database systems use files. But, file systems have no
>    metadata beyond a hierarchical directory structure and file
>    names. They encourage a do-it-yourself-data-model that
>    will not benefit from the growing suite of data analysis
>    tools. They encourage do-it-yourself-access-methods that
>    will not do parallel, associative, temporal, or spatial
>    search. They also lack a high-level query language.
>    Lastly, most file systems can manage millions of files, but
>    by the time a file system can deal with billions of files, it
>    has become a database system.
>
> In other words, if you have substantial amounts of raster
> data (as is increasingly the case in geospatial applications),
> you will need to develop a significant amount of software
> to manage your files. Unless... your data is handled by a
> raster-enabled spatial database.

I don't see anything in that paragraph that indicates that storing the
*image data* in the database is important. (A link to the paper online
or something could change that, of course.) Specifically, I don't think
there's any doubt that if you have a great many files, it makes sense to
store the *queryable image information* -- things like spatial extent,
temporal extent, etc. -- in a database. The question is, in the
data column, do you store a file path or the image data? Until/unless
databases have image manipulation tools built in, I can't see the
value of storing the image data itself in the database.

The points above argue against file-system based metadata
storage/retrieval: sorting files by date, searching through index files,
etc., so far as I can tell, but I don't see a compelling argument for
image data in the database above.
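
The arrangement discussed above (queryable metadata in the database,
pixels left on disk) can be sketched roughly as follows. This is only an
illustration under stated assumptions: the table name image_catalog, its
columns, the file paths and the sample rows are all hypothetical, and
SQLite with plain bounding-box columns stands in for a real spatial
database with a spatial index.

```python
import sqlite3

# Hypothetical sample metadata: (path, min_x, min_y, max_x, max_y, acquired).
# The paths are never opened here; only the catalog is queried.
SAMPLE_ROWS = [
    ("/data/scenes/a.tif", -50.0, -25.0, -45.0, -20.0, "2008-01-15"),
    ("/data/scenes/b.tif", -46.0, -24.0, -41.0, -19.0, "2008-06-01"),
    ("/data/scenes/c.tif", 10.0, 40.0, 15.0, 45.0, "2008-06-02"),
]


def build_catalog(rows):
    """Create an in-memory catalog holding file paths, not image data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE image_catalog (
               id INTEGER PRIMARY KEY,
               path TEXT NOT NULL,
               min_x REAL, min_y REAL, max_x REAL, max_y REAL,
               acquired TEXT)"""
    )
    conn.executemany(
        "INSERT INTO image_catalog "
        "(path, min_x, min_y, max_x, max_y, acquired) VALUES (?, ?, ?, ?, ?, ?)",
        rows,
    )
    return conn


def find_images(conn, bbox, start, end):
    """Return paths of images whose extent intersects bbox within [start, end]."""
    min_x, min_y, max_x, max_y = bbox
    cur = conn.execute(
        """SELECT path FROM image_catalog
           WHERE max_x >= ? AND min_x <= ?
             AND max_y >= ? AND min_y <= ?
             AND acquired BETWEEN ? AND ?
           ORDER BY acquired""",
        (min_x, max_x, min_y, max_y, start, end),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    catalog = build_catalog(SAMPLE_ROWS)
    # The caller would then open the returned paths with its usual
    # raster library; the database only answers "which files?".
    print(find_images(catalog, (-47.0, -23.0, -44.0, -21.0),
                      "2008-01-01", "2008-12-31"))
```

The query answers the spatial and temporal search; reading the pixels
stays with whatever file-based library is already in use.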

Of course, this is assuming that the image data access pattern is the
same in the database and on disk: for example, storing GeoTIFF data,
then using GDAL to parse the string from the database as a GeoTIFF file.
If the database you're using has a different (faster) image access
algorithm, then of course there can be benefits. However, those same
benefits could presumably be realized with sufficiently complete
libraries for accessing the image externally: if Oracle's database
product, for example, internally tiles the image, and they had a library
to access the image in the same way, presumably you could store those
bits on disk as well. However, if that library depends internally on a
database, then integration of all points into the same database might
help in some ways.
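
To illustrate why tiling is independent of where the bits live, here is
a minimal sketch of the lookup a tiled store has to perform: given a
requested pixel window, work out which fixed-size tiles must be read.
The function name tiles_for_window and the 256-pixel tile size are
assumptions made for the example, not a description of how Oracle or any
other product lays out its rasters. The same arithmetic applies whether
each tile is a database row or a block inside a file on disk.

```python
def tiles_for_window(col_off, row_off, width, height, tile_size=256):
    """Return (tile_col, tile_row) indices covering a pixel window.

    The window starts at pixel (col_off, row_off) and spans
    width x height pixels. tile_size is a hypothetical square tile
    edge length in pixels.
    """
    if width <= 0 or height <= 0:
        raise ValueError("window must have positive width and height")
    if col_off < 0 or row_off < 0 or tile_size <= 0:
        raise ValueError("offsets must be >= 0 and tile_size > 0")

    # Index of the first and last tile touched along each axis.
    first_col = col_off // tile_size
    last_col = (col_off + width - 1) // tile_size
    first_row = row_off // tile_size
    last_row = (row_off + height - 1) // tile_size

    # Row-major order, so tiles come back in the order they would
    # typically be stored.
    return [
        (tile_col, tile_row)
        for tile_row in range(first_row, last_row + 1)
        for tile_col in range(first_col, last_col + 1)
    ]


if __name__ == "__main__":
    # A 100x10 window starting at column 200 straddles two tiles.
    print(tiles_for_window(200, 0, 100, 10))
```

Only the tiles returned need to be fetched, which is where the speed-up
over reading a whole untiled image comes from.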

In any case, I think there are obvious reasons to store your image
metadata in a database -- but, *using the same tools for accessing the
images*, I don't think we've yet seen a compelling argument for storing
image blobs in the database. Of course, all things are not equal :)
If your database has built-in MrSID support, for example, you could
imagine using database storage for images, because you'd get the
automatic compression combined with the querying -- but that's not about
the database specifically, just the image storage/reading library that
comes along with it.

Regards,
-- 
Christopher Schmidt
Web Developer