On Mon, Apr 9, 2012 at 6:56 AM, Stefan Keller <[email protected]> wrote:
>
> A while ago I proposed the idea of SQLite as "The Shapefile of the
> future?" - and I'm still supporting it, especially the Spatialite
> extension.


The idea of SQLite as a "shapefile" has merit, but SQLite per se has
limits that make it a suboptimal choice, and it is missing some
useful features. As a self-contained database engine, it was not
designed to serve as a geospatial storage format (e.g. an R-tree is
a poor index choice for this purpose).

We implement our own import/export format, which we will freeze and
open source at some point. We convert some of the other popular
formats to this format before doing anything with the data. There are
two things that we needed as a practical operational matter that are
difficult to find in common geospatial data file formats:

- The ability to deal with really huge data sets. Routinely wrangling
countless terabytes of spatial data import/export requires a format
that is amenable to slicing, dicing, concatenating, etc. giant files
with minimal muss and fuss in order to comply with various limits of
systems and to parallelize processing. This puts some design
requirements on the internal structure of the files.

- Read and write I/O throughput. Many storage formats are badly CPU
bound on any decent storage system or have poor I/O access patterns
that limit I/O throughput. Few if any common geospatial data formats
were designed with this in mind but it is a major bottleneck. Not a
big deal if you are dealing with a few gigabytes of data but it
approaches intractability once you start dealing with really large
quantities of data.
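
To make the first requirement concrete, here is a minimal sketch of
one layout that can be concatenated and sliced cheaply. It is purely
illustrative and is not the format described above: the
length-prefixed framing and the names write_records, read_records and
split_records are assumptions made up for this example. Each record
is a 4-byte little-endian length followed by an opaque payload, so
two files can be joined by plain byte concatenation and a file can be
cut at any record boundary.

```python
import io
import struct

# Hypothetical framing (an assumption for illustration only):
# 4-byte little-endian payload length, then the payload bytes.
_LEN = struct.Struct("<I")


def write_records(stream, payloads):
    """Append each payload to the stream as one length-prefixed record."""
    for payload in payloads:
        stream.write(_LEN.pack(len(payload)))
        stream.write(payload)


def read_records(stream):
    """Yield payloads sequentially until the stream is exhausted."""
    while True:
        header = stream.read(_LEN.size)
        if not header:
            return
        if len(header) < _LEN.size:
            raise ValueError("truncated record header")
        (length,) = _LEN.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated record payload")
        yield payload


def split_records(data, max_bytes):
    """Cut a byte string into chunks at record boundaries.

    Every chunk is itself a valid file in this framing. A single
    record larger than max_bytes becomes its own oversized chunk.
    """
    chunks, start, pos = [], 0, 0
    while pos < len(data):
        (length,) = _LEN.unpack_from(data, pos)
        end = pos + _LEN.size + length
        # Close the current chunk before it would exceed the limit.
        if end - start > max_bytes and pos > start:
            chunks.append(data[start:pos])
            start = pos
        pos = end
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Because there is no global header or index to rewrite, chunks
produced this way can be handed to independent workers and the
results joined again with a byte-level concatenation. A real format
would also need things this sketch leaves out, such as a schema,
checksums and a defined geometry encoding.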


In short, current formats are not designed to scale to practical use
cases. For us, the easiest and most efficient solution was to roll our
own without much consideration for existing standards. Most of the
standards seem to be designed, either by legacy or intent, for trivial
amounts of geo data that are infrequently processed.

I'd be interested in a practical and scalable standard for moving geo
data around if one is actively being developed.


-- 
J. Andrew Rogers

_______________________________________________
Geowanking mailing list
[email protected]
http://geowanking.org/mailman/listinfo/geowanking_geowanking.org
