On Mon, Apr 09, 2012 at 09:04:51AM -0700, J. Andrew Rogers wrote:
> On Mon, Apr 9, 2012 at 6:56 AM, Stefan Keller <[email protected]> wrote:
> >
> > A while ago I proposed the idea of SQLite as "The Shapefile of the
> > future?" - and I'm still supporting it, especially the Spatialite
> > extension.
> 
> 
> The idea of SQLite as a "shapefile" has merit but SQLite per se has
> limits that would make it a suboptimal choice and is missing some
> useful features. As a self-contained database engine it was not
> designed for the use case of being a geospatial storage format (e.g.
> R-tree is a poor index choice for this purpose).
> 
> We implement our own import/export format, which we will freeze and
> open source at some point. We convert some of the other popular
> formats to this format before doing anything with the data. There are
> two things that we needed as a practical operational matter that are
> difficult to find in common geospatial data file formats:
> 
> - The ability to deal with really huge data sets. Routinely wrangling
> countless terabytes of spatial data import/export requires a format
> that is amenable to slicing, dicing, concatenating, etc. giant files
> with minimal muss and fuss in order to comply with various limits of
> systems and to parallelize processing. This puts some design
> requirements on the internal structure of the files.
> 
> - Read and write I/O throughput. Many storage formats are badly CPU
> bound on any decent storage system or have poor I/O access patterns
> that limit I/O throughput. Few if any common geospatial data formats
> were designed with this in mind but it is a major bottleneck. Not a
> big deal if you are dealing with a few gigabytes of data but it
> approaches intractability once you start dealing with really large
> quantities of data.
> 
> 
> In short, current formats are not designed to scale to practical use
> cases. For us, the easiest and most efficient solution was to roll our
> own without much consideration for existing standards. Most of the
> standards seem to be designed, either by legacy or intent, for trivial
> amounts of geo data that is infrequently processed.
> 
> I'd be interested in a practical and scalable standard for moving geo
> data around if one is actively being developed.
> 
> 
> -- 
> J. Andrew Rogers
> 
> _______________________________________________
> Geowanking mailing list
> [email protected]
> http://geowanking.org/mailman/listinfo/geowanking_geowanking.org

My efforts were guided by some of these same ideas. I wanted it to be built 
around a key-value store, and I wanted it to be aimed at large-scale, 
eventually-consistent management system architectures (along the lines of 
Mongo, Couch, etc.) yet still designed for feature and attribute data and 
useful without a DBMS (via cli tools, etc.).

Could you talk about some of the alternatives you explored? I'd be interested 
to hear what you tried and learned.

-R.
