Dear all,

On our Ubuntu server we are about to reorganize our GIS data in order to 
develop a more efficient and consistent solution for data storage in a mixed 
GIS environment.
By "mixed GIS environment" I mean that we have people working with GRASS, QGIS 
and PostGIS, but also many using R, and perhaps the largest fraction using ESRI 
products; furthermore, we have people using ENVI, ERDAS and some others. Only a 
few people (like me) actually work directly on the server...
Until now I have stored "my" data mainly in the GRASS (6/7) native format, which 
I have been very happy with. But I guess our ESRI and PostGIS people would not 
accept that as a standard...

However, especially for time series data, we cannot keep several copies in 
different formats (tailor-made for each and every software package).

So I started thinking: what would be the most efficient and convenient solution 
for storing a large amount of data (e.g. high-resolution raster and vector data 
with national extent, plus time series data) in a way that makes it accessible 
to all (or at least most) remote users with different GIS software? As I am very 
fond of the temporal framework in GRASS 7, one precondition is that I can use 
those tools on the data without unreasonable performance loss. Another 
precondition is that users at remote computers in our (MS Windows) network can 
access the data.

In general, four options come to mind:

a) Stick to the GRASS native format and keep one copy in another format

b) Use the native formats the data come in (e.g. temperature and 
precipitation data come as zipped ASCII grids)

c) Use PostGIS as a backend for data storage (raster/vector), linked via 
r.external / v.external

d) Use another GDAL/OGR format for data storage (raster/vector), linked via 
r.external / v.external
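For options c) and d), the linking would look roughly like this (a sketch only; 
all paths, database names and map names below are made up for illustration):

```
# Option d: register a GeoTIFF (or any GDAL-readable raster) in the
# current mapset without importing/copying it
r.external input=/data/rasters/temp_2000_01.tif output=temp_2000_01

# Option c (and d for vectors): link an OGR/PostGIS layer without importing
v.external input="PG:dbname=gisdb host=ourserver" layer=roads output=roads

# Optionally, have new raster output written directly to an external
# GDAL format instead of the GRASS native format
r.external.out directory=/data/rasters format=GTiff extension=tif
```

The linked maps then behave like ordinary GRASS maps in most modules, while the 
data itself stays in the shared external store that the ESRI/QGIS/R users read.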

My questions are:
What solution could you recommend, or which solution did you choose?
Who has experience with this kind of data-management challenge?
How do externally linked data series perform compared to the GRASS native format?

I searched the mailing list a bit and found this thread: 
(http://osgeo-org.1560.x6.nabble.com/GRASS7-temporal-GIS-database-questions-td5054920.html)
 where Sören recommended "postgresql as temporal database backend". However, I 
am not sure whether that referred only to the temporal metadata and not to the 
rasters themselves...
Furthermore, in the idea collection for the temporal framework 
(http://grasswiki.osgeo.org/wiki/Time_series_development, "Open issues" section), 
limitations were mentioned regarding the number of files in a folder, which 
could be a problem for file-based storage. The ext2 file system had a "soft" 
upper limit of about 10-15k files in a single directory, though theoretically 
many more were possible. Other file systems may allow for more, I guess... Will 
using such big directories (> 10,000 files) lead to performance problems?
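One way to get a feel for this on a given server is to create a throwaway 
directory with many files and time a listing (plain shell, nothing GRASS- 
specific; the file count and name pattern are arbitrary):

```shell
# Probe large-directory behaviour on this filesystem:
# create 15,000 empty files in a temporary directory and time a listing.
d=$(mktemp -d)
cd "$d"
seq -f "map_%05g" 1 15000 | xargs touch
count=$(ls | wc -l)
echo "created $count files in $d"
time ls > /dev/null
cd - > /dev/null
rm -rf "$d"
```

On modern filesystems (ext4 with dir_index, XFS) this is usually unproblematic, 
but it is worth checking on the actual storage backend before committing to a 
per-file layout for long time series.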

The "Working with external data in GRASS 7" wiki entry 
(http://grasswiki.osgeo.org/wiki/Working_with_external_data_in_GRASS_7) covers 
the technical part (and, to some degree, performance issues) very well. Would it 
be worth adding a section on the strategic considerations / pros and cons of 
using external data? Or is that too user- and format-dependent?

Thanks for any feedback or thoughts on this topic...

Cheers
Stefan



_______________________________________________
grass-user mailing list
grass-user@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-user
