Dear all,

On our Ubuntu server we are about to reorganize our GIS data in order to 
develop a more efficient and consistent solution for data storage in a mixed 
GIS environment.
By "mixed GIS environment" I mean that we have people working with GRASS, QGIS 
and PostGIS, but also many using R, and perhaps the largest fraction using ESRI 
products; furthermore, we have people using ENVI, ERDAS and some others. Only a 
few people (like me) actually work directly on the server...
Until now I have stored "my" data mainly in the GRASS (6/7) native format, which 
I have been very happy with. But I guess our ESRI and PostGIS people would not 
accept that as a standard...

However, especially for time series data, we cannot keep several copies in 
different formats (tailor-made for each and every software package).

So I started thinking: what would be the most efficient and convenient solution 
for storing a large amount of data (e.g. high-resolution raster and vector data 
with national extent, plus time series data) in a way that makes it accessible 
to all (or at least most) remote users with different GIS software? As I am very 
fond of the temporal framework in GRASS 7, one precondition is that I can use 
those tools on the data without unreasonable performance loss. Another 
precondition is that users at remote computers in our (MS Windows) network can 
access the data.

In general, four options come to mind:

a) Stick to the GRASS native format and keep one copy in another format

b) Use the native formats the data come in (e.g. temperature and 
precipitation data come as zipped ASCII grids)

c) Use PostGIS as a backend for data storage (raster/vector), linked via 
r.external / v.external

d) Use another GDAL/OGR format for data storage (raster/vector), linked via 
r.external / v.external
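For options c) and d), the linking would look roughly like this (a sketch only; 
all paths, database names and map names below are made up for illustration):

```
# Option d: register a GeoTIFF (or any GDAL-readable raster) in the
# current mapset without importing/copying it
r.external input=/data/rasters/temp_2000_01.tif output=temp_2000_01

# Option c (and d for vectors): link an OGR/PostGIS layer without importing
v.external input="PG:dbname=gisdb host=ourserver" layer=roads output=roads

# Optionally, have new raster output written directly to an external
# GDAL format instead of the GRASS native format
r.external.out directory=/data/rasters format=GTiff extension=tif
```

The linked maps then behave like ordinary GRASS maps in most modules, while the 
data itself stays in the shared external store that the ESRI/QGIS/R users read.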

My questions are:
What solution could you recommend, or which solution did you choose?
Who has experience with this kind of data-management challenge?
How do externally linked data series perform compared to the GRASS native format?

I searched the mailing list a bit and found this thread: 
(http://osgeo-org.1560.x6.nabble.com/GRASS7-temporal-GIS-database-questions-td5054920.html)
 where Sören recommended "postgresql as temporal database backend". However, I 
am not sure whether that referred only to the temporal metadata and not to the 
rasters themselves...
Furthermore, in the idea collection for the temporal framework 
(http://grasswiki.osgeo.org/wiki/Time_series_development, "Open issues" section), 
limitations were mentioned regarding the number of files in a folder, which 
could be a problem for file-based storage. The ext2 file system had a "soft" 
upper limit of about 10-15k files in a single directory, though theoretically 
many more were possible. Other file systems may allow for more, I guess... Will 
using such big directories (> 10,000 files) lead to performance problems?
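One way to get a feel for this on a given server is to create a throwaway 
directory with many files and time a listing (plain shell, nothing GRASS- 
specific; the file count and name pattern are arbitrary):

```shell
# Probe large-directory behaviour on this filesystem:
# create 15,000 empty files in a temporary directory and time a listing.
d=$(mktemp -d)
cd "$d"
seq -f "map_%05g" 1 15000 | xargs touch
count=$(ls | wc -l)
echo "created $count files in $d"
time ls > /dev/null
cd - > /dev/null
rm -rf "$d"
```

On modern filesystems (ext4 with dir_index, XFS) this is usually unproblematic, 
but it is worth checking on the actual storage backend before committing to a 
per-file layout for long time series.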

The "Working with external data in GRASS 7" wiki entry 
(http://grasswiki.osgeo.org/wiki/Working_with_external_data_in_GRASS_7) covers 
the technical part (and, to some degree, performance issues) very well. Would it 
be worth adding a section on the strategic considerations / pros and cons of 
using external data? Or is that too user- and format-dependent?

Thanks for any feedback or thoughts on this topic...

Cheers
Stefan



_______________________________________________
grass-user mailing list
grass-user@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-user
