Hi Stefan,
there is a FOSS4G presentation online as well:
http://elogeo.nottingham.ac.uk/xmlui/handle/url/288
Best regards
Soeren

2013/12/4 Sören Gebbert <soerengebb...@googlemail.com>:
> Hi Stefan,
>
> 2013/12/3 Blumentrath, Stefan <stefan.blumentr...@nina.no>:
>> Dear all,
>>
>> On our Ubuntu server we are about to reorganize our GIS data in order
>> to develop a more efficient and consistent solution for data storage
>> in a mixed GIS environment.
>>
>> By "mixed GIS environment" I mean that we have people working with
>> GRASS, QGIS and PostGIS, but also many people using R, probably the
>> largest fraction using ESRI products, and furthermore people using
>> ENVI, ERDAS and some other software. Only a few people (like me)
>> actually work directly on the server...
>>
>> Until now I have stored "my" data mainly in the GRASS (6/7) native
>> format, which I was very happy with. But I guess our ESRI and PostGIS
>> people would not accept that as a standard...
>>
>> However, especially for time series data, we cannot have several
>> copies in different formats (tailor-made for each and every piece of
>> software).
>>
>> So I started thinking: what would be the most efficient and convenient
>> solution for storing a large amount of data (e.g. high-resolution
>> raster and vector data with national extent, plus time series data) in
>> a way that is accessible for all (or at least most) remote users with
>> different GIS software? As I am very fond of the temporal framework in
>> GRASS 7, it would be a precondition that I can use these tools on the
>> data without unreasonable performance loss. Another precondition would
>> be that users at remote computers in our (MS Windows) network can
>> access the data.
>>
>> In general, four options come to my mind:
>>
>> a) Stick to the GRASS native format and keep one copy in another
>> format
>>
>> b) Use the native formats the data come in (e.g.
>> temperature and precipitation come in zipped ASCII grid format)
>>
>> c) Use PostGIS as a backend for data storage (raster/vector), linked
>> via r.external/v.external
>>
>> d) Use another GDAL/OGR format for data storage (raster/vector),
>> linked via r.external/v.external
>>
>> My questions are:
>>
>> What solutions could you recommend, or what solution did you choose?
>
> For raster data I would suggest using r.external and uncompressed
> GeoTIFF files. But you have to make sure that external software does
> not modify these files, or, if it does, that the temporal framework is
> triggered to update the dependent space time raster datasets.
>
> In the case of vector data I would suggest using the native GRASS
> format; hence, vector data needs to be copied. But maybe PostgreSQL
> with topology support would be a solution? I think Martin Landa may
> have an opinion here.
>
>> Who has experience with this kind of data management challenge?
>
> No experience here from my side.
>
>> How do externally linked data series perform compared to GRASS native?
>
> It will be slower than the native format for sure, but I don't know
> how much slower.
>
>> I searched the mailing list a bit and found this thread
>> (http://osgeo-org.1560.x6.nabble.com/GRASS7-temporal-GIS-database-questions-td5054920.html)
>> where Sören recommended "postgresql as temporal database backend".
>> However, I am not sure if that was meant only for the temporal
>> metadata and not the rasters themselves...
>
> My recommendation was related to the temporal metadata only. The
> SQLite database will not scale very well for select requests if you
> have more than 30,000 maps registered in your temporal database.
> PostgreSQL will be much faster for select requests, but it performs
> very badly at managing (insert, update, delete) many maps. I am not
> sure what the reason for this is, but from my experience PostgreSQL
> has a scaling problem with many tables.
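As a sketch of options c) and d) above, the linking could look like the following, assuming a running GRASS 7 session; the file path, PostGIS connection string and layer/map names are only placeholders:

```shell
# Link an uncompressed GeoTIFF as a read-only GRASS raster map
# instead of importing (copying) it with r.in.gdal.
r.external input=/data/rasters/temperature_2013_01.tif \
           output=temperature_2013_01

# Link a PostGIS table as a GRASS vector map without copying
# the geometries (placeholder database and table names).
v.external input="PG:host=localhost dbname=gisdata" \
           layer=administrative_borders \
           output=admin_borders
```

The linked maps then behave like ordinary GRASS maps for reading, which is what allows registering them in space time datasets while the files stay usable for other software.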
> Hence, if you do not modify your data often, PostgreSQL is your
> temporal database backend of choice. Otherwise I would recommend
> SQLite, even if it is slower for select requests.
>
>> Furthermore, in the idea collection for the temporal framework
>> (http://grasswiki.osgeo.org/wiki/Time_series_development, "Open issues"
>
> This discussion is pretty old and does not reflect the current
> temporal framework implementation. Please have a look at the new
> TGRASS paper:
> https://www.sciencedirect.com/science/article/pii/S136481521300282X?np=y
> and the Geostat workshop:
> http://geostat-course.org/Topic_Gebbert
>
>> section) limitations were mentioned regarding the number of files in a
>> folder, which could be a problem for file-based storage. The ext2 file
>> system had a "soft" upper limit of about 10-15k files in a single
>> directory, but theoretically many more were possible. Other file
>> systems may allow for more, I guess... Will usage of such big
>> directories (> 10,000 files) lead to performance problems?
>
> Modern file systems should not have problems with many files. I am
> using ext4 and the temporal framework with 100,000 maps without
> noticeable performance issues.
>
>> The "Working with external data in GRASS 7" wiki entry
>> (http://grasswiki.osgeo.org/wiki/Working_with_external_data_in_GRASS_7)
>> covers the technical part (and to some degree performance issues) very
>> well. Would it be worth adding a part on the strategic considerations
>> / pros and cons of using external data? Or is that too user and format
>> dependent?
>
> It would be great if you could share your experience with us.
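The temporal database backend discussed above is set per mapset with t.connect; a minimal sketch, assuming GRASS 7 and a placeholder PostgreSQL database name:

```shell
# Print the current temporal database connection of this mapset.
t.connect -p

# Switch the temporal framework to a PostgreSQL backend
# ("grass_temporal" is a placeholder; the database must already exist).
t.connect driver=pg database="dbname=grass_temporal"

# Or use the default per-mapset SQLite backend.
t.connect driver=sqlite \
          database="$GISDBASE/$LOCATION_NAME/$MAPSET/tgis/sqlite.db"
```

Note that this only changes where the temporal metadata (space time dataset registrations) live; the raster and vector maps themselves stay wherever they are stored.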
> :)
>
> Best regards
> Soeren
>
>> Thanks for any feedback or thoughts around this topic...
>>
>> Cheers
>>
>> Stefan
>>
>> _______________________________________________
>> grass-user mailing list
>> grass-user@lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/grass-user