* Markus Neteler <[email protected]> [2018-02-24 23:00:32 +0100]:
On Sat, Feb 24, 2018 at 10:31 PM, Markus Metz <[email protected]> wrote:
On Sat, Feb 24, 2018 at 10:06 PM, Nikos Alexandris <[email protected]>
* Markus Metz <[email protected]> [2018-02-24 21:39:40 +0100]:
On Sat, Feb 24, 2018 at 9:25 PM, Nikos Alexandris <[email protected]>
...

Looking at the job logs, I read lots of ".gislock" lines. It might be a permission-related issue. I partially operated directly (with my user-id) on many Locations. The operator of the scheduler naturally has another user-id. I wonder if I should apply GRASS_SKIP_MAPSET_OWNER_CHECK=1 everywhere.

No, you need to run each process in a unique temporary mapset.

Yes, only that works. Be sure to use a sufficiently long random string as the temporary mapset name. You can, for example, use the output of mktemp --dry-run and add the machine name to it (and maybe the current time stamp, cleaned of special characters) to avoid race conditions if you use shared network storage.
Thanks M1/2.
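The naming advice above (random string plus machine name plus time stamp) could be sketched roughly like this. The function name unique_mapset_name and the tmp.XXXXXXXXXXXX template are my own assumptions, not something from the thread; -u is the short form of mktemp --dry-run.

```shell
# Build a mapset name that is unlikely to collide across processes and machines.
# Hypothetical helper; the name and the exact format are assumptions.
unique_mapset_name() (
    # mktemp -u (same as --dry-run) only prints a name, nothing is created
    random_part=$(basename "$(mktemp -u tmp.XXXXXXXXXXXX)")
    random_part=${random_part#tmp.}
    host=$(uname -n)
    stamp=$(date +%Y%m%dT%H%M%S)
    # replace anything that is not a letter, digit or underscore
    printf '%s_%s_%s\n' "$random_part" "$host" "$stamp" | tr -c 'A-Za-z0-9_\n' '_'
)
```

Something like MAPSET_NAME=$(unique_mapset_name) would then give each job its own name, even when several nodes write to the same shared storage within the same second.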
Once you have the final result, change the current mapset with g.mapset to the common mapset where the final results should be stored, and copy the final result from the temporary mapset to the current mapset (the mapset that holds the final results).
(we have processed terabytes of LST data like this :-)
Just a pseudo-example: would it suffice then to do the following?
save current region
for loop over something
  ...
  CURRENT_MAPSET=$(g.mapset -p)
  # create and switch to a temporary Mapset
  RANDOM_STRING=$(mktemp --dry-run | cut -d"." -f2)
  g.mapset -c mapset=$RANDOM_STRING
  # do something
  r.mask vector=VectorMap where="Attribute='Here'" &&
  g.region zoom=MASK &&
  r.stats.zonal cover=covermap base=basemap method=average output=outputmap
  # the MASK lives in the temporary Mapset, so remove it before leaving
  r.mask -r
  # back to the "valid" Mapset
  g.mapset mapset=$CURRENT_MAPSET
  g.copy raster=outputmap@${RANDOM_STRING},outputmap
  r.stats -acp input=outputmap output=report
  ...
  sleep 1
done
restore region
Alternatively/additionally, don't use the script grassXY to start a GRASS session, instead define the GRASS environment with custom scripts (one for the GRASS version to use, one for the database/location/mapset to use). This avoids race conditions on a HPC system. A unique temporary mapset for each process helps to avoid all sorts of concurrent access problems.
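As a rough sketch of the "custom scripts" idea: a GRASS session can be described by a small GISRC file plus a few environment variables, so each job can get its own private copy. The helper name write_job_gisrc and all paths below are placeholders I made up, and "GUI: text" is an assumption for a headless job; the mapset named in the file is assumed to exist already.

```shell
# Write a private GISRC file for one job and print its path.
# Arguments: GRASS database directory, location name, mapset name.
# Hypothetical helper; it does not create the mapset itself.
write_job_gisrc() (
    gisdbase=$1; location=$2; mapset=$3
    rcfile=$(mktemp "${TMPDIR:-/tmp}/grassrc.XXXXXXXX") || exit 1
    {
        printf 'GISDBASE: %s\n' "$gisdbase"
        printf 'LOCATION_NAME: %s\n' "$location"
        printf 'MAPSET: %s\n' "$mapset"
        printf 'GUI: text\n'
    } > "$rcfile"
    printf '%s\n' "$rcfile"
)

# Usage sketch (all paths are placeholders, adjust to the local installation):
#   export GISBASE=/path/to/grass                 # install prefix, site specific
#   export GISRC=$(write_job_gisrc /path/to/grassdata mylocation "$MAPSET_NAME")
#   export PATH="$GISBASE/bin:$GISBASE/scripts:$PATH"
#   export LD_LIBRARY_PATH="$GISBASE/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Because every job reads its own GISRC, no two jobs share session state through a common rc file, which is the point of skipping the grassXY startup script here.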
It mostly works for me with --exec. Mostly. That is, there are missing or empty WIND files, here and there, and .gislock related issues.
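To find out which mapsets are affected, a quick check along these lines might help. It is only a sketch: find_broken_mapsets is a made-up name, and it assumes the usual layout in which every mapset directory inside a Location holds a WIND file.

```shell
# List mapsets in a Location whose WIND file is missing or empty.
# Argument: path to the Location directory (GISDBASE/LOCATION).
find_broken_mapsets() (
    location_dir=$1
    for mapset_dir in "$location_dir"/*/; do
        [ -d "$mapset_dir" ] || continue
        # -s is true only for an existing file with a size greater than zero
        if [ ! -s "${mapset_dir}WIND" ]; then
            basename "$mapset_dir"
        fi
    done
)
```

Running it after a batch, e.g. find_broken_mapsets /path/to/grassdata/mylocation, prints one mapset name per line and nothing if all WIND files look sane.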
Markus M

Let's expand this Wiki section a bit with our findings (I'll try to find my notes): https://grasswiki.osgeo.org/wiki/Parallel_GRASS_jobs#Cluster_and_Grid_computing

markusN
_______________________________________________ grass-user mailing list [email protected] https://lists.osgeo.org/mailman/listinfo/grass-user
