* Markus Metz <[email protected]> [2018-02-24 22:31:41 +0100]:

On Sat, Feb 24, 2018 at 10:06 PM, Nikos Alexandris <[email protected]>
wrote:

* Markus Metz <[email protected]> [2018-02-24 21:39:40 +0100]:

On Sat, Feb 24, 2018 at 9:25 PM, Nikos Alexandris <[email protected]> wrote:


Dear community,

I am asking for help to "debug" a situation.

One ETRS89-based location, one Mapset with hundreds of land cover raster
map tiles.

Then, hundreds of UTM Zones-based Locations, subset of the WRS2 grid.

Multiple `r.proj` processes are running in parallel. However, only one
runs at a time inside any one Mapset of the respective UTM-Zone-based
Location, each requesting one land cover raster map tile from the
ETRS89-based Location.


Also, each `r.proj` runs in isolation inside an independent Docker
container.


To the question. Some tiles are cross-cut by more than one UTM-Zone.
Hence, it happens that many UTM-Zones will/might request to read the
same land cover raster map tile at the same time.

That is one write per target Mapset/Location, yet highly probable
concurrent read requests to the same raster map(s) in the source
Mapset/Location.

Is this bad?


Are you experiencing problems?


Yes.

(It's not easy for me to have access to the logs, as I
don't directly have access to the scheduler. I got a copy though and I
am reading through.)

Looking at the job logs, I see lots of ".gislock" lines.
It might be a permission-related issue. I partially operated directly
(with my user-id) on many Locations.

The operator of the scheduler naturally has another user-id. I wonder
if I should apply GRASS_SKIP_MAPSET_OWNER_CHECK=1 everywhere.

No, you need to run each process in a unique temporary mapset. Once you
have the final result, change the current mapset with g.mapset to the
common mapset where final results should be stored, and copy the final
result from the temporary mapset to the current mapset (the mapset to
hold the final results).
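
The steps above can be sketched roughly as follows. This is only an
illustration, assuming a GRASS 7 session is already active in the target
UTM location. The function names, the `DRY_RUN` switch, and the job id
argument are hypothetical; `g.mapset`, `r.proj`, and `g.copy` are the
standard GRASS modules.

```shell
#!/bin/sh
# Sketch only: reproject one tile in a unique temporary mapset, then copy
# the result into the common mapset. All names are illustrative.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: reproject_tile SRC_LOCATION SRC_MAPSET TILE FINAL_MAPSET JOB_ID
# JOB_ID must be unique across all parallel jobs (assumption: the
# scheduler provides one; a PID alone may repeat across containers).
reproject_tile() {
    src_location=$1
    src_mapset=$2
    tile=$3
    final_mapset=$4
    tmp_mapset="tmp_${tile}_$5"

    # Create the temporary mapset and make it the current one.
    run g.mapset -c mapset="$tmp_mapset" || return 1
    # Read the tile from the source location, write into the temporary mapset.
    run r.proj location="$src_location" mapset="$src_mapset" \
        input="$tile" output="$tile" || return 1
    # Switch to the common mapset and copy the final result into it.
    run g.mapset mapset="$final_mapset" || return 1
    run g.copy raster="${tile}@${tmp_mapset},${tile}" || return 1
}
```

Calling it with `DRY_RUN=1` only prints the four commands, which is a
cheap way to check the names before submitting a job. Removing the
temporary mapset afterwards is left out of the sketch.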

That's smart! Thanks for this precious tip.

Alternatively/additionally, don't use the script grassXY to start a GRASS
session, instead define the GRASS environment with custom scripts (one for
the GRASS version to use, one for the database/location/mapset to use).
This avoids race conditions on an HPC system. A unique temporary mapset for
each process helps to avoid all sorts of concurrent access problems.
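
A minimal sketch of what such custom scripts might look like, written
here as two shell functions. The function names and all paths are
hypothetical. The environment variables (GISBASE, GISRC, PATH,
LD_LIBRARY_PATH, PYTHONPATH) and the rc file keys are the ones a GRASS 7
session normally relies on; an actual installation may need more.

```shell
#!/bin/sh
# Sketch only: define a GRASS session without the grassXY start script.

# usage: grass_env GISBASE   (GISBASE = GRASS installation directory)
grass_env() {
    GISBASE=$1
    PATH="$GISBASE/bin:$GISBASE/scripts:$PATH"
    LD_LIBRARY_PATH="$GISBASE/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    # Needed by GRASS modules that are Python scripts.
    PYTHONPATH="$GISBASE/etc/python${PYTHONPATH:+:$PYTHONPATH}"
    export GISBASE PATH LD_LIBRARY_PATH PYTHONPATH
}

# usage: grass_session GISDBASE LOCATION MAPSET RCFILE
# Writes one rc file per process, so parallel jobs share no session state.
grass_session() {
    cat > "$4" <<EOF
GISDBASE: $1
LOCATION_NAME: $2
MAPSET: $3
GUI: text
EOF
    GISRC=$4
    export GISRC
}
```

A job script would source this file, call `grass_env` once with the
installation directory, then call `grass_session` with its own rc file
(for example one created by `mktemp`) before running any GRASS module.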

This is something that I learned the hard way. I have to update all of
my scripts, step by step.

I wanted to have fine control over processes and to log their details.
So, I built custom functions on top of `grassXY $MAPSET --exec`.

Nikos

[rest deleted]

_______________________________________________
grass-user mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/grass-user
