[GRASS-dev] Re: grass-dev Digest, Vol 30, Issue 31

Laura Toma Mon, 13 Oct 2008 07:45:11 -0700

------------------------------
Message: 7
Date: Mon, 13 Oct 2008 09:12:54 +0200
From: Markus Metz <[EMAIL PROTECTED]>
Subject: [GRASS-dev] Re: big region r.watershed
To: [EMAIL PROTECTED], [email protected]
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=ISO-8859-1



Hamish wrote:
Markus Metz wrote:
The original version uses very little memory, so assuming that GRASS
runs today on systems where at least 500MB RAM are available, I changed
the parameters for the seg mode: more data are kept in memory, speeding
up the seg mode. Looking at other modules using the segment library
(e.g. v.surf.contour, r.cost), it seems that there is not one
universally used setting; instead, the segment parameters are tuned to
each module. The new settings work for me, but not necessarily for
others, and maybe using 500MB is a bit much.

fwiw r.terraflow has a memory= option, the default is 300mb.
AFAIU, the bigger you make that, the smaller the on-disk tempfiles need
to be (ie work-around to keep tmp files <2gb for 32bit filesystems).
a number of modules like r.in.poly have a rows= option, which I didn't
really understand until I got into the code. (hold at most that many
region rows (all columns) in memory at once). Interestingly the default
value has scaled quite well over the years.
and other modules like r.in.xyz have percent= (0-100) for how much of the
map to keep in memory at once.
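
For orientation, the three kinds of option mentioned above (memory=, rows=, percent=) can be related by simple arithmetic. The following is only a rough sketch: the function names are hypothetical, the 8 bytes per cell is an assumption (a double-precision cell), and none of this is taken from the actual module code.

```python
def rows_for_memory(memory_mb, ncols, bytes_per_cell=8):
    """Whole region rows (all columns) that fit into memory_mb megabytes.

    bytes_per_cell=8 is an assumption (double-precision cells).
    """
    row_bytes = ncols * bytes_per_cell
    return (memory_mb * 1024 * 1024) // row_bytes


def percent_for_rows(rows, nrows):
    """Share of the map (0-100) covered by that many rows."""
    return min(100.0, 100.0 * rows / nrows)


# Hypothetical region of 20000 x 20000 cells with a 300 MB budget.
rows = rows_for_memory(300, ncols=20000)
print(rows, percent_for_rows(rows, nrows=20000))
```

Under these assumptions a fixed MB budget covers a shrinking share of the map as regions grow, whereas a percent= setting grows its memory use with the region.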

A default value that scales well over the years would be preferable, but
performance of r.watershed.fast -m is really poor if whole columns (or
rows) are kept in memory and much better if segments have equal
dimensions. Interestingly, segments of 200 rows and 200 columns are
processed fastest, faster than e.g. 150 rows and columns or 250 rows and
columns. The more segments are kept in memory the better.
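
To illustrate the difference between full-width strips and square segments, here is a minimal sketch of tile addressing. It is not the GRASS segment library; the function names and the 1000 x 1000 raster are made up for the example. It maps a cell to a segment and counts how often a path through the raster crosses from one segment to another, for two layouts with the same number of cells per segment.

```python
def segment_of(row, col, srows, scols, ncols):
    """Return (segment_id, offset) for a cell in a raster cut into
    segments of srows x scols cells."""
    segs_per_row = -(-ncols // scols)  # ceiling division
    seg_id = (row // srows) * segs_per_row + (col // scols)
    offset = (row % srows) * scols + (col % scols)
    return seg_id, offset


def segment_switches(path, srows, scols, ncols):
    """Count how often consecutive cells of a path lie in different segments."""
    ids = [segment_of(r, c, srows, scols, ncols)[0] for r, c in path]
    return sum(1 for a, b in zip(ids, ids[1:]) if a != b)


ncols = 1000  # hypothetical 1000 x 1000 raster
vertical = [(i, 500) for i in range(1000)]
diagonal = [(i, i) for i in range(1000)]

for name, path in (("vertical", vertical), ("diagonal", diagonal)):
    strips = segment_switches(path, 40, 1000, ncols)   # 40 full-width rows
    squares = segment_switches(path, 200, 200, ncols)  # 200 x 200 cells
    print(name, strips, squares)
```

Both layouts hold 40000 cells per segment. In this toy case the vertical and diagonal paths cross 24 strip boundaries but only 4 square-segment boundaries. A purely horizontal path would favour the strips, so this only shows that square segments keep the number of crossings similar in every direction; it does not explain the measured timings.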

Right now I don't want to introduce a new option to give the user
control over how much memory is used (be it MB memory, number of rows or
percent of the map) because I want to keep all options of
r.watershed.fast identical to the original version. I'm still not happy
with the speed of the segmented version of r.watershed.fast, but at
least it is orders of magnitude faster than the in-memory version of the
original r.watershed. Maybe the iostream library that came with
r.terraflow can be used for r.watershed -m as well.

Markus

To use the Iostream library you need to change the underlying
algorithm of watershed. Iostream implements streams (files on disk)
and sorting streams. If you use Iostream you need to store the
grids in streams on disk, rather than 2d-arrays in memory. On
streams random access is very expensive, so you need a way to express
the computation as a sequence of sorting streams followed by
sequential accesses to streams. This usually requires a complete
rewrite of the algorithm.
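
As a toy illustration of "sort, then scan sequentially", the sketch below computes, for every cell, the sum of the values of the cells that drain into it. It does this once with random access and once as a sort followed by a sequential pass. It is not the Iostream API and not the r.watershed algorithm: Python lists stand in for streams on disk, `sorted()` stands in for an external sort, and all names are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def inflow_random_access(values, downstream):
    """In-memory style: jump to each target cell and update it."""
    inflow = [0] * len(values)
    for i, target in enumerate(downstream):
        if target is not None:
            inflow[target] += values[i]  # random write into the array
    return inflow


def inflow_streamed(values, downstream):
    """Stream style: emit records, sort them, then scan in cell order."""
    # Pass 1: a sequential scan emits one (target, value) record per cell.
    records = ((t, v) for v, t in zip(values, downstream) if t is not None)
    # Sort by target cell (stand-in for sorting a stream on disk).
    sorted_records = sorted(records, key=itemgetter(0))
    # Pass 2: read the sorted records once while writing cells in order.
    groups = groupby(sorted_records, key=itemgetter(0))
    current = next(groups, None)
    inflow = []
    for cell in range(len(values)):
        total = 0
        if current is not None and current[0] == cell:
            total = sum(v for _, v in current[1])
            current = next(groups, None)
        inflow.append(total)
    return inflow


values = [1, 2, 3, 4, 5]
downstream = [1, 2, None, 2, 3]  # index of the cell each cell drains to
print(inflow_random_access(values, downstream))
print(inflow_streamed(values, downstream))
```

Both functions give the same result, but the second one never jumps to an arbitrary position: every access is either part of a sort or a front-to-back read. A full watershed computation needs many such steps chained together, which is why the algorithm has to be restructured rather than just ported.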


-Laura


