Hamish wrote:

> > The function looks okay, although using Python threads may
> > be a better approach in general.
>
> why so? for mapcalc_start() or start_command()?
For running multiple r.mapcalc processes concurrently. Eventually,
there may be other long-running functions like mapcalc(), so we should
look for a more general method than adding explicit foreground and
background versions. start_command() is needed for scripts which write
to stdin or read from stdout as the process is running.

> > Beyond that, bear in mind that running concurrent processes
> > improves latency at the expense of efficiency. Overall
> > performance will typically be improved if you're using cores
> > which would otherwise be idle, but be reduced if the system
> > is under load. So I wouldn't recommend forcing scripts to use
> > concurrency.
>
> ok, always give the user the choice, fair enough. The trick of
> course is how to code that without duplicating the code all over
> the place, but not making it too tricky to maintain either.

One option is a thread-pool library which allows the user to queue
commands for execution and wait for completion, and the library deals
with the details of allocating threads.

> another discussion topic that comes out of this is: to run in
> parallel or serially by default? for r3.in.xyz.py I've set it
> to default to workers=1, but for i.landsat.rgb.py I've set it
> to run all three bands at once by default. There's no right
> answer, but it would be good to present a consistent approach.

The awkward case is when one program creates multiple child processes,
and some of those create multiple threads or child processes. Ideally,
you want to limit the cumulative total, but that isn't straightforward.
A reasonable approximation can be obtained by only using additional
threads if the system load average is below a given threshold.

-- 
Glynn Clements <[email protected]>

_______________________________________________
grass-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/grass-dev
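The thread-pool idea above could be sketched with Python's standard
concurrent.futures module. This is a hypothetical illustration, not an
existing GRASS API; the dummy subprocess commands stand in for
r.mapcalc invocations, and the helper name run_command is made up:

```python
import concurrent.futures
import subprocess
import sys

def run_command(args):
    """Run one external command to completion; return its exit code."""
    return subprocess.call(args)

# Three independent commands standing in for e.g. three r.mapcalc runs.
commands = [[sys.executable, "-c", "print(%d)" % i] for i in range(3)]

# The pool deals with allocating threads; the caller just queues
# commands and waits for completion.
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(run_command, cmd) for cmd in commands]
    exit_codes = [f.result() for f in futures]  # block until all finish
```

A script would then expose max_workers as a user option (e.g.
workers=1 for serial execution), so the concurrency decision stays
with the user rather than being forced by the script.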
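The load-average approximation in the last paragraph might look like
the following sketch. The function name choose_workers and the default
threshold (number of CPUs) are assumptions, and os.getloadavg is only
available on Unix-like systems:

```python
import os

def choose_workers(requested, threshold=None):
    """Use extra workers only while the 1-minute load average is
    below a threshold (defaulting to the number of CPUs)."""
    ncpus = os.cpu_count() or 1
    if threshold is None:
        threshold = ncpus
    try:
        load1, _, _ = os.getloadavg()
    except OSError:
        # Load average unavailable: honour the caller's request.
        return requested
    if load1 >= threshold:
        return 1  # system already busy: fall back to serial execution
    return min(requested, ncpus)

workers = choose_workers(4)
```

This only bounds what one process starts; it does not solve the
cumulative-total problem when children spawn their own workers, which
is why it is an approximation.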
