Don wrote:
> Do you know if there is much interest in greater parallelization?
Huge interest. Outside of the core libraries, GRASS is made up of ~500 individual processing modules, each doing its own thing well. Each uses its own algorithms and strategies, which is why GRASS 7 has built-in support for OpenMP, pthreads, *and* OpenCL: the idea is that the right parallelization strategy can be matched to the nature of the problem each individual module faces.

Additionally, our Python scripting library has helper functions to make parallel discrete processes easy to use, since a common use case is to run the same computation on three different Red, Green, Blue imagery bands, or on all ~7-11 spectral bands from satellite data (e.g. LANDSAT). In those cases the natural number of processes is in the same neighborhood as the number of cores on a typical workstation, so backgrounding all but one of the jobs in bash or Python and then waiting for them all to finish works remarkably well, with minimal programming effort and divergence from the single-thread case. That's not far from the MPI situation: instead of backgrounding jobs, they could just as well be sent to other machines in the cluster.

As Soeren mentioned, the gmath and gpde libraries support OpenMP already; in addition, Seth Price put together an OpenCL version of our r.sun module (GPU ray tracing sunbeams, seems like a natural fit!) but I/we still need to finish integrating that into the main build; and our r.mapcalc module has pthreads support. The r.mapcalc (raster array map calculator) case is actually a pretty typical one for GRASS modules: they are not entirely, but not far from, I/O limited rather than CPU bound. For MPI this means there's a *lot* of data to pass around the network, and unless you've got Infiniband or some network infrastructure near the speed of your RAID, I suspect you'll quickly saturate it.

The main highly-CPU-bound modules I am personally very keen to see parallelized are our spline interpolation modules: v.surf.rst and v.surf.bspline.
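To make the background-and-wait pattern above concrete, here is a minimal sketch in Python. The per-band commands are stand-ins (`print` jobs launched as subprocesses); a real GRASS script would launch actual modules via the scripting library's helpers instead, so treat this as the shape of the approach, not working GRASS code:

```python
# Sketch of the "one background job per band" pattern: launch every job
# without blocking, then wait for them all to finish.  The subprocess
# here is a stand-in for a real per-band GRASS module invocation.
import subprocess
import sys

def band_jobs(bands):
    # start every job in the background (no blocking between launches)
    procs = {
        band: subprocess.Popen(
            [sys.executable, "-c", f"print('processed {band} band')"],
            stdout=subprocess.PIPE, text=True)
        for band in bands
    }
    # now wait for all of them, collecting each job's output
    return {band: p.communicate()[0].strip() for band, p in procs.items()}

if __name__ == "__main__":
    print(band_jobs(["red", "green", "blue"]))
```

The same structure carries over to the MPI case: the launch step would submit the jobs to other machines instead of backgrounding them locally.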
The LU decomposition parts of them actually live in the GRASS libraries, not the modules, so parallelizing those would also benefit e.g. v.vol.rst, which does 3D voxel cube interpolation. The v.surf.rst module uses quadtree segmentation, and v.surf.bspline does its own splitting into ~12-120 processing segments, so those yell out to me as low-hanging fruit. I am sure the vector network analysis modules could make good use of parallelization too, but I don't use those enough personally to be able to comment on their immediate needs and bottlenecks. Markus N. might be able to talk about what he's doing on the Top500 supercomputer (AIX); I'm not sure how much Maui/Torque or similar is handling the job submissions there, and how much is manual scripting to break up/send out the jobs and then process the results.

> And have the Intel compilers and MPI been used with GRASS?

Yes, I've built GRASS with icc ver 12.1.3 (-O2 -xHost -ipo -static-intel -parallel -Wall). Considering the size of the GRASS codebase it might be a little surprising that there weren't more problems :), but we do try pretty hard to keep the code straight ANSI/C89 C, which helps a lot with portability. For GRASS + icc build notes see:
http://thread.gmane.org/gmane.comp.gis.grass.devel/47823

For GRASS I generally need to keep a close eye on the Debian packaging, so I typically build it with gcc; outside of GRASS I do use ifort a lot, and there the OpenMP and auto-vectorization support works really well. I understand that's a bit easier to do for FORTRAN than C, though.

As for MPI, there's an MPI version of the above-mentioned v.surf.rst module for GRASS 5 floating around somewhere (probably under its old name of 's.surf.rst'). I actually run a medium-sized cluster in my day job which sees ~85% MPI usage, but I've never really been tempted to use it for GRASS work; for what I personally do, saturating all cores/CPUs on the local workstation is often enough.
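The per-segment decomposition mentioned above suggests a simple worker-pool shape: many independent segments drained by a fixed number of workers. In the C modules this would be OpenMP or pthreads over the segment list; a Python thread pool stands in for that below, and `interpolate_segment` is a hypothetical placeholder for the real per-segment spline fit:

```python
# Sketch of a worker pool over precomputed segments: there are many
# more segments (~12-120) than workers, so the pool stays saturated.
# interpolate_segment() is a placeholder, not real GRASS code.
from concurrent.futures import ThreadPoolExecutor

def interpolate_segment(seg_id):
    # placeholder for fitting the spline over one quadtree/bspline segment
    return seg_id, seg_id * seg_id

def run_segments(n_segments, n_workers=4):
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map the whole segment list onto the fixed-size pool
        return dict(pool.map(interpolate_segment, range(n_segments)))
```

The point is only the decomposition: since the modules already produce independent segments, the parallel version needs no new algorithmic work, just a scheduler over the existing pieces.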
Also, the cluster setup can be non-trivial for new users (NFS mounts, ssh keys, etc.), so out-of-the-box "just works" OpenMP-style multi-threading probably gets us better bang for the buck when supporting the 'Desktop GIS' use case, which is probably the bulk of our end users. But don't get me wrong: if the long-running modules like the spline interpolations and the r.cost module for search paths were MPI-enabled, they'd certainly get used by our power users and by teams doing back-end server satellite image number crunching!

regards,
Hamish

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev