On Sun, Sep 25, 2016 at 9:49 PM, Markus Neteler <nete...@osgeo.org> wrote:
> On Fri, Sep 23, 2016 at 11:30 PM, Markus Metz
> <markus.metz.gisw...@gmail.com> wrote:
>> On Fri, Sep 23, 2016 at 11:22 PM, Markus Neteler <nete...@osgeo.org> wrote:
>>> On Fri, Sep 23, 2016 at 11:05 PM, Markus Metz
>>> <markus.metz.gisw...@gmail.com> wrote:
>>>> On Fri, Sep 23, 2016 at 1:11 PM, Markus Neteler <nete...@osgeo.org> wrote:
>>> ...
>>>> Your motivation is to provide a specialized CLI interface for HPC
>>>> processing?
>>>
>>> No, not the case.
>>>
>>>> We used GRASS with HPC processing for years, and the
>>>> problems we faced were caused by the HPC hardware and software
>>>> infrastructure, not by GRASS. What exactly is the problem with using
>>>> GRASS and HPC processing?
>>>
>>> There is no problem. There is just the issue that there is an increasing
>>> number of additions (e.g. maybe the need to provide region/resolution
>>> to individual modules for independent parallel processing without the
>>> overhead of always opening a new mapset).
>>
>> Getting closer it seems. Can you quantify "the overhead of always
>> opening a new mapset"?
>
> As an example, when aiming at processing all Sentinel-2 tiles
> globally, we are currently speaking about 73000 scenes * up to 16 tiles,
> along with global data. Analysis on top of other global data is more
> complex when each job runs in its own mapset and is then reintegrated
> into a single target mapset than it would be if we could process tiles
> in parallel in one mapset by simply specifying the respective region to
> the command of interest.
> Yes, this is different from the current paradigm, and not for G7.
From our common experience, I would say that creating separate mapsets is a safety feature. If anything goes wrong with a particular processing chain, cleaning up is easy: simply delete that mapset and run the job again, if possible on a different host/node (assuming that failed jobs are logged).

Anyway, I would be surprised if the overhead of opening a separate mapset were measurable when processing all Sentinel-2 tiles globally. Reintegration into a single target mapset could cause problems with regard to IO saturation, but in an HPC environment temporary data always need to be copied to a final target location at some stage.

The HPC system you are using now is most probably quite different from the one we used previously, so this involves a lot of guessing, particularly about the storage location of temporary data (no matter whether it is the same mapset or a separate mapset).

To be continued in a GRASS+HPC thread?

Markus M

> But my original comment was targeted at the increasing number of
> module parameters and how to handle that (with some new HPC-related
> idea as an example).
>
> I'm fine with archiving this question for now; it will likely come up
> again in the future.
>
> markusN

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev
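[Editor's note] The per-job-mapset safety pattern discussed above (run each tile in its own mapset, and on failure delete the mapset and re-queue the job) can be sketched in plain shell. This is only an illustration: in a real setup `run_job` would wrap something like `grass -c "$LOCATION/job_$tile" --exec ...`; here it is a placeholder that creates a per-job work directory, with job 3 forced to fail so the cleanup path is exercised. All names (`run_job`, `work/`, `failed_jobs.log`) are hypothetical.

```shell
#!/bin/sh
# Sketch of the "one mapset per job" pattern, with the GRASS call
# replaced by a placeholder. In real use, run_job would be e.g.:
#   grass -c "$LOCATION/job_$1" --exec r.mapcalc ...
run_job() {
    mkdir -p "work/job_$1" || return 1     # stand-in for creating the mapset
    if [ "$1" -eq 3 ]; then return 1; fi   # simulated failure for job 3
    touch "work/job_$1/result"             # stand-in for the job's output
}

for tile in 1 2 3 4; do
    if run_job "$tile"; then
        echo "job $tile ok"
    else
        # Cleanup is trivial: delete the failed job's mapset and log it
        # so the job can be re-run, possibly on another host/node.
        rm -rf "work/job_$tile"
        echo "job $tile failed, mapset removed" >> failed_jobs.log
    fi
done
```

The point of the pattern is that a failed job leaves no partial state behind: its entire workspace is removed in one `rm -rf`, and the log records what needs to be re-run.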