Hello Markus and Vaclav,

thank you for your feedback. My answers are inline.
On Wed, May 1, 2019 at 6:03 PM Markus Metz <markus.metz.gisw...@gmail.com> wrote:

> IMHO the overhead of calling GRASS modules is insignificant because it is
> in the range of milliseconds. I am much more concerned whether executing a
> GRASS module takes days or hours or minutes.
> And the overhead is insignificant (not measurable) compared to actual
> execution time for larger datasets/regions.

I would argue that this depends on what you are doing. For a single GRASS Session using a really big computational region the overhead is obviously negligible; I wrote that in the initial post <https://gist.github.com/pmav99/8f4546fe15940b3cb7db0cfb65e18d33#is-this-truly-a-problem> too. But if you do need to massively parallelize GRASS, then the overhead of setting up the GRASS Session and/or calling GRASS modules might be measurable too.

Regardless, the overhead

- can be noticeable while doing exploratory analysis
- can be significant while **developing** GRASS (e.g. when running tests).

BTW, let us also keep in mind that the majority of the tests should be using really small maps/computational regions (note: they currently don't, but that's a different issue), which means that the impact of this overhead should be larger.

On Thu, May 2, 2019 at 4:49 AM Vaclav Petras <wenzesl...@gmail.com> wrote:

> Hi Panos and Markus,
>
> I actually touched on this in my master's thesis [1, p. 54-58],
> specifically on the subprocess call overhead (i.e. not import or
> initialization overheads). I compared speed of calling subprocess in Python
> to a Python function call. The reason was that I was calling GRASS modules
> many times for small portions of my computational region, i.e. I was
> changing region always to the area/object of interest within the actual
> (user set) computational region. So, the overall process involved actually
> many subprocess calls depending on the size of data.
> Unfortunately, I don't
> have there a comparison of how the two cases (functions versus
> subprocesses) would look like in terms of time spend for the whole process.
>

Again, I would argue that the answer depends on what you are doing. Pansharpening a 100 pixel map has a (comparatively) huge overhead. Pansharpening a Landsat tile, not so much. Regardless of that, I think we can all agree that directly calling a function implementing algorithm Foo is always going to be faster than calling a script that calls the same function. Unfortunately, and as you pointed out, perhaps most of the GRASS functionality is only accessible from the CLI and not through an API.

> And speaking more generally, it seems to me that the functionality-CLI
> coupling issue is what might me be partially fueling Facundo's GSoC
> proposal (Python package for topology tools). There access to functionality
> does not seem direct enough to the user-programmer with overhead of
> subprocess call as well as related I/O cost, whether real or perceived,
> playing a role.
>

I can't speak for Facundo. Nevertheless, whenever I try to work with the API, I do find it limited and it feels that it still has rough edges (e.g. #3833 <https://trac.osgeo.org/grass/ticket/3833> and #3845 <https://trac.osgeo.org/grass/ticket/3845>). It very soon becomes clear that in order to get work done you need to use the Command Line Interface. As a programmer, I do find this annoying :P

> Best,
> Vaclav
>
> [1] Petras V. 2013. Building detection from aerial images in GRASS GIS
> environment. Master’s thesis. Czech Technical University in Prague.
> http://geo.fsv.cvut.cz/proj/dp/2013/vaclav-petras-dp-2013.pdf
>

Thank you for the link)

all the best,
Panos
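
For reference, the function-call versus subprocess-call overhead discussed above can be measured with a minimal sketch like the one below. It does not touch GRASS at all: `add` is a hypothetical stand-in for "algorithm Foo", and the subprocess is simply a fresh Python interpreter (an assumption made only to keep the example self-contained). A compiled GRASS module would likely start faster than a Python interpreter, so treat the numbers as a rough illustration, not as a GRASS benchmark.

```python
import subprocess
import sys
import timeit


def add(a, b):
    """Hypothetical stand-in for 'algorithm Foo'; deliberately trivial."""
    return a + b


def call_function():
    # Direct, in-process call.
    return add(1, 2)


def call_subprocess():
    # Same computation, but in a freshly started Python interpreter,
    # so the cost is dominated by process start-up and I/O.
    result = subprocess.run(
        [sys.executable, "-c", "print(1 + 2)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


def measure(repeats=20):
    """Return average seconds per call as (function, subprocess)."""
    per_function = timeit.timeit(call_function, number=repeats) / repeats
    per_subprocess = timeit.timeit(call_subprocess, number=repeats) / repeats
    return per_function, per_subprocess


if __name__ == "__main__":
    f, s = measure()
    print(f"function call:   {f * 1e6:12.2f} us per call")
    print(f"subprocess call: {s * 1e6:12.2f} us per call")
```

Multiplying the per-call difference by the number of module calls in a workflow (or in a test suite) gives a rough idea of whether the overhead matters for a given use case.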
_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-dev