On Fri, May 5, 2017 at 5:07 AM, Vaclav Petras <[email protected]> wrote:
>
> On Wed, May 3, 2017 at 5:34 AM, Moritz Lennert <[email protected]> wrote:
> >
> > On 02/05/17 15:53, Vaclav Petras wrote:
> >> I'm using pipe_command() which is just convenience function setting
> >> stdout=PIPE. Similarly feed_command() is just setting stdin=PIPE which
> >> I'm not using because I'm feeding the stdout of the other process
> >> directly (stdin=first_process.stdout). What I don't understand,
> >> regardless of using stdin=PIPE or stdin=first_process.stdout for the
> >> second process, is what should be next.
> >
> > Do you really need the in_process.communicate() ? Here's what I used
> > in a local script and it works, without communicate(). Then again,
> > I don't think the data flowing through this pipe ever exceeded available memory.
> >
> > pin = gscript.pipe_command('v.db.select',
> >                            map = firms_map,
> >                            ...
> > total_turnover_map = 'turnover_%s' % nace2
> > p = gscript.start_command('r.in.xyz',
> >                           input_='-',
> >                           stdin=pin.stdout,
> >                           ...
> > if p.wait() is not 0:
> >     gscript.fatal("Error in r.in.xyz with nace %s" % nace2)
>
> The Popen.wait() documentation [1] says: "Warning: This will deadlock
> when using stdout=PIPE and/or stderr=PIPE and the child process
> generates enough output to a pipe such that it blocks waiting for the
> OS pipe buffer to accept more data. Use communicate() to avoid that."
>
> And since I'm using stdout=PIPE (pipe_command()), I use communicate().
> What troubles me is that Popen.communicate(input=None) documentation [2]
> says: "Note: The data read is buffered in memory, so do not use this
> method if the data size is large or unlimited."
>
> It says "data read", so it probably talks about stdout=PIPE when
> communicate() does not return None(s) but data (stdout=PIPE and
> communicate with the same process), i.e. it doesn't apply to this case
> and I don't have to be troubled. As for the wait(), I think that it may
> work (works most of the time), it is just not guaranteed to work with
> large data and it depends on how smart the OS will be.
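
For the piped case discussed above, here is a minimal sketch using only the standard subprocess module. It is not GRASS code: the two child processes are Python one-liners standing in for the producer and consumer modules, and the names run_pipeline, producer_cmd and consumer_cmd are hypothetical. Since pipe_command() is described above as just setting stdout=PIPE, the same pattern should carry over, but that is an assumption that has not been tested with GRASS here.

```python
import subprocess
import sys

# Stand-ins for the two modules (assumption): the producer writes a few MB
# of "x|y|value" lines, well beyond a typical OS pipe buffer; the consumer
# counts the lines it receives on stdin.
producer_cmd = [sys.executable, "-c",
                "for i in range(200000): print('%d|%d|1' % (i, i))"]
consumer_cmd = [sys.executable, "-c",
                "import sys; print(sum(1 for _ in sys.stdin))"]


def run_pipeline(first_cmd, second_cmd):
    """Connect first_cmd's stdout to second_cmd's stdin and wait for both."""
    first = subprocess.Popen(first_cmd, stdout=subprocess.PIPE)
    second = subprocess.Popen(second_cmd, stdin=first.stdout,
                              stdout=subprocess.PIPE)
    # Close the parent's copy of the pipe so that only the second process
    # holds the read end; the first process then notices if the reader exits.
    first.stdout.close()
    # communicate() waits for the second process and collects only its own
    # (small) output; the large stream never passes through the parent.
    out, _ = second.communicate()
    # The first process's stdout is drained by the second process, not by
    # the parent, so waiting for it here cannot block on a full pipe.
    first_rc = first.wait()
    return first_rc, second.returncode, out


if __name__ == "__main__":
    rc1, rc2, out = run_pipeline(producer_cmd, consumer_cmd)
    print(rc1, rc2, out.decode().strip())
```

If the second process is started without stdout=PIPE or stderr=PIPE, communicate() has nothing to buffer and returns (None, None), which matches the reading of the "data read" note above.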
Maybe it is safer to store the output of v.out.ascii in a temporary file, then use that file as input for r.in.xyz. You can then not only check if v.out.ascii finished successfully, but also use the percent option of r.in.xyz to reduce memory consumption for large computational regions. The percent option does not work when piping input to r.in.xyz.

Markus M

>
> Vaclav
>
> [1] https://docs.python.org/2/library/subprocess.html#subprocess.Popen.wait
> [2] https://docs.python.org/2/library/subprocess.html#subprocess.Popen.communicate
>
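
The temporary-file approach suggested above can be sketched roughly as follows, again with plain subprocess and Python one-liners standing in for the two modules. The names run_via_tempfile, producer_cmd and make_consumer_cmd are hypothetical. In a real script, the temporary file would be what r.in.xyz receives as input, which is what makes the percent option usable.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for the exporting module (assumption): prints 1000 lines.
producer_cmd = [sys.executable, "-c",
                "for i in range(1000): print('%d|%d|1' % (i, i))"]


def make_consumer_cmd(path):
    # Stand-in for the importing module (assumption): it reads the file
    # named on its command line and prints the number of lines.
    return [sys.executable, "-c",
            "import sys; print(sum(1 for _ in open(sys.argv[1])))", path]


def run_via_tempfile(first_cmd, second_cmd_factory):
    """Write first_cmd's output to a temp file, then run the second command on it."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "wb") as tmp:
            # check_call raises CalledProcessError on a non-zero exit code,
            # so the second step never runs on incomplete data.
            subprocess.check_call(first_cmd, stdout=tmp)
        # The file is complete and closed here; the second process reads it
        # at its own pace, so no pipe buffer is involved.
        return subprocess.check_output(second_cmd_factory(path))
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(run_via_tempfile(producer_cmd, make_consumer_cmd).decode().strip())
```

The trade-off is extra disk I/O and a cleanup step in exchange for a clear success check on the first process before the second one starts.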
_______________________________________________
grass-dev mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/grass-dev
