On 04:25 pm, [email protected] wrote:
I'm bumping this PEP again in hopes of getting some feedback.
Thanks,
Eric
On Tue, Sep 8, 2009 at 23:52, Eric Pruitt <[email protected]> wrote:
PEP: 3145
Title: Asynchronous I/O For subprocess.Popen
Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson
Type: Standards Track
Content-Type: text/plain
Created: 04-Aug-2009
Python-Version: 3.2
Abstract:
    In its present form, the subprocess.Popen implementation is prone to
    deadlocking and blocking of the parent Python script while waiting on data
    from the child process.
Motivation:
    A search for "python asynchronous subprocess" will turn up numerous
    accounts of people wanting to execute a child process and communicate with
    it from time to time, reading only the data that is available instead of
    blocking to wait for the program to produce data [1] [2] [3].  The current
    behavior of the subprocess module is that when a user sends or receives
    data via the stdin, stderr and stdout file objects, deadlocks are common
    and documented [4] [5].  While communicate can be used to alleviate some of
    the buffering issues, it will still cause the parent process to block while
    attempting to read data when none is available to be read from the child
    process.
Rationale:
    There is a documented need for asynchronous, non-blocking functionality in
    subprocess.Popen [6] [7] [2] [3].  Inclusion of the code would improve the
    utility of the Python standard library that can be used on Unix-based and
    Windows builds of Python.  Practically every I/O object in Python has a
    file-like wrapper of some sort.  Sockets already act as such and for
    strings there is StringIO.  Popen can be made to act like a file by simply
    using the methods attached to the subprocess.Popen.stderr, stdout and
    stdin file-like objects.  But when using the read and write methods of
    those objects, you do not have the benefit of asynchronous I/O.  In the
    proposed solution the wrapper wraps the asynchronous methods to mimic a
    file object.
Reference Implementation:
    I have been maintaining a Google Code repository that contains all of my
    changes including tests and documentation [9] as well as a blog detailing
    the problems I have come across in the development process [10].

    I have been working on implementing non-blocking asynchronous I/O in the
    subprocess.Popen class as well as a wrapper class for subprocess.Popen
    that makes it so that an executed process can take the place of a file by
    duplicating all of the methods and attributes that file objects have.
"Non-blocking" and "asynchronous" are actually two different things.
From the rest of this PEP, I think only a non-blocking API is being
introduced. I haven't looked beyond the PEP, though, so I might be
missing something.
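One common reading of that distinction, sketched in present-day Python: a
non-blocking call returns immediately whether or not data is ready, while an
asynchronous operation is started and its completion is delivered later. This
is an illustration only and is not taken from the PEP. os.set_blocking and
asyncio both postdate this thread, the function names are made up, and the
non-blocking half assumes a Unix pipe.

```python
import asyncio
import os
import subprocess
import sys

# Child that stays silent for a moment, then prints one line.
CHILD = [sys.executable, "-c", "import time; time.sleep(0.3); print('done')"]


def nonblocking_poll():
    """Non-blocking: the read returns at once, with or without data."""
    child = subprocess.Popen(CHILD, stdout=subprocess.PIPE)
    fd = child.stdout.fileno()
    os.set_blocking(fd, False)
    try:
        first = os.read(fd, 4096)
    except BlockingIOError:
        first = b""  # nothing ready yet; the caller has to come back later
    child.wait()
    rest = os.read(fd, 4096)  # the child's output is sitting in the pipe now
    child.stdout.close()
    return first, rest


async def asynchronous_read():
    """Asynchronous: the read is started and we are resumed when it completes."""
    child = await asyncio.create_subprocess_exec(
        *CHILD, stdout=asyncio.subprocess.PIPE
    )
    data = await child.stdout.read()  # other tasks may run while this waits
    await child.wait()
    return data
```

In the first function the caller does the polling; in the second the event
loop does the waiting and hands back the result.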
    There are two base functions that have been added to the subprocess.Popen
    class: Popen.send and Popen._recv, each with two separate implementations,
    one for Windows and one for Unix-based systems.  The Windows
    implementation uses ctypes to access the functions needed to control pipes
    in the kernel32 DLL in an asynchronous manner.  On Unix-based systems,
    the Python interface for file control serves the same purpose.  The
    different implementations of Popen.send and Popen._recv have identical
    arguments to make code that uses these functions work across multiple
    platforms.
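A minimal sketch of how the Unix side might look, assuming "the Python
interface for file control" means the fcntl module and that the pipes are
switched to O_NONBLOCK. The free functions send and recv below are
hypothetical helpers, not the PEP's methods, and their signatures are
assumptions.

```python
import fcntl
import os
import subprocess
import sys


def set_nonblocking(fd):
    """Add O_NONBLOCK to the descriptor's status flags via fcntl."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def send(child, data):
    """Hypothetical helper: write without blocking; return bytes accepted."""
    fd = child.stdin.fileno()
    set_nonblocking(fd)
    try:
        return os.write(fd, data)
    except BlockingIOError:
        return 0  # the pipe buffer is full; try again later


def recv(child, maxsize=4096):
    """Hypothetical helper: read without blocking.

    Returns b"" when nothing is ready. os.read also returns b"" at end of
    file, so this sketch cannot tell the two cases apart.
    """
    fd = child.stdout.fileno()
    set_nonblocking(fd)
    try:
        return os.read(fd, maxsize)
    except BlockingIOError:
        return b""
```

Both helpers bypass the buffered file objects and work on the raw
descriptors, so they should not be mixed with child.stdout.read() or
child.stdin.write() on the same process.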
Why does the method for non-blocking read from a pipe start with an "_"?
This is the convention (widely used) for a private API. The name also
doesn't suggest that this is the non-blocking version of reading.
Similarly, the name "send" doesn't suggest that this is the non-blocking
version of writing.
    The Popen._recv function requires the pipe name to be passed as an
    argument, so there is also a Popen.recv function that selects stdout as
    the pipe for Popen._recv by default.  Popen.recv_err selects stderr as
    the pipe by default.  "Popen.recv" and "Popen.recv_err" are much easier
    to read and understand than "Popen._recv('stdout' ..." and
    "Popen._recv('stderr' ..." respectively.
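The delegation described here could be sketched roughly as follows. The class
name RecvPopen, the maxsize parameter, and the use of a zero-timeout select()
as the readiness check are all assumptions made for illustration; the PEP text
does not give the real signatures, and select() on pipes works on Unix only.

```python
import os
import select
import subprocess
import sys


class RecvPopen(subprocess.Popen):
    """Hypothetical subclass for illustration; not the reference implementation."""

    def _recv(self, which, maxsize=4096):
        # `which` names the pipe attribute to read from: "stdout" or "stderr".
        pipe = getattr(self, which)
        if pipe is None:
            return None  # that stream was not redirected to a pipe
        # A zero timeout turns select() into a pure readiness poll.
        ready, _, _ = select.select([pipe], [], [], 0)
        if not ready:
            return b""
        return os.read(pipe.fileno(), maxsize)

    def recv(self, maxsize=4096):
        """Read whatever is available on stdout without waiting."""
        return self._recv("stdout", maxsize)

    def recv_err(self, maxsize=4096):
        """Read whatever is available on stderr without waiting."""
        return self._recv("stderr", maxsize)
```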
What about reading from other file descriptors? subprocess.Popen allows
arbitrary file descriptors to be used. Is there any provision here for
reading and writing non-blocking from or to those?
    Since the Popen._recv function does not wait on data to be produced
    before returning a value, it may return empty bytes.  Popen.asyncread
    handles this issue by returning all data read over a given time
    interval.
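The PEP text does not define asyncread, so the following is only a guess at
the behaviour it describes: keep collecting whatever arrives until a time
budget runs out. The function name read_for and its parameters are
hypothetical, and the select() call limits the sketch to Unix pipes.

```python
import os
import select
import subprocess
import sys
import time


def read_for(pipe, timeout, maxsize=4096):
    """Hypothetical helper: gather all data arriving on `pipe` within
    `timeout` seconds, stopping early if the child closes its end."""
    fd = pipe.fileno()
    deadline = time.monotonic() + timeout
    chunks = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # time budget used up
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            break  # the interval elapsed with no further data
        data = os.read(fd, maxsize)
        if not data:
            break  # end of file: the child closed its end of the pipe
        chunks.append(data)
    return b"".join(chunks)
```

A caller that gets b"" back still cannot tell "no output yet" from "child
finished" without also checking poll() on the process.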
Oh. Popen.asyncread? What's that? This is the first time the PEP
mentions it.
    The ProcessIOWrapper class uses the asyncread and asyncwrite functions to
    allow a process to act like a file so that there are no blocking issues
    that can arise from using the stdout and stdin file objects produced from
    a subprocess.Popen call.
What's the ProcessIOWrapper class? And what's the asyncwrite function?
Again, this is the first time it's mentioned.
So, to sum up, I think my main comment is that the PEP seems to be
missing a significant portion of the details of what it's actually
proposing. I suspect that this information is present in the
implementation, which I have not looked at, but it probably belongs in
the PEP.
Jean-Paul