Thanks for your thoughtful reply, Dima!  I work on embedded systems as a
firmware engineer and I've found that all of our products use serial UART
and as you say, 99% of the HW is USB dongles, primarily from FTDI or
Silabs.  It seems that the development of the drivers was focused on using
the Windows/Linux serial abstractions to present a comfortable interface to
the user.  I am not interested in second guessing this decision and would
rather leave the USB layer alone and instead work with the drivers as
provided.

I believe that I can best explain my desire for an asyncio implementation
with a few examples.  Then I will work through the asyncio source code to
further my understanding of this abstraction (which is perhaps my favorite
in programming!).

The ubiquitous implementation of serial comms in python, pySerial, allows
the user to write() bytes to a serial interface.  Because the buffer setup
is rather large, 4096 bytes, it is likely that the write function blocks
for the amount of time it takes to copy the transmitted bytes to the
outgoing buffer and will return well before the serial device has
transmitted the signal on its TX line.  To obtain a kind of
synchronization, the library defines a flush() method that waits until the
OS TX buffer is empty.  In the Windows implementation this is a polling
loop that sleeps in 50 ms intervals, while on Linux it blocks on tcdrain().
Waiting on the Windows OS event instead, using
loop._proactor.wait_for_handle(overlapped_write.hEvent)
lets the Windows implementation replace the polling loop with event
signaling.
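
To make the contrast concrete, here is a minimal sketch of the two waiting
styles.  FakePort is a hypothetical stand-in that only mimics pySerial's
out_waiting property, and an asyncio.Event stands in for the OS event; none
of this is pySerial's or asyncio's actual code.

```python
import asyncio
import time


class FakePort:
    """Hypothetical port whose TX buffer empties after drain_seconds."""

    def __init__(self, drain_seconds):
        self._empty_at = time.monotonic() + drain_seconds

    @property
    def out_waiting(self):
        # 1 while "bytes" are still queued, 0 once the buffer has drained
        return 0 if time.monotonic() >= self._empty_at else 1


def flush_by_polling(port, interval=0.05):
    """Block the calling thread, re-checking the buffer every interval."""
    polls = 0
    while port.out_waiting:
        time.sleep(interval)
        polls += 1
    return polls


async def flush_by_event(tx_empty):
    """Suspend the coroutine until the TX-empty event is signaled."""
    await tx_empty.wait()
    return True


async def demo_event():
    tx_empty = asyncio.Event()
    # call_later plays the role of the driver signaling the event
    asyncio.get_running_loop().call_later(0.12, tx_empty.set)
    return await flush_by_event(tx_empty)


polls = flush_by_polling(FakePort(0.12))
signaled = asyncio.run(demo_event())
```

The polling version wakes up several times and can overshoot by up to one
interval; the event version wakes up once, when it is signaled.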

Since pySerial isn't asyncio-ready, programmers needing some concurrency in
python would use threads or asyncio thread pools.  In the case of FW
engineers working with embedded systems, we may like to have an async
generator "task" that is reading a byte stream from a serial device as well
as an awaitable write method that completes when bytes have actually been
put on the transport.  It seems to me that Windows/POSIX OS are each
handling the completion of read/write events and that therefore python
implementations that create threads are inelegant.  For example, to adapt
pySerial to be asyncio friendly, there is a new project, aioserial
<https://github.com/changyuheng/aioserial.py>, that I will be contributing
to.  So far it uses a thread pool to wrap the old pySerial library; it wraps
function calls in loop.run_in_executor() in order to return awaitables.  My
hope is that with guidance from the asyncio team I can bring a
well-supported async implementation to Python serial IO.
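
For reference, the thread-pool approach looks roughly like the sketch below.
FakeBlockingPort is a hypothetical stand-in for a blocking serial object,
and read_chunks/write_async are names I made up for illustration; this is
not aioserial's actual code.

```python
import asyncio
import io


class FakeBlockingPort:
    """Hypothetical stand-in for a blocking serial object."""

    def __init__(self, incoming):
        self._rx = io.BytesIO(incoming)
        self.sent = bytearray()

    def read(self, size=1):
        return self._rx.read(size)

    def write(self, data):
        self.sent += data
        return len(data)


async def read_chunks(port, size=4):
    """Async generator: each blocking read runs in the default executor."""
    loop = asyncio.get_running_loop()
    while True:
        chunk = await loop.run_in_executor(None, port.read, size)
        if not chunk:  # the fake port returns b"" once its data is exhausted
            return
        yield chunk


async def write_async(port, data):
    """Awaitable write: completes when the blocking write() call returns."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, port.write, data)


async def demo():
    port = FakeBlockingPort(b"hello world")
    written = await write_async(port, b"AT\r\n")
    chunks = [chunk async for chunk in read_chunks(port)]
    return written, chunks


written, chunks = asyncio.run(demo())
```

Note that the awaited write only tells us the blocking call returned, not
that the bytes have left the TX line, and every pending call occupies a
worker thread while it waits.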

At this point, I admit that I may have lost perspective by working on
systems with 32K of RAM where each thread is absolutely precious, powerful,
and dangerous!  Perhaps these days it is OK to spawn new threads as needed
at runtime to wait on an OS thread that is itself waiting on a HW event.
If the consensus is that a python-thread-based approach is best, then we
don't need to look much further than wrapping IO in loop.run_in_executor()!
Nevertheless, I will continue to explore the implementation since I am
always interested in energy efficiency and beautiful abstraction.

My working implementation uses the _wait_for_handle() method of the
IocpProactor class defined here
<https://github.com/python/cpython/blob/a0ad63e70e3682cdf7e87e28091bb54fe12a2d4e/Lib/asyncio/windows_events.py#L704>.
Let's see how/why the proof of concept is working.

wait_for_handle() receives an overlapped.hEvent created with the win32
CreateEvent
<https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-createeventa>
API (note that a total of two would be created, one for reads and one for
writes).  The event is set up for signaling by calling SetCommMask
<https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-setcommmask>
with the flags EV_RXFLAG | EV_TXEMPTY during initialization and then calling
WaitCommEvent
<https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-waitcommevent#parameters>
with a reference to the overlapped structure each time new IO begins.  This
causes the overlapped.hEvent that wait_for_handle() receives to be signaled
when the OS completes the IO.

_wait_for_handle() calls RegisterWaitWithQueue() which wraps the win32 API
RegisterWaitForSingleObject
<https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-registerwaitforsingleobject>.
The important bit here is that this API allows for registration of a
callback function to fire on completion of the event.  This callback will
be called with the lpParameter argument containing struct PostCallbackData
data = {CompletionPort, Overlapped}, *pdata; (line 355 of overlapped.c).
And so it gets called with the completion port of self._iocp and a unique
address, ov.address, which is NOT the overlapped structure we are
originally awaiting, according to the note at line 714:

    # We only create ov so we can use ov.address as a key for the cache.
    ov = _overlapped.Overlapped(NULL)

So we see how a callback is registered by the IocpProactor event loop.  Now
let's understand how this causes the "awaitable future" to complete at the
python layer.

A "future" is created: f = _WaitHandleFuture(ov, handle, wait_handle, self,
loop=self._loop).  Importantly, this calls the Win32 API CreateEvent
<https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-createeventa>.
For my purposes this seems redundant at first glance, but I am afraid that
it may be necessary due to the simple fact that WaitCommEvent does not take
a callback!  I will have to investigate further.  This "future" is an
instance of a subclass of _BaseWaitHandleFuture which defines a _poll()
method utilizing win32 WaitForSingleObject
<https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitforsingleobject>
to poll for signaled state: "If dwMilliseconds is zero, the function does
not enter a wait state if the object is not signaled; it always returns
immediately."

It's a bit hard to track down, but if I am understanding correctly, the
"super loop" of the IocpProactor is its own _poll().  It starts by calling
GetQueuedCompletionStatus
<https://docs.microsoft.com/en-us/windows/win32/api/ioapiset/nf-ioapiset-getqueuedcompletionstatus>
with an infinite timeout.  This may answer one of my main curiosities: is
this how the asyncio loop waits for multiple events from multiple threads
without creating waiting threads of its own?  Anyway, it then looks up the
entry stored in self._cache under the completed address and retrieves the
"future", the blank overlapped used as the cache key, 0, and the
finish_wait_for_handle(trans, key, ov) function created way back in
_wait_for_handle().
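
My mental model of that dispatch step can be sketched as a toy.  This is
NOT the CPython implementation: ToyProactor and its methods are
hypothetical, a queue.Queue plays the role of the completion port, and the
real loop blocks in GetQueuedCompletionStatus rather than draining a Python
queue.

```python
import asyncio
import queue


class ToyProactor:
    """Toy model of a completion-port dispatcher (illustration only)."""

    def __init__(self):
        self._completions = queue.Queue()  # stands in for the IOCP
        self._cache = {}                   # key -> (future, finish callback)

    def register(self, key, future, finish):
        # Remember which future and callback belong to this key
        self._cache[key] = (future, finish)

    def post(self, key):
        # The "OS side" posts the key when the awaited event fires
        self._completions.put(key)

    def poll(self):
        # Drain posted keys and resolve the matching futures
        while True:
            try:
                key = self._completions.get_nowait()
            except queue.Empty:
                return
            future, finish = self._cache.pop(key)
            if not future.done():
                future.set_result(finish())


def demo():
    loop = asyncio.new_event_loop()
    try:
        proactor = ToyProactor()
        future = loop.create_future()
        proactor.register(0x1234, future, lambda: True)
        proactor.post(0x1234)  # pretend the event was signaled
        proactor.poll()
        return future.result()
    finally:
        loop.close()


toy_result = demo()
```

The point of the toy is only that one waiting call can service many
registered keys, which is why no per-event waiting thread is needed at the
python layer.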

This callback wraps the default implementation of
_BaseWaitHandleFuture._poll(), which wraps WaitForSingleObject, discussed
above, and returns True if the event is signaled or False otherwise (I
believe False would be an error condition?).  Recall that in my
implementation, "event" at this stage refers to an EV_TXEMPTY or EV_RXCHAR
event, for example, set up by WaitCommEvent
<https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-waitcommevent#parameters>
and SetCommMask earlier.  The future's set_result() will be called with True
and appended to self._results.  Recall that wait_for_handle() returned this
very same future to my application layer earlier, so the call to set_result()
will cause the application's wait to end.
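
A minimal sketch of this round trip, without any serial hardware, is below.
It is Windows-only and leans on private CPython internals (_overlapped,
_winapi, loop._proactor), so treat it as an assumption-laden illustration
that may break between Python versions.  A timer calling SetEvent stands in
for the serial driver, and wait_on_win32_event is a name I made up.

```python
import asyncio
import sys


async def wait_on_win32_event(timeout=1.0):
    """Await a manual-reset Win32 event through the proactor (Windows only)."""
    import _overlapped  # private CPython modules, present on Windows only
    import _winapi

    loop = asyncio.get_running_loop()  # expected to be a ProactorEventLoop
    event = _overlapped.CreateEvent(None, True, False, None)
    try:
        # Stand-in for the driver: signal the event shortly after we wait
        loop.call_later(0.1, _overlapped.SetEvent, event)
        # Resolves to True if signaled, False if the timeout expires first
        return await loop._proactor.wait_for_handle(event, timeout)
    finally:
        _winapi.CloseHandle(event)


# On other platforms the sketch is skipped and the result stays None
result = asyncio.run(wait_on_win32_event()) if sys.platform == "win32" else None
```

As far as I can tell from the docstring of wait_for_handle(), a False
result means the timeout elapsed before the handle was signaled.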

Although there may be gaps in my understanding of the asyncio IO Completion
Ports proactor implementation, by following the code I am confident that my
usage of IocpProactor.wait_for_handle() does not create threads in the
python layer.  Without this implementation, the programmer wishing to
manage concurrency with serial IO must resort to 1) creating and managing
an extra thread for each IO direction and device or 2) manually wrapping
the serial IO using loop.run_in_executor() or 3) using the aioserial
library that abstracts 2) for them.

I think that creating, managing, and destroying threads only to wait on a
few bytes to arrive over a 10KBps transport is overkill.

Is it possible that there is an approach better than using
wait_for_handle()?  For example, the loop.add_reader(fd, callback, *args)
API seems to satisfy my requirements but is not supported by IocpProactor
<https://docs.python.org/3/library/asyncio-platforms.html#windows>.  If
there is interest, I could look into adding IocpProactor support for that
API.  There is also the Streams
<https://docs.python.org/3/library/asyncio-platforms.html#windows>
abstraction that seems appropriate, but I could not figure out how to hook
into it with SetCommMask, WaitCommEvent, and the overlapped structures.
Yet another idea is to take what I have learned from the IocpProactor
internals and copy and expose them in simplified form for my own
implementation, though I'd still need a nice way to throw them on the loop.
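
For comparison, this is roughly what the add_reader() style looks like where
it is supported.  It is a sketch under assumptions: an os.pipe() stands in
for the serial file descriptor, and it only runs on POSIX selector loops, so
it is skipped on Windows.

```python
import asyncio
import os
import sys


async def read_once_with_add_reader():
    """Resolve a future from a readiness callback on a file descriptor."""
    loop = asyncio.get_running_loop()
    read_fd, write_fd = os.pipe()  # stand-in for a serial port's fd
    received = loop.create_future()

    def on_readable():
        # Called by the loop when read_fd has data; no extra thread involved
        loop.remove_reader(read_fd)
        received.set_result(os.read(read_fd, 64))

    loop.add_reader(read_fd, on_readable)
    os.write(write_fd, b"ping")  # pretend the device sent some bytes
    try:
        return await asyncio.wait_for(received, timeout=1.0)
    finally:
        os.close(read_fd)
        os.close(write_fd)


# Skipped on Windows, where add_reader() is unavailable on the proactor loop
posix_result = (
    asyncio.run(read_once_with_add_reader()) if sys.platform != "win32" else None
)
```

Something with this shape on top of IocpProactor is what I am hoping to end
up with for serial handles.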

A big thanks for following along and aiding my understanding of the asyncio
paradigm!

Cheers,
J.P. Hutchins

P.S.: I am focused on Windows because I am not so worried about the POSIX
implementation ;).  Embedded always has Windows running anyway.


On Wed, Aug 31, 2022 at 7:10 PM Dima Tisnek <dim...@gmail.com> wrote:

> A few thoughts from someone who worked with serial ports, parallel
> ports and USB and covered Linux and Windows for some of these.
>
> 1. Serial ports are dead
> 2. UNIX and Windows implementations are fundamentally different.
> 3. Even within UNIX, there's quite a variety
>
> 1.
> The hardware serial ports still exist on some rare PC motherboards,
> but it's quite rare to actually use those.
> Instead, there are a lot of other ports where serial port abstraction
> can be used, in chronological order:
> * USB serial dongles
> * USB devices that integrate a microprocessor that's connected via its
> serial interface
> * USB devices that integrate a microprocessor with USB stack that
> fakes a serial port
> * Bluetooth devices following the above
> * Bluetooth devices with modem (acm, not serial port) interface
>
> Access to both USB and Bluetooth is done differently now, UMDF for
> USB, and I think something similar for bluetooth,
>
> 2.
> UNIX APIs are pretty consistent wrt. file descriptor use, even when
> there are major gotchas in the kernel mode (serial vs tty for
> example).
> Windows APIs are frankly all over the place. Their pipes are not the
> same as pipes, etc. Their UNIX-like APIs only work so far.
> For a random example, see e.g.
> https://github.com/microsoft/terminal/issues/262
>
> 3.
> There's classical UNIX, but then there were tons of improvements:
> Linux got epoll, aio, io_submit...
> Mac got AsyncBytes and something or other underneath
> *BSD got something or other, but a bit differently
> Thus, a "good" asyncio loop implementation is likely to use
> OS-specific primitives
>
> So, where does it leave you?
> If your aim is to contribute to asyncio, may I suggest that you find
> another target than serial interfaces.
> If your aim is to support some specific device -- follow how that
> device is connected to the machine: ioports? iomem? usb? bt? etc.
> If your aim is to achieve high-bandwidth or low-latency -- get close to
> hardware
> If your aim is to support, let's say 100 ports at once -- one of the
> two approaches above
> If I couldn't guess your aim, please explain why `asyncio` in the first
> place.
>
> Cheers,
> Dima Tisnek
>
> On Thu, Sep 1, 2022 at 2:58 AM J.P. Hutchins <jphutch...@gmail.com> wrote:
> >
> > Greetings!
> >
> > I would like to modify/replace an existing library, pySerial, to use
> asyncio in Windows/Mac/Linux.  I have a Windows implementation working by
> "listening for an event" like this:
> >
> > read_future = loop._proactor.wait_for_handle(overlapped_read.hEvent)
> >
> > Where overlapped_read is the OVERLAPPED structure (via ctypes or
> pywin32) and the event is setup previously, e.g. "received chars on the
> serial port" event here.
> >
> > My question is in regards to the best practices for awaiting an OS event
> providing for the most efficient and maintainable implementation.
> Reference to other multi-platform libraries or builtins that accomplish
> similar would be appreciated.
> >
> > Thanks for your time,
> > J.P. Hutchins
> > _______________________________________________
> > Async-sig mailing list -- async-sig@python.org
> > To unsubscribe send an email to async-sig-le...@python.org
> > https://mail.python.org/mailman3/lists/async-sig.python.org/
> > Code of Conduct: https://www.python.org/psf/codeofconduct/
>