What I was thinking of was actually a lower level of the HTTP
infrastructure, where you have to write something that parses the HTTP
protocol (and things like form input or other common request content
types). I'm personally pretty happy with the pull style HTTP parsing I
wrote as an example
(http://code.google.com/p/tulip/source/browse/examples/fetch3.py#112
-- note that this is a client but the server-side header parsing is
about the same).
Looks nice and clean. So this relies on StreamReader for translating
the push style that comes out of the low-level transport
("data_received") into pull-style asyncio.StreamReader.readXXX() calls. Nice.
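For illustration, here is a minimal, self-contained sketch of that push-to-pull translation (using the modern async/await spelling rather than yield from). The feed_data() call stands in for what a transport's data_received() would do:

```python
import asyncio

async def main():
    reader = asyncio.StreamReader()
    # Push side: this is what a transport's data_received() does.
    reader.feed_data(b'HTTP/1.1 200 OK\r\nContent-Length: 0\r\n')
    reader.feed_eof()
    # Pull side: coroutine code simply asks for the next line.
    line1 = await reader.readline()
    line2 = await reader.readline()
    return line1, line2

line1, line2 = asyncio.run(main())
print(line1)   # b'HTTP/1.1 200 OK\r\n'
print(line2)   # b'Content-Length: 0\r\n'
```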
I thought about this a bit more and I think in the end it comes down
to one's preferred style for writing parsers. Take a parser for a
language like Python -- you can write it in pull style (the lexer
reads a character or perhaps a line at a time, the parser asks the
lexer for a token at a time) or in push style, using a parser
generator (like CPython's parser does). Actually, even there, you can
use one style for the lexer and another style for the parser.
Interesting analogy. Yes, it seems parsing a file in a language/syntax
is similar to protocol parsing of a wire-level stream transport. I
wonder about the "sending leg": with language parsers, the output would
probably be the AST. With network protocols, it's more about producing
a 2nd stream that again conforms to the same "syntax", for sending to
the other peer.
Using push style, the state machine ends up being represented
explicitly in the form of state variables, e.g. "am I parsing the
status line", "am I parsing the headers", "have I seen the end of the
headers", in addition to some buffers holding a representation of the
stuff you've already parsed (completed headers, request
method/path/version) and the stuff you haven't parsed yet (e.g. the
next incomplete line). Typically those have to be represented as
instance variables on the Protocol (or some higher-level object with a
similar role).
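A minimal sketch of that push style, with the parse state held in instance variables exactly as described (the class name and buffering strategy are invented for illustration):

```python
import asyncio

class PushHeaderProtocol(asyncio.Protocol):
    """Push style: the parse state lives in instance variables."""
    def __init__(self):
        self.buffer = b''          # bytes not yet parsed
        self.status_line = None    # None until the status line is seen
        self.headers = {}
        self.headers_done = False

    def data_received(self, data):
        self.buffer += data
        while not self.headers_done and b'\r\n' in self.buffer:
            line, _, self.buffer = self.buffer.partition(b'\r\n')
            text = line.decode('latin-1')
            if self.status_line is None:   # state: expecting status line
                self.status_line = text
            elif text == '':               # state: blank line ends headers
                self.headers_done = True
            else:                          # state: inside the header block
                name, _, value = text.partition(':')
                self.headers[name.strip().lower()] = value.strip()

# Drive it directly, in two arbitrary chunks, to show that the
# explicit state survives partial reads.
p = PushHeaderProtocol()
p.data_received(b'HTTP/1.1 200 OK\r\nContent-Ty')
p.data_received(b'pe: text/html\r\n\r\n')
print(p.status_line, p.headers, p.headers_done)
```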
Using pull style, you can often represent the state implicitly in the
form of program location; e.g. an HTTP request/response parser could
start with a readline() call to read the initial request/response,
then a loop reading the headers until a blank line is found, perhaps
an inner loop to handle continuation lines. The buffers may be just
local variables.
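A sketch of that pull style, assuming asyncio's StreamReader and the modern async/await spelling. Note how "reading the status line", "inside the headers", and "headers done" are never named as states; they are just program locations, and the buffers are local variables:

```python
import asyncio

async def read_response_headers(reader):
    # Pull style: each "state" of the parse is a program location.
    status_line = (await reader.readline()).decode('latin-1').rstrip('\r\n')
    headers = {}
    while True:
        line = (await reader.readline()).decode('latin-1').rstrip('\r\n')
        if not line:          # blank line ends the header section
            break
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return status_line, headers

async def main():
    # Feed a canned response so the sketch runs without a socket.
    reader = asyncio.StreamReader()
    reader.feed_data(b'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n')
    reader.feed_eof()
    return await read_response_headers(reader)

status, headers = asyncio.run(main())
print(status)                    # HTTP/1.1 200 OK
print(headers['content-type'])   # text/html
```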
The ability to represent state machine states implicitly in program
location instead of explicit variables indeed seems higher-level / more
abstracted. I have never looked at it that way .. very interesting.
I am wondering what happens if you take timing constraints into account.
E.g. with WebSocket, for DoS protection, one might want the initial
opening handshake to finish in <N seconds. Hence you want to check
after N seconds whether the state "HANDSHAKE_FINISHED" has been
reached. A
yield from socket.read_handshake()
(simplified) will however just "block" indefinitely. So I need a 2nd
coroutine for the timeout. And the timeout will need to check .. an
instance variable for state. Or can I have a timing-out yield from?
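For what it's worth, asyncio does offer a "timing out yield from": asyncio.wait_for() wraps a coroutine with a timeout and cancels it when the deadline passes, so no second coroutine or state variable is needed. A sketch (read_handshake() is a hypothetical stand-in that never completes; modern async/await spelling):

```python
import asyncio

async def read_handshake():
    # Hypothetical stand-in for the WebSocket opening handshake:
    # here it simply never completes.
    await asyncio.sleep(3600)

async def main():
    try:
        # The timeout wraps the pull-style call itself; no separate
        # watchdog coroutine or HANDSHAKE_FINISHED flag is needed.
        await asyncio.wait_for(read_handshake(), timeout=0.1)
        return 'handshake ok'
    except asyncio.TimeoutError:
        return 'handshake timed out'

print(asyncio.run(main()))   # handshake timed out
```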
I've only written a small amount of Android code but I sure remember
that it felt nearly impossible to follow the logic of a moderately
complex Android app -- whereas in pull style your abstractions nicely
correspond to e.g. classes or methods (or just functions), in Android
even the simplest logic seemed to be spread across many different
classes, with the linkage between them often expressed separately
(sometimes even in XML or some other dynamic configuration that
requires the reader to switch languages). But I'll add that this was
perhaps due to being a beginner in the Android world (and I haven't
taken it up since).
That's also my experience (though I too have limited exposure to
Android): it can get unwieldy pretty quickly.
How would you do a pull-style UI toolkit? Transforming each push-style
callback for UI widgets into pull-style code gives either

yield from button1.onclick()
# handle button1 click

or

evt = yield from ui.onevent()
if evt.target == "button1" and evt.type == "click":
    # handle button1 click
The latter leads to one massive, monolithic code block handling all UI
interaction. The former leads to many small "sequential" looking code
pieces .. similar to callbacks. And those "distributed" code pieces
somehow need to interact with each other.
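A minimal sketch of the former style, with all names invented: each widget owns a queue, the (still push-style) toolkit pushes events in, and a per-widget coroutine pulls them out sequentially.

```python
import asyncio

class Button:
    """Toy widget: clicks are pushed in, a coroutine pulls them out."""
    def __init__(self, name):
        self.name = name
        self._clicks = asyncio.Queue()

    def click(self):                 # called by the (push-style) toolkit
        self._clicks.put_nowait(self.name)

    async def onclick(self):         # awaited by (pull-style) handler code
        return await self._clicks.get()

async def main():
    button1 = Button('button1')
    log = []

    async def handler():
        # Pull style: sequential code, state in the program location.
        await button1.onclick()
        log.append('button1 clicked')

    task = asyncio.create_task(handler())
    button1.click()                  # simulate the user
    await task
    return log

print(asyncio.run(main()))   # ['button1 clicked']
```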
FWIW, the (for me) most comfortable and manageable way of doing UI is
via "reactive programming", e.g. in JavaScript: http://knockoutjs.com/
E.g. say some "x" changes asynchronously (like a UI input field
widget) and some "y" needs to be changed _whenever_ "x" changes (like a
UI label).
In reactive programming, I can basically write code
y = f(x)
and the reactive engine will _analyze_ that code, and hook up push-style
callback code under the hood, so that _whenever_ x changes, f() is
_automatically_ reapplied.
Probably better explained here:
http://peak.telecommunity.com/DevCenter/Trellis
MS also seems to like RP: http://rxpy.codeplex.com/
In UI, this, combined with data binding of widgets to view models, is
a really clean and abstract approach.
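A toy sketch of the idea in Python (real engines like Trellis or Knockout discover the dependencies of f() automatically; here they are declared explicitly for brevity, and all names are invented):

```python
class Cell:
    """Toy reactive cell: setting .value re-runs dependent formulas."""
    def __init__(self, value=None):
        self._value = value
        self._observers = []       # formulas to re-run on change

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        self._value = new
        for recompute in self._observers:
            recompute()

def computed(f, *inputs):
    """Make a Cell that is kept equal to f(*inputs) automatically."""
    out = Cell()
    def recompute():
        out._value = f(*(c.value for c in inputs))
    for c in inputs:
        c._observers.append(recompute)
    recompute()
    return out

x = Cell(2)
y = computed(lambda v: v * 10, x)   # y = f(x)
print(y.value)    # 20
x.value = 5       # push into x ...
print(y.value)    # 50  ... y was recomputed under the hood
```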
Well, in the browser world there's little choice but to continue on
the push-based path that was started over a decade ago. That doesn't
mean it's the best programming paradigm. :-)
No, it doesn't;)
I'd put it like this: the classical UI is push style. As mentioned
above, it would be interesting to see what pull style would look like.
But the reactive style gives me a superior way .. again, just my
personal experience/exposure/taste ..
WebSocket is essentially an HTTP-compatible opening handshake that,
when finished, establishes a bidirectional, full-duplex, reliable,
message-based channel.
So would it make sense to build a (modified) transport/protocol
abstraction on top of that for asyncio? It seems the API can't be the
same as the standard asyncio transport/protocol, because message
framing needs to be preserved, but you could probably start with (or
use a variant of) DatagramTransport and DatagramProtocol.
I have done that for Twisted:
https://github.com/tavendo/AutobahnPython/tree/master/examples/twisted/websocket/wrapping
This allows you to run _any_ stream-based protocol _over_ WebSocket. E.g.
here is a terminal session to Linux from a browser:
http://picpaste.com/Clipboard24-LLwRCPKG.png
In fact, Twisted endpoints are so cool (sorry for mentioning this here)
that I can also run WebSocket over any stream transport, like Unix
domain sockets, pipes, or serial:
https://github.com/tavendo/AutobahnPython/tree/master/examples/twisted/websocket/echo_endpoints
That's the 2nd possibility: capture the essence of WebSocket in a
ReliableOrderedFullDuplexDatagramTransport (just kidding about the name;)
The thing is: the abstracted WebSocket semantics of "reliable, ordered
datagram" fit neither TCP nor UDP.
It does fit SCTP:
http://en.wikipedia.org/wiki/Stream_Control_Transmission_Protocol
SCTP is available natively on some platforms
http://www.freebsd.org/cgi/man.cgi?query=sctp&sektion=4
in which case you need to use a new kernel API (plain POSIX sockets
don't have it), and it can also be layered over UDP, which is used
e.g. with the upcoming WebRTC HTML5 standard:
http://tools.ietf.org/html/draft-ietf-rtcweb-data-channel-06
So I think it really could be worthwhile to define an abstract
interface in asyncio with the appropriate semantics for transports to
implement. Implementing transports could then be WebSocket, SCTP, or
even shared-memory message queues ..
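A sketch of what such an abstract interface might look like, modeled on asyncio's Transport/Protocol pair but with message framing preserved. All names here are invented, and the loopback implementation exists only to demonstrate the contract:

```python
class MessageProtocol:
    """Hypothetical counterpart to asyncio.Protocol for message
    transports: reliable, ordered, but framed (no byte stream)."""
    def connection_made(self, transport):
        pass
    def message_received(self, data):   # one complete message at a time
        pass
    def connection_lost(self, exc):
        pass

class MessageTransport:
    """Hypothetical counterpart to asyncio.Transport: WebSocket, SCTP,
    or a shared-memory queue could all implement this."""
    def send_message(self, data):       # sends one message, never a fragment
        raise NotImplementedError
    def close(self):
        raise NotImplementedError

# In-memory loopback implementation, enough to show the contract.
class LoopbackTransport(MessageTransport):
    def __init__(self, protocol):
        self._protocol = protocol
        protocol.connection_made(self)
    def send_message(self, data):
        self._protocol.message_received(data)  # delivered whole, in order
    def close(self):
        self._protocol.connection_lost(None)

class Recorder(MessageProtocol):
    def __init__(self):
        self.received = []
    def message_received(self, data):
        self.received.append(data)

proto = Recorder()
t = LoopbackTransport(proto)
t.send_message(b'hello')
t.send_message(b'world')
print(proto.received)   # [b'hello', b'world']
```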
Happy new year,
/Tobias