On Tue, Jun 3, 2014 at 6:45 AM, Paul Rubin <no.email@nospam.invalid> wrote:
>> - Thread-safe programming is easy to explain but devilishly
>>   difficult to get right.
>
> I keep hearing that but not encountering it. Yes there are classic
> hazards from sharing mutable state between threads. However, it's
> generally not too difficult to program in a style that avoids such
> sharing. Have threads communicate by message passing with immutable
> data in the messages, and things tend to work pretty straightforwardly.

It's more true on some systems than others. The issues of maintaining
"safe" state are very similar in callback systems and threads; the main
difference is that a single-threaded asyncio system becomes cooperative,
where threading systems are (usually) preemptive. Preemption means you
could get a context switch *anywhere*. (In Python, I think the rule is
that thread switches can happen only between Python bytecodes, but
that's still "anywhere" as far as your code's concerned.) That means you
have to *keep* everything safe, rather than simply get it safe again.

Cooperative multitasking means your function will run to completion
before any other callback happens (or, at least, will get to a clearly
defined yield point). That means you can muck state up all you like, and
then fix it afterwards. In some ways, that's easier; but it has a couple
of risks: firstly, if your code jumps out early somewhere, you might
forget to fix the shared state, and only find out much later; and
secondly, if your function takes a long time to execute, everything else
stalls.

So whichever way you do it, you still have to be careful - just careful
of slightly different things. For instance, you might keep track of
network activity as a potentially slow operation, and make sure you
never block a callback waiting for a socket - but you might do a quick
and simple system call, not realizing that it involves a directory
that's mounted from a remote server. With threads, someone else will get
priority as soon as you block, but with asyncio, you have to be explicit
about everything that's done asynchronously.

Threads are massively simpler if you have a top-down execution model for
a relatively small number of clients.
Works really nicely for a sequence of prompts - you just code it exactly
as if you were using print() and input() and stuff, and then turn
print() into a blocking socket write (or whatever your I/O is done over)
and your input() into a blocking socket read with line splitting, and
that's all the changes you need. (You could even replace the actual
print and input functions, and use a whole block of code untouched.)

Async I/O is massively simpler if you have very little state, and simply
react to stimuli. Every client connects, authenticates, executes
commands, and terminates its connection. If all you need to know is
whether the client's authenticated or not (restricted commandset before
login), asyncio will be really *really* easy, and threads are overkill.
This is even more true if most of your clients are going to be massively
idle most of the time, with just tiny queries coming in occasionally and
getting responded to quickly.

Both have their advantages and disadvantages. Learning both models is,
IMO, worth doing; get to know them, then decide which one suits your
project.

>> Asyncio makes the prototype somewhat cumbersome to write. However,
>> once it is done, adding features, stimuli and states is a routine
>> matter.
>
> Having dealt with some node.js programs and the nest of callbacks they
> morph into as the application gets more complicated, threads have their
> advantages.

I wrote an uberlite async I/O framework for my last job. Most of the
work was done by the lower-level facilities (actual non-blocking I/O,
etc), but basically, what I had was a single callback for each
connection type and a dictionary of state for each connection (with a
few exceptions - incoming UDP has no state, ergo no dict). Worked out
beautifully simple; each run through the callback processed one logical
action (eg a line of text arriving on a socket, terminated by newline),
updated state if required, and returned, back to the main loop.
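
To make the "one callback plus one state dict per connection" shape
concrete, here is a rough sketch. It is not the framework described
above, only an illustration of the idea: the names handle_line, feed,
on_data and connections, and the login/echo commands, are all invented
for the example, and the "login" check is a placeholder with no real
authentication. No actual sockets are involved; on_data() stands in for
whatever the main loop would call when bytes arrive.

```python
def handle_line(state, line):
    """Process one logical action (one line of text); return a reply."""
    command, _, argument = line.partition(" ")
    if command == "login":
        # Placeholder: a real server would verify credentials here.
        state["user"] = argument or None
        return "welcome " + argument if argument else "login needs a name"
    if state.get("user") is None:
        # Restricted command set before login.
        return "please log in first"
    if command == "echo":
        return argument
    return "unknown command: " + command


def feed(state, data, callback=handle_line):
    """Buffer raw bytes in the state dict and run the callback once per
    complete newline-terminated line. Returns the replies produced."""
    state["buffer"] = state.get("buffer", b"") + data
    replies = []
    while b"\n" in state["buffer"]:
        raw, state["buffer"] = state["buffer"].split(b"\n", 1)
        replies.append(callback(state, raw.decode("utf-8").rstrip("\r")))
    return replies


# All per-connection state lives here, keyed by a (hypothetical)
# connection id, rather than in closures or local variables.
connections = {}


def on_data(conn_id, data):
    """What a main loop would call when bytes arrive on a connection."""
    state = connections.setdefault(conn_id, {})
    return feed(state, data)
```

Each call does a bounded amount of work and returns, so a partial line
such as b"login al" just sits in the buffer until the rest arrives, and
nothing about the connection is held anywhere except its dict.
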
Not all asyncio will fit into that sort of structure, but if it does
fit, this keeps everything from getting out of hand. (Plus, keeping
state in a separate dict rather than using closures and local variables
meant I could update code while maintaining state. Not important for
most Python projects, but it was for us.)

Both have their merits.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list