inhahe wrote:
> hi, excuse my noobness, I have a few basic questions about twisted, or
> probably about web servers in general.

There's nothing web-specific in these questions that I see... they apply to
any network service serving requests.

> what is the advantage of using a single-threaded server?
>
> i figured it makes it more scalable because there's too much overhead to
> have a thread for each user when you have many simultaneous users. but a
> friend i'm talking to now says that using i/o blocking threads is
> perfectly scalable for a large number of simultaneous users.

That's basically right. Threads can be a scalability issue, particularly if
you have many connections that are mostly idle -- you end up with a lot of
wasted memory (for stack space).

Another problem with threads is non-determinism. You can't easily construct
a test suite that will find every possible race condition, because a thread
can be pre-empted at any time. In effect you have a state machine with a
massive number of states, many more than necessary. With a single thread,
you can simply and reliably test what happens when events happen in a
particular order.

Personally, I find this latter advantage more compelling. The performance
differences are in many respects minor (and not clearly one way or the
other), especially compared to the overhead of using Python over C/C++. I
find it *much* easier to write and test non-threaded code (and for that
matter, I find it much easier to write and test Python). No point worrying
about performance unless I can be confident in the correctness :)

> if that's true i can only see a disadvantage in using a single-threaded
> server -- having to use deferreds and stuff to make things asynchronous

That is a disadvantage. Creating lots of objects and calling lots of
functions can be a performance issue in Python. Deferreds are *much* nicer
than the obvious alternative (passing callback functions to functions that
produce asynchronous results), though.

Fundamentally, concurrent programming is more complex than non-concurrent.
The question is which tradeoffs suit your problem best.

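To make the testing point above concrete, here is a minimal sketch in plain
Python (it is not Twisted code). It assumes a hypothetical event-driven
handler, Connection, whose methods are all called from one thread, plus a
hypothetical replay() helper. Because nothing can pre-empt the handler, a
test can feed it events in any order it likes and get the same result every
time.

```python
class Connection:
    """Hypothetical event-driven handler; every event arrives on one thread."""

    def __init__(self):
        self.state = "connected"
        self.log = []

    def data_received(self, line):
        if self.state == "closed":
            # Data that shows up after a disconnect is ignored, not processed.
            self.log.append("ignored " + line)
            return
        self.log.append("handled " + line)

    def connection_lost(self):
        self.state = "closed"
        self.log.append("closed")


def replay(events):
    """Feed a fixed sequence of (method name, args) events to a new Connection."""
    conn = Connection()
    for name, args in events:
        getattr(conn, name)(*args)
    return conn.log


# The usual ordering: data arrives, then the connection goes away.
normal = replay([("data_received", ("hello",)), ("connection_lost", ())])

# The awkward ordering: the disconnect is seen first, the data second.
late = replay([("connection_lost", ()), ("data_received", ("hello",))])
```

With threads, the second ordering would only turn up when the scheduler
happened to produce it; here the test picks it explicitly.
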
> i also don't understand how you're supposed to use deferreds
> the twisted doc says deferreds won't *make* your code asynchronous. so
> let's say you have to do an sql query that takes 10 seconds, deferreds
> would be useless for making that not block unless you have a way of making
> that sql query non-blocking already? how is that done? do you run a
> separate thread of your own for each sql query? one thread for all sql
> queries?

You've got it. If you have a blocking API, there's nothing you can do to
make it non-blocking apart from running it in a thread (if it's kind enough
to release the GIL) or running it in a subprocess (if you don't mind the
overhead of spawning another process and the complexity of marshalling
messages to it rather than simply sharing an address space).

Note that a common compromise between “a separate thread of your own for
each sql query” and “one thread for all sql queries” is a thread pool with a
limited number of threads. This is what twisted.enterprise.adbapi basically
does to run SQL using the standard Python DB-API.

> also I wonder in an typical twisted app, just how slow should an operation
> be before you use a deferred? what if a user enters a username and
> password and i have to look that up in the database. do i use a deferred?
> just how bad should the query be before using a deferred?

The precise answer is: it depends. The short answer is: if it does I/O or is
obviously slower than instant, then it's blocking and should be avoided (in
your main thread).

To be precise, it depends on your requirements: basically, what performance
do you need? If a lookup in the database is only, say, 30ms, and you don't
have lots of concurrent requests, and they only need to do that one lookup,
and you only need an average latency for replying to requests of 100ms, then
you'd be pretty comfortable with just blocking for that lookup.

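As a rough illustration of the thread-pool compromise mentioned above (a
limited number of threads running the blocking calls), here is a sketch
using only the Python standard library. It is not how
twisted.enterprise.adbapi is implemented. The blocking_lookup function, the
50ms sleep, the pool size of 3, and the usernames are all assumptions made
up for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_lookup(username):
    """Hypothetical stand-in for a slow, blocking database query."""
    time.sleep(0.05)  # pretend the database takes 50 ms to answer
    return (username, len(username))


# At most 3 worker threads exist, however many lookups are queued.
pool = ThreadPoolExecutor(max_workers=3)

usernames = ["alice", "bob", "carol", "dave", "erin"]

# submit() hands the call to the pool and returns a Future straight away,
# so the submitting thread is not held up by the slow lookup.
futures = [pool.submit(blocking_lookup, name) for name in usernames]

# For the demo we simply wait for each answer in turn. An event loop would
# instead arrange to be notified when each result is ready.
results = [f.result() for f in futures]

pool.shutdown()
```

The point of the cap is that five (or five thousand) queued lookups never
cost more than three threads' worth of stack space.
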
Typically, anything that doesn't return immediately, for some value of
“immediately”, is good to treat as blocking, and thus something to avoid in
your main thread. Small writes to disk are often fast enough to count as
“immediate”. Small reads that are probably cached in RAM by your OS might be
too. Querying a database usually isn't. It depends on your exact situation,
though.

It sounds like you already have a good idea of the sorts of things to watch
out for. Basically, there's no magic substitute for measuring actual
performance, and asking yourself “is it good enough?”

> (reading the twisted docs is like reading a brick wall for me, it would be
> nice if someone could just explain things to me in simple terms.)

It sounds to me like you've actually understood things quite well. :)

-Andrew.

_______________________________________________
Twisted-web mailing list
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-web
