On Jun 12, 2014, at 7:37 AM, Jonathan Slenders <[email protected]> wrote:
> I'm very interested if anyone has faced the same problem.

For what it's worth, we faced the same performance issue with twisted.web2, which is one of the reasons that twisted.web2 eventually got canned and we went back to incrementally maintaining twisted.web. There was an abstraction called "streams" which created a new Deferred for every read operation, and it was just painfully slow because of all the allocation and garbage collecting of billions of little Deferred objects.

Really what you want is an API like fetch_into(collector) where "collector" is an object with row_received and query_complete methods. Then you don't need a Future for every single row (or batch of rows); you just get a method called for each row. Of course this makes it somewhat difficult to write a nice syntactic for-loop in a coroutine over the result set, but it is an open question how to resolve that :-). This is somewhat similar to the transport/protocol separation, just at the application layer.

There's an ongoing branch (although it might be more accurate to call it a "research project") in Twisted as to how to fix this in a more general way than creating a new interface for every new form of variable-length data, and if it ever works out, I'll be sure to share the technique with the asyncio community.

-glyph
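
A rough sketch of the fetch_into(collector) shape described above, for illustration only. The names fetch_into, row_received and query_complete come from the message; everything else (the ListCollector class, its "finished" attribute, and the in-memory "rows" argument standing in for a real database driver) is a hypothetical assumption, not an existing Twisted or asyncio API. The point it tries to show is that the collector gets a plain method call per row and only one Future is allocated for the whole query.

```python
import asyncio


class ListCollector:
    """Hypothetical collector: one method call per row, one Future per query."""

    def __init__(self):
        self.rows = []
        # A single Future for the whole result set, not one per row.
        # Must be constructed while an event loop is running.
        self.finished = asyncio.get_running_loop().create_future()

    def row_received(self, row):
        # Plain synchronous call; nothing is allocated beyond the row itself.
        self.rows.append(row)

    def query_complete(self):
        # Resolve the single Future once the driver has no more rows.
        self.finished.set_result(self.rows)


def fetch_into(collector, rows):
    """Hypothetical stand-in for a driver; 'rows' is assumed in-memory data."""
    for row in rows:
        collector.row_received(row)
    collector.query_complete()


async def main():
    collector = ListCollector()
    fetch_into(collector, [(1, "a"), (2, "b")])
    # The coroutine waits once, on completion, rather than once per row.
    return await collector.finished


result = asyncio.run(main())
```

In this sketch the coroutine only sees the rows after the query completes, which is exactly the for-loop awkwardness mentioned above; a streaming consumer would have to do its work inside row_received.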
