Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
Doug Hellmann wrote: On Nov 24, 2014, at 12:57 PM, Mike Bayer wrote: On Nov 24, 2014, at 12:40 PM, Doug Hellmann wrote: This is a good point. I’m not sure we can say “we’ll only use explicit/implicit async in certain cases" because most of our apps actually mix the cases. We have WSGI apps that send RPC messages and we have other apps that receive RPC messages and operate on the database. Can we mix explicit and implicit operating models, or are we going to have to pick one way? If we have to pick one, the implicit model we’re currently using seems more compatible with all of the various libraries and services we depend on, but maybe I’m wrong? IMHO, in the ideal case, a single method shouldn’t be mixing calls to a set of database objects as well as calls to RPC APIs at the same time, there should be some kind of method boundary to cross. There’s a lot of ways to achieve that. The database calls are inside the method invoked through RPC. System 1 sends an RPC message (call or cast) to system 2 which receives that message and then does something with the database. Frequently “system 1” is an API layer service (mixing WSGI and RPC) and "system 2” is something like the conductor (mixing RPC and DB access). What is really needed is some way that code can switch between explicit yields and implicit IO on a per-function basis. Like a decorator for one or the other. The approach that Twisted takes of just using thread pools for those IO-bound elements that aren’t compatible with explicit yields is one way to do this. This might be the best way to go, if there are in fact issues with mixing in implicit async systems like eventlet. I can imagine, vaguely, that the eventlet approach of monkey patching might get in the way of things in this more complicated setup. Part of what makes this confusing for me is that there’s a lack of clarity over what benefits we’re trying to get from the async work. 
If the idea is, the GIL is evil so we need to ban the use of all threads, and therefore must use defer for all IO, then that includes database IO which means we theoretically benefit from eventlet monkeypatching - in the absence of truly async DBAPIs, this is the only way to have deferrable database IO. If the idea instead is, the code we write that deals with messaging would be easier to produce, organize, and understand given an asyncio style approach, but otherwise we aren’t terribly concerned what highly sequential code like database code has to do, then a thread pool may be fine. A lot of the motivation behind the explicit async changes started as a way to drop our dependency on eventlet because we saw it as blocking our move to Python 3. It is also true that a lot of people don’t like that eventlet monkeypatches system libraries, frequently inconsistently or incorrectly. Apparently the state of python 3 support for eventlet is a little better than it was when we started talking about this a few years ago, but the monkeypatching is somewhat broken. lifeless suggested trying to fix the monkeypatching, which makes sense. At the summit I think we agreed to continue down the path of supporting both approaches. The issues you’ve raised with using ORMs (or indeed, any IO-based libraries that don’t support explicit async) make me think we should reconsider that discussion with the additional information that didn’t come up in the summit conversation. I think victor is proposing fixes here recently, https://lists.secondlife.com/pipermail/eventletdev/2014-November/001195.html So that seems to be ongoing to fix up that support (the eventlet community is smaller and takes more time to accept pull requests and such, from what I've seen, but this is just how it works). 
Doug ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
On Nov 24, 2014, at 12:57 PM, Mike Bayer wrote: > >> On Nov 24, 2014, at 12:40 PM, Doug Hellmann wrote: >> >> >> This is a good point. I’m not sure we can say “we’ll only use >> explicit/implicit async in certain cases" because most of our apps actually >> mix the cases. We have WSGI apps that send RPC messages and we have other >> apps that receive RPC messages and operate on the database. Can we mix >> explicit and implicit operating models, or are we going to have to pick one >> way? If we have to pick one, the implicit model we’re currently using seems >> more compatible with all of the various libraries and services we depend on, >> but maybe I’m wrong? > > IMHO, in the ideal case, a single method shouldn’t be mixing calls to a set > of database objects as well as calls to RPC APIs at the same time, there > should be some kind of method boundary to cross. There’s a lot of ways to > achieve that. The database calls are inside the method invoked through RPC. System 1 sends an RPC message (call or cast) to system 2 which receives that message and then does something with the database. Frequently “system 1” is an API layer service (mixing WSGI and RPC) and "system 2” is something like the conductor (mixing RPC and DB access). > > What is really needed is some way that code can switch between explicit > yields and implicit IO on a per-function basis. Like a decorator for one or > the other. > > The approach that Twisted takes of just using thread pools for those IO-bound > elements that aren’t compatible with explicit yields is one way to do this. > This might be the best way to go, if there are in fact issues with mixing > in implicit async systems like eventlet. I can imagine, vaguely, that the > eventlet approach of monkey patching might get in the way of things in this > more complicated setup. > > Part of what makes this confusing for me is that there’s a lack of clarity > over what benefits we’re trying to get from the async work. 
If the idea is, > the GIL is evil so we need to ban the use of all threads, and therefore must > use defer for all IO, then that includes database IO which means we > theoretically benefit from eventlet monkeypatching - in the absence of truly > async DBAPIs, this is the only way to have deferrable database IO. > > If the idea instead is, the code we write that deals with messaging would be > easier to produce, organize, and understand given an asyncio style approach, > but otherwise we aren’t terribly concerned what highly sequential code like > database code has to do, then a thread pool may be fine. A lot of the motivation behind the explicit async changes started as a way to drop our dependency on eventlet because we saw it as blocking our move to Python 3. It is also true that a lot of people don’t like that eventlet monkeypatches system libraries, frequently inconsistently or incorrectly. Apparently the state of python 3 support for eventlet is a little better than it was when we started talking about this a few years ago, but the monkeypatching is somewhat broken. lifeless suggested trying to fix the monkeypatching, which makes sense. At the summit I think we agreed to continue down the path of supporting both approaches. The issues you’ve raised with using ORMs (or indeed, any IO-based libraries that don’t support explicit async) make me think we should reconsider that discussion with the additional information that didn’t come up in the summit conversation. Doug > > > > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
> On Nov 24, 2014, at 12:40 PM, Doug Hellmann wrote: > > > This is a good point. I’m not sure we can say “we’ll only use > explicit/implicit async in certain cases" because most of our apps actually > mix the cases. We have WSGI apps that send RPC messages and we have other > apps that receive RPC messages and operate on the database. Can we mix > explicit and implicit operating models, or are we going to have to pick one > way? If we have to pick one, the implicit model we’re currently using seems > more compatible with all of the various libraries and services we depend on, > but maybe I’m wrong? IMHO, in the ideal case, a single method shouldn’t be mixing calls to a set of database objects as well as calls to RPC APIs at the same time, there should be some kind of method boundary to cross. There’s a lot of ways to achieve that. What is really needed is some way that code can switch between explicit yields and implicit IO on a per-function basis. Like a decorator for one or the other. The approach that Twisted takes of just using thread pools for those IO-bound elements that aren’t compatible with explicit yields is one way to do this. This might be the best way to go, if there are in fact issues with mixing in implicit async systems like eventlet. I can imagine, vaguely, that the eventlet approach of monkey patching might get in the way of things in this more complicated setup. Part of what makes this confusing for me is that there’s a lack of clarity over what benefits we’re trying to get from the async work. If the idea is, the GIL is evil so we need to ban the use of all threads, and therefore must use defer for all IO, then that includes database IO which means we theoretically benefit from eventlet monkeypatching - in the absence of truly async DBAPIs, this is the only way to have deferrable database IO. 
If the idea instead is, the code we write that deals with messaging would be easier to produce, organize, and understand given an asyncio style approach, but otherwise we aren’t terribly concerned what highly sequential code like database code has to do, then a thread pool may be fine.
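To make the thread pool option concrete, here is a minimal sketch, not anything taken from Oslo or Nova. It assumes modern `async`/`await` syntax rather than the `yield from` of the time, uses `sqlite3` as a stand-in for a blocking DBAPI, and the names `count_instances` and `handle_message` are made up for illustration. The database function stays ordinary sequential code; only the message handler has an explicit switch point.

```python
import asyncio
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def count_instances(db_path):
    """Plain sequential DBAPI code: no yields, no knowledge of the event loop."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS instances (name TEXT)")
        conn.execute("INSERT INTO instances VALUES ('vm-1')")
        conn.commit()
        return conn.execute("SELECT COUNT(*) FROM instances").fetchone()[0]
    finally:
        conn.close()


async def handle_message(pool, db_path):
    """Explicit-async handler (hypothetical) that hands DB work to a thread pool."""
    loop = asyncio.get_running_loop()
    # The only explicit switch point; the DB function itself stays synchronous.
    return await loop.run_in_executor(pool, count_instances, db_path)


async def main(db_path=":memory:"):
    with ThreadPoolExecutor(max_workers=4) as pool:
        return await handle_message(pool, db_path)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The trade-off is the one raised above: threads come back into the picture, but the database code does not need to be rewritten around an async driver.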
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
On Nov 24, 2014, at 11:30 AM, Jay Pipes wrote: > On 11/24/2014 10:43 AM, Mike Bayer wrote: >>> On Nov 24, 2014, at 9:23 AM, Adam Young wrote: >>> For pieces such as the Nova compute that talk almost exclusively on >>> the Queue, we should work to remove Monkey patching and use a clear >>> programming model. If we can do that within the context of >>> Eventlet, great. If we need to replace Eventlet with a different >>> model, it will be painful, but should be done. What is most >>> important is that we avoid doing hacks like we've had to do with >>> calls to Memcached and monkeypatching threading. >> >> Nova compute does a lot of relational database access and I’ve yet to >> see an explicit-async-compatible DBAPI other than psycopg2’s and >> Twisted abdbapi. Twisted adbapi appears just to throw regular >> DBAPIs into a thread pool in any case (see >> http://twistedmatrix.com/trac/browser/trunk/twisted/enterprise/adbapi.py), >> so given that awkwardness and lack of real async, if eventlet is >> dropped it would be best to use a thread pool for database-related >> methods directly. > > Hi Mike, > > Note that nova-compute does not do any direct database queries. All database > reads and writes actually occur over RPC APIs, via the conductor, either > directly over the conductor RPC API or indirectly via nova.objects. > > For the nova-api and nova-conductor services, however, yes, there is > direct-to-database communication that occurs, though the goal is to have only > the nova-conductor service eventually be the only service that directly > communicates with the database. This is a good point. I’m not sure we can say “we’ll only use explicit/implicit async in certain cases" because most of our apps actually mix the cases. We have WSGI apps that send RPC messages and we have other apps that receive RPC messages and operate on the database. Can we mix explicit and implicit operating models, or are we going to have to pick one way? 
If we have to pick one, the implicit model we’re currently using seems more compatible with all of the various libraries and services we depend on, but maybe I’m wrong?

Doug
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
On 11/24/2014 10:43 AM, Mike Bayer wrote:
>> On Nov 24, 2014, at 9:23 AM, Adam Young wrote:
>> For pieces such as the Nova compute that talk almost exclusively on
>> the Queue, we should work to remove Monkey patching and use a clear
>> programming model. If we can do that within the context of
>> Eventlet, great. If we need to replace Eventlet with a different
>> model, it will be painful, but should be done. What is most
>> important is that we avoid doing hacks like we've had to do with
>> calls to Memcached and monkeypatching threading.
>
> Nova compute does a lot of relational database access and I’ve yet to
> see an explicit-async-compatible DBAPI other than psycopg2’s and
> Twisted adbapi. Twisted adbapi appears just to throw regular
> DBAPIs into a thread pool in any case (see
> http://twistedmatrix.com/trac/browser/trunk/twisted/enterprise/adbapi.py),
> so given that awkwardness and lack of real async, if eventlet is
> dropped it would be best to use a thread pool for database-related
> methods directly.

Hi Mike,

Note that nova-compute does not do any direct database queries. All database reads and writes actually occur over RPC APIs, via the conductor, either directly over the conductor RPC API or indirectly via nova.objects.

For the nova-api and nova-conductor services, however, yes, there is direct-to-database communication, though the goal is for nova-conductor eventually to be the only service that communicates directly with the database.

Best,
-jay
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
> On Nov 24, 2014, at 9:23 AM, Adam Young wrote:
>
> For pieces such as the Nova compute that talk almost exclusively on the
> Queue, we should work to remove Monkey patching and use a clear programming
> model. If we can do that within the context of Eventlet, great. If we need
> to replace Eventlet with a different model, it will be painful, but should be
> done. What is most important is that we avoid doing hacks like we've had to
> do with calls to Memcached and monkeypatching threading.

Nova compute does a lot of relational database access and I’ve yet to see an explicit-async-compatible DBAPI other than psycopg2’s and Twisted adbapi. Twisted adbapi appears just to throw regular DBAPIs into a thread pool in any case (see http://twistedmatrix.com/trac/browser/trunk/twisted/enterprise/adbapi.py), so given that awkwardness and lack of real async, if eventlet is dropped it would be best to use a thread pool for database-related methods directly.
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
> On Nov 23, 2014, at 9:24 PM, Donald Stufft wrote: > > > There’s a long history of implicit context switches causing buggy software > that breaks. As far as I can tell the only downsides to explicit context > switches that don’t stem from an inferior interpreter seem to be “some > particular API in my head isn’t as easy with it” and “I have to type more > letters”. The first one I’d just say that constraints make the system and > that there are lots of APIs which aren’t really possible or easy in Python > because of one design decision or another. For the second one I’d say that > Python isn’t a language which attempts to make code shorter, just easier to > understand what is going to happen when. > > Throwing out hyperboles like “mathematically proven” isn’t a particular > valuable statement. It is *easier* to reason about what’s going to happen > with explicit context switches. Maybe you’re a better programmer than I am > and you’re able to keep in your head every place that might do an implicit > context switch in an implicit setup and you can look at a function and go “ah > yup, things are going to switch here and here”. I certainly can’t. I like my > software to maximize the ability to locally reason about a particular chunk > of code. But this is a false choice. There is a third way. It is, use explicit async for those parts of an application where it is appropriate; when dealing with message queues and things where jobs and messages are sent off for any amount of time to come back at some indeterminate point later, all of us would absolutely benefit from an explicit model w/ coroutines. If I was trying to write code that had to send off messages and then had to wait, but still has many more messages to send off, so that without async I’d need to be writing thread pools and all that, absolutely, async is a great programming model. 
But when the code digs into functions that are oriented around business logic, functions that within themselves are doing nothing concurrency-wise against anything else within them, and merely need to run step 1, 2, and 3, that don’t deal with messaging and instead talk to a single relational database connection, where explicit async would mean that a single business logic method would need to be exploded with literally many dozens of yields in it (with a real async DBAPI; every connection, every execute, every cursor close, every transaction start, every transaction end, etc.), it is completely cumbersome and unnecessary. These methods should run in an implicit async context.

To that degree, the resistance that explicit async advocates have to the concept that both approaches should be switchable, and that one may be more appropriate than the other in different cases, remains confusing to me. We from the threading camp are asked to accept that *all* of our programming models must change completely, but our suggestion that both models be integrated is met with, “well that’s wrong, because in my experience (doing this specific kind of programming), your model *never* works”.
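For a rough picture of the "many dozens of yields" concern, here is a small sketch under loud assumptions. `FakeAsyncConnection` is a hypothetical stand-in that only records calls; it is not a real async DBAPI, and `rename_instance` is an invented business method. It is written with modern `async`/`await`, where each `await` plays the role of an explicit yield.

```python
import asyncio


class FakeAsyncConnection:
    """Hypothetical async DBAPI connection; it only records what was called."""

    def __init__(self):
        self.log = []

    async def _io(self, event):
        self.log.append(event)
        await asyncio.sleep(0)  # stand-in for a network round trip

    async def begin(self):
        await self._io("begin")

    async def execute(self, sql, params=()):
        await self._io(sql)

    async def commit(self):
        await self._io("commit")

    async def close(self):
        await self._io("close")


async def rename_instance(conn, instance_id, new_name):
    """Sequential steps of business logic, each one an explicit switch point."""
    await conn.begin()
    await conn.execute("SELECT name FROM instances WHERE id = ?", (instance_id,))
    await conn.execute(
        "UPDATE instances SET name = ? WHERE id = ?", (new_name, instance_id)
    )
    await conn.commit()
    await conn.close()
    return conn.log


if __name__ == "__main__":
    print(asyncio.run(rename_instance(FakeAsyncConnection(), 1, "vm-2")))
```

Even this trivial method has five switch points although nothing in it is concurrent with anything else; whether that counts as useful explicitness or as noise is exactly what the thread disagrees about.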
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
On 11/23/2014 06:13 PM, Robert Collins wrote:
> On WSGI - if we're in an asyncio world, I don't think WSGI has any
> relevance today - it has no async programming model. While it has
> incremental APIs and supports generators, that's not close enough to the
> same thing: so we're going to have to port our glue code to whatever
> container we end up with. As you know I'm pushing on a revamp of WSGI
> right now, and I'd be delighted to help put together a WSGI-for-asyncio
> PEP, but I think it's best thought of as a separate thing to WSGI per se.
> It might be a profile of WSGI2 though, since there is quite some
> interest in truly async models.
>
> However I've a bigger picture concern. OpenStack only relatively
> recently switched away from an explicit async model (Twisted) to
> eventlet. I'm worried that this is switching back to something we
> switched away from (in that Twisted and asyncio have much more in
> common than either Twisted and eventlet w/magic, or asyncio and
> eventlet w/magic).

We don't need to use this for WSGI applications. We need to use this for the non-API, message-driven portions. WSGI applications should not be accepting events/messages. They already have a messaging model with HTTP, and we should use that and only that.

We need to get the Web-based services off Eventlet and into Web servers where we can make use of native code for security reasons.

Referencing the fine, if somewhat overused, model from Ken Pepple:

http://cdn2.hubspot.net/hub/344789/file-448028030-jpg/images/openstack-arch-grizzly-logical-v2.jpg?t=1414604346389

Only the Nova and Quantum (now Neutron, yes it is dated) API servers show an arrow coming out of the message queue. Those arrows should be broken. If we need to write a micro-service as a listener that receives an event off the queue and makes an HTTP call to an API server, let us do that.

For pieces such as the Nova compute that talk almost exclusively on the Queue, we should work to remove Monkey patching and use a clear programming model. If we can do that within the context of Eventlet, great. If we need to replace Eventlet with a different model, it will be painful, but should be done. What is most important is that we avoid doing hacks like we've had to do with calls to Memcached and monkeypatching threading.

Having a clear programming model around Messaging calls that scales should not compromise system integrity, it should complement it.
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
> On Nov 23, 2014, at 9:09 PM, Mike Bayer wrote: > > >> On Nov 23, 2014, at 8:23 PM, Donald Stufft wrote: >> >> I don’t really take performance issues that seriously for CPython. If you >> care about performance you should be using PyPy. I like that argument though >> because the same argument is used against the GCs which you like to use as >> an example too. >> >> The verbosity isn’t really pointless, you have to be verbose in either >> situation, either explicit locks or explicit context switches. If you don’t >> have explicit locks you just have buggy software instead. > > Funny thing is that relational databases will lock on things whether or not > the calling code is using an async system. Locks are a necessary thing in > many cases. That lock-based concurrency code can’t be mathematically proven > bug free doesn’t detract from its vast usefulness in situations that are not > aeronautics or medical devices. Sure, databases will do it regardless so they aren’t a very useful topic of discussion here since their operation is external to the system being developed and they will operate the same regardless. There’s a long history of implicit context switches causing buggy software that breaks. As far as I can tell the only downsides to explicit context switches that don’t stem from an inferior interpreter seem to be “some particular API in my head isn’t as easy with it” and “I have to type more letters”. The first one I’d just say that constraints make the system and that there are lots of APIs which aren’t really possible or easy in Python because of one design decision or another. For the second one I’d say that Python isn’t a language which attempts to make code shorter, just easier to understand what is going to happen when. Throwing out hyperboles like “mathematically proven” isn’t a particular valuable statement. It is *easier* to reason about what’s going to happen with explicit context switches. 
Maybe you’re a better programmer than I am and you’re able to keep in your head every place that might do an implicit context switch in an implicit setup and you can look at a function and go “ah yup, things are going to switch here and here”. I certainly can’t. I like my software to maximize the ability to locally reason about a particular chunk of code.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
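As an illustration of what "locally reason" can mean here, a small sketch with invented names (`racy_increment`, `atomic_increment`), written with modern `async`/`await`. With explicit switch points, a read-modify-write is only at risk when an `await` sits between the read and the write, and that `await` is visible in the function body.

```python
import asyncio


async def racy_increment(state):
    current = state["count"]
    await asyncio.sleep(0)  # visible switch point between the read and the write
    state["count"] = current + 1


async def atomic_increment(state):
    # No await between read and write, so no other task can run in between.
    state["count"] += 1


async def run(worker, n=100):
    state = {"count": 0}
    await asyncio.gather(*(worker(state) for _ in range(n)))
    return state["count"]


if __name__ == "__main__":
    print("racy:", asyncio.run(run(racy_increment)))
    print("atomic:", asyncio.run(run(atomic_increment)))
```

Under implicit switching the equivalent hazard can sit inside any call that happens to do IO, which is the case where explicit locks end up being added.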
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
> On Nov 23, 2014, at 8:23 PM, Donald Stufft wrote:
>
> I don’t really take performance issues that seriously for CPython. If you
> care about performance you should be using PyPy. I like that argument though
> because the same argument is used against the GCs which you like to use as an
> example too.
>
> The verbosity isn’t really pointless, you have to be verbose in either
> situation, either explicit locks or explicit context switches. If you don’t
> have explicit locks you just have buggy software instead.

Funny thing is that relational databases will lock on things whether or not the calling code is using an async system. Locks are a necessary thing in many cases. That lock-based concurrency code can’t be mathematically proven bug free doesn’t detract from its vast usefulness in situations that are not aeronautics or medical devices.
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
> On Nov 23, 2014, at 7:55 PM, Mike Bayer wrote: > >> >> On Nov 23, 2014, at 7:30 PM, Donald Stufft wrote: >> >> >>> On Nov 23, 2014, at 7:21 PM, Mike Bayer wrote: >>> >>> Given that, I’ve yet to understand why a system that implicitly defers CPU >>> use when a routine encounters IO, deferring to other routines, is relegated >>> to the realm of “magic”. Is Python reference counting and garbage >>> collection “magic”?How can I be sure that my program is only declaring >>> memory, only as much as I expect, and then freeing it only when I >>> absolutely say so, the way async advocates seem to be about IO? Why would >>> a high level scripting language enforce this level of low-level bookkeeping >>> of IO calls as explicit, when it is 100% predictable and automatable ? >> >> The difference is that in the many years of Python programming I’ve had to >> think about garbage collection all of once. I’ve yet to write a non trivial >> implicit IO application where the implicit context switch didn’t break >> something and I had to think about adding explicit locks around things. > > that’s your personal experience, how is that an argument? I deal with the > Python garbage collector, memory management, etc. *all the time*. I have a > whole test suite dedicated to ensuring that SQLAlchemy constructs tear > themselves down appropriately in the face of gc and such: > https://github.com/zzzeek/sqlalchemy/blob/master/test/aaa_profiling/test_memusage.py > . This is the product of tons of different observed and reported issues > about this operation or that operation forming constructs that would take up > too much memory, wouldn’t be garbage collected when expected, etc. > > Yet somehow I still value very much the work that implicit GC does for me and > I understand well when it is going to happen. I don’t decide that that whole > world should be forced to never have GC again. 
I’m sure you wouldn’t be > happy if I got Guido to drop garbage collection from Python because I showed > how sometimes it makes my life more difficult, therefore we should all be > managing memory explicitly. Eh, Maybe you need to do that, that’s fine I suppose. Though the option isn’t between something with a very clear failure condition and something with a “weird things start happening” error condition. It’s between “weird things start happening” and “weird things start happening, just they are less likely to happen less”. Implicit context switches introduce a new harder to debug failure mode over blocking code that explicit context switches do not. > > I’m sure my agenda here is pretty transparent. If explicit async becomes the > only way to go, SQLAlchemy basically closes down. I’d have to rewrite it > completely (after waiting for all the DBAPIs that don’t exist to be written, > why doesn’t anyone ever seem to be concerned about that?) , and it would run > much less efficiently due to the massive amount of additional function call > overhead incurred by the explicit coroutines. It’s a pointless amount of > verbosity within a scripting language. I don’t really take performance issues that seriously for CPython. If you care about performance you should be using PyPy. I like that argument though because the same argument is used against the GCs which you like to use as an example too. The verbosity isn’t really pointless, you have to be verbose in either situation, either explicit locks or explicit context switches. If you don’t have explicit locks you just have buggy software instead. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
> On Nov 23, 2014, at 7:30 PM, Donald Stufft wrote: > > >> On Nov 23, 2014, at 7:21 PM, Mike Bayer wrote: >> >> Given that, I’ve yet to understand why a system that implicitly defers CPU >> use when a routine encounters IO, deferring to other routines, is relegated >> to the realm of “magic”. Is Python reference counting and garbage >> collection “magic”?How can I be sure that my program is only declaring >> memory, only as much as I expect, and then freeing it only when I absolutely >> say so, the way async advocates seem to be about IO? Why would a high >> level scripting language enforce this level of low-level bookkeeping of IO >> calls as explicit, when it is 100% predictable and automatable ? > > The difference is that in the many years of Python programming I’ve had to > think about garbage collection all of once. I’ve yet to write a non trivial > implicit IO application where the implicit context switch didn’t break > something and I had to think about adding explicit locks around things. that’s your personal experience, how is that an argument? I deal with the Python garbage collector, memory management, etc. *all the time*. I have a whole test suite dedicated to ensuring that SQLAlchemy constructs tear themselves down appropriately in the face of gc and such: https://github.com/zzzeek/sqlalchemy/blob/master/test/aaa_profiling/test_memusage.py . This is the product of tons of different observed and reported issues about this operation or that operation forming constructs that would take up too much memory, wouldn’t be garbage collected when expected, etc. Yet somehow I still value very much the work that implicit GC does for me and I understand well when it is going to happen. I don’t decide that that whole world should be forced to never have GC again. I’m sure you wouldn’t be happy if I got Guido to drop garbage collection from Python because I showed how sometimes it makes my life more difficult, therefore we should all be managing memory explicitly. 
I’m sure my agenda here is pretty transparent. If explicit async becomes the only way to go, SQLAlchemy basically closes down. I’d have to rewrite it completely (after waiting for all the DBAPIs that don’t exist to be written, why doesn’t anyone ever seem to be concerned about that?) , and it would run much less efficiently due to the massive amount of additional function call overhead incurred by the explicit coroutines. It’s a pointless amount of verbosity within a scripting language. > > Really that’s what it comes down to. Either you need to enable explicit > context switches (via callbacks or yielding, or whatever) or you need to add > explicit locks. Neither solution allows you to pretend that context switching > isn’t going to happen nor prevents you from having to deal with it. The > reason I prefer explicit async is because the failure mode is better (if I > forget to yield I don’t get the actual value so my thing blows up in > development) and it ironically works more like blocking programming because I > won’t get an implicit context switch in the middle of a function. Compare > that to the implicit async where the failure mode is that at runtime > something weird happens. > --- > Donald Stufft > PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA > > > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
> On Nov 23, 2014, at 7:29 PM, Mike Bayer wrote:
>
>> Glyph wrote a good post that mirrors my opinions on implicit vs explicit
>> here: https://glyph.twistedmatrix.com/2014/02/unyielding.html.
>
> this is the post that most makes me think about the garbage collector
> analogy, re: “gevent works perfectly fine, but sorry, it just isn’t
> “correct”. It should be feared!”. Unfortunately Glyph has orders of
> magnitude more intellectual capabilities than I do, so I am ultimately not an
> effective advocate for my position; hence I have my fallback career as a
> cheese maker lined up for when the async agenda finally takes over all
> computer programming.

Like I said, I’ve had to think about garbage collection all of once in my entire Python career. Implicit might be theoretically nicer but until it can actually live up to the “gets out of my way-ness” of the abstractions you’re citing I’d personally much rather pass on it.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
> On Nov 23, 2014, at 7:21 PM, Mike Bayer wrote:
>
> Given that, I’ve yet to understand why a system that implicitly defers CPU
> use when a routine encounters IO, deferring to other routines, is relegated
> to the realm of “magic”. Is Python reference counting and garbage
> collection “magic”? How can I be sure that my program is only declaring
> memory, only as much as I expect, and then freeing it only when I absolutely
> say so, the way async advocates seem to be about IO? Why would a high level
> scripting language enforce this level of low-level bookkeeping of IO calls as
> explicit, when it is 100% predictable and automatable?

The difference is that in the many years of Python programming I’ve had to think about garbage collection all of once. I’ve yet to write a non trivial implicit IO application where the implicit context switch didn’t break something and I had to think about adding explicit locks around things.

Really that’s what it comes down to. Either you need to enable explicit context switches (via callbacks or yielding, or whatever) or you need to add explicit locks. Neither solution allows you to pretend that context switching isn’t going to happen nor prevents you from having to deal with it. The reason I prefer explicit async is because the failure mode is better (if I forget to yield I don’t get the actual value so my thing blows up in development) and it ironically works more like blocking programming because I won’t get an implicit context switch in the middle of a function. Compare that to the implicit async where the failure mode is that at runtime something weird happens.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
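The two claims in the message above (forgetting to yield fails visibly, and a context switch can only happen where the code says so) can be illustrated with a small sketch. It assumes modern asyncio with `async`/`await`, which postdates this thread; trollius on Python 2 would spell the switch point as `yield From(...)`. The function names are hypothetical and are not taken from OpenStack or any library.

```python
import asyncio


async def fetch_value():
    # Stand-in for an IO-bound call; the await is an explicit switch point.
    await asyncio.sleep(0)
    return 42


async def forgot_to_await():
    # Bug on purpose: without "await" we hold a coroutine object, not 42,
    # so any later use of the "value" fails early and visibly.
    result = fetch_value()
    is_coroutine = asyncio.iscoroutine(result)
    result.close()  # silence the "never awaited" warning in this demo
    return is_coroutine


async def remembered_to_await():
    return await fetch_value()


async def lost_update_demo():
    # Hypothetical shared state touched by two concurrent tasks.
    state = {"counter": 0}

    async def increment_with_switch():
        current = state["counter"]
        await asyncio.sleep(0)  # the only place another task can run
        state["counter"] = current + 1

    await asyncio.gather(increment_with_switch(), increment_with_switch())
    return state["counter"]


def run_demo():
    return (
        asyncio.run(forgot_to_await()),
        asyncio.run(remembered_to_await()),
        asyncio.run(lost_update_demo()),
    )
```

In `lost_update_demo` both tasks read the counter before either writes it, so one update is lost. The race is still there, but the only line where it can occur is the one marked with `await`; with implicit switching the same race could hide behind any call that happens to do IO.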
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
> On Nov 23, 2014, at 6:35 PM, Donald Stufft wrote:
>
> For whatever it’s worth, I find explicit async io to be _way_ easier to
> understand for the same reason I find threaded code to be a rats nest.

web applications aren’t explicitly “threaded”. You get a request, load some data, manipulate it, and return a response. There are no threads to reason about, nothing is explicitly shared in any way.

> The co-routine style of asyncio (or Twisted’s inlineCallbacks) solves
> almost all of the problems that I think most people have with explicit
> asyncio (namely the callback hell) while still getting the benefits.

coroutines are still “inside out” and still have all the issues discussed in http://python-notes.curiousefficiency.org/en/latest/pep_ideas/async_programming.html which I also refer to in http://stackoverflow.com/questions/16491564/how-to-make-sqlalchemy-in-tornado-to-be-async/16503103#16503103.

> Glyph wrote a good post that mirrors my opinions on implicit vs explicit
> here: https://glyph.twistedmatrix.com/2014/02/unyielding.html.

this is the post that most makes me think about the garbage collector analogy, re: “gevent works perfectly fine, but sorry, it just isn’t “correct”. It should be feared!”. Unfortunately Glyph has orders of magnitude more intellectual capabilities than I do, so I am ultimately not an effective advocate for my position; hence I have my fallback career as a cheese maker lined up for when the async agenda finally takes over all computer programming.
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
> On Nov 23, 2014, at 6:13 PM, Robert Collins wrote:
>
> So - the technical bits of the plan sound fine.
>
> On WSGI - if we're in an asyncio world,

*looks around*, we are? when did that happen? Assuming we’re talking explicit async. Rewriting all our code as verbose, “inside out” code, vast library incompatibility, and…some notion of “correctness” that somehow is supposed to be appropriate for a high level scripting language and can’t be achieved through simple, automated means such as gevent.

> I don't think WSGI has any
> relevance today -

if you want async + wsgi, use gevent.wsgi. It is of course not explicit async but if the whole world decides that we all have to explicitly turn all of our code inside out to appease the concept of “oh no, IO IS ABOUT TO HAPPEN! ARE WE READY!”, I am definitely quitting programming to become a cheese maker. If you’re writing some high performance TCP server thing, fine (…but... why are you writing a high performance server in Python and not something more appropriate like Go?). If we’re dealing with message queues as I know this thread is about, fine.

But if you’re writing “receive a request, load some data, change some of it around, store it again, and return a result”, I don’t see why this has to be intentionally complicated. Use implicit async that can interact with the explicit async messaging stuff appropriately. That’s purportedly one of the goals of asyncio (which Nick Coghlan had to lobby pretty hard for; source: http://python-notes.curiousefficiency.org/en/latest/pep_ideas/async_programming.html#gevent-and-pep-3156 ).

> it has no async programming model.

neither do a *lot* of things, including all traditional ORMs. I’m fine with Ceilometer dropping SQLAlchemy support as they prefer MongoDB and their relational database code is fairly wanting. Per http://aiogreen.readthedocs.org/openstack.html, I’m not sure how else they will drop eventlet support throughout the entire app.
> While it has
> incremental apis and supports generators, that's not close enough to
> the same thing: so we're going to have to port our glue code to
> whatever container we end up with. As you know I'm pushing on a revamp
> of WSGI right now, and I'd be delighted to help put together a
> WSGI-for-asyncio PEP, but I think it's best thought of as a separate
> thing to WSGI per se.

given the push for explicit async, seems like lots of effort will need to be spent on this.

> It might be a profile of WSGI2 though, since
> there is quite some interest in truly async models.
>
> However I've a bigger picture concern. OpenStack only relatively
> recently switched away from an explicit async model (Twisted) to
> eventlet.

hooray. efficient database access for explicit async code would be impossible otherwise as there are no explicit async APIs to MySQL, and only one for Postgresql which is extremely difficult to support.

> I'm worried that this is switching back to something we switched away
> from (in that Twisted and asyncio have much more in common than either
> Twisted and eventlet w/magic, or asyncio and eventlet w/magic).

In the C programming world, when you want to do something as simple as create a list of records, it’s not so simple: you have to explicitly declare memory using malloc(), and organize your program skillfully and carefully such that this memory is ultimately freed using free(). It’s tedious and error prone. So in the scripting language world, these tedious, low level and entirely predictable steps are automated away for us; memory is declared automatically, and freed automatically. Even reference cycles are cleaned out for us without us even being aware. This is why we use “scripting languages” - they are intentionally automated to speed the pace of development and produce code that is far less verbose than low-level C code and much less prone to low-level errors, albeit considerably less efficient.
It’s the tradeoff we make; predictable bookkeeping of the system’s resources is automated away. There’s a price; the Python interpreter uses a ton of memory and tends to not free memory once large chunks of it have been used by the application. The implicit allocation and freeing of memory has a huge tradeoff, in that the Python interpreter uses lots of memory pretty quickly. However, this tradeoff, Python’s clearly inefficient use of memory because it’s automating the management of it away for us, is one which nobody seems to mind at all.

But when it comes to IO, the implicit allocation of IO and deferment of execution done by gevent has no side effect anywhere near as harmful as the Python interpreter’s huge memory consumption. Yet we are so afraid of it, so frightened that our code…written in a *high level scripting language*, might not be “correct”. We might not know that IO is about to happen! How is this different from the much more tangible and day-
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
On 24 November 2014 at 12:30, Monty Taylor wrote:
> I'm not going to comment on the pros and cons - I think we all know I'm
> a fan of threads. But I have been around a while, so - for those who
> haven't been:

FWIW we have *threads* today as a programming model. The implementation is green, but the concepts we work with in the code are threads, threadpools and so forth. eventlet is an optimisation around some [minor] inefficiencies in Python, but it doesn't change the programming model - see dstufft's excellent link for details on that.

I too will hold off from commenting on the pros and cons today; this isn't about good or bad, it's about making sure this revisiting of a huge discussion and effort gets the right visibility.

> The main 'winning' answer came down to twisted being very opaque for new
> devs - while it's very powerful for experienced devs, we decided to opt
> for eventlet which does not scare new devs with a completely different
> programming model. (reactors and deferreds and whatnot)
>
> Now, I wouldn't say we _just_ ported from Twisted, I think we finished
> that work about 4 years ago. :)

Nova managed it in Jan 2011, so 3.5 mumblemumble. Near enough to 'just' :)

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
On 24 November 2014 at 12:35, Donald Stufft wrote:
>
> For whatever it’s worth, I find explicit async io to be _way_ easier to
> understand for the same reason I find threaded code to be a rats nest.
>
> The co-routine style of asyncio (or Twisted’s inlineCallbacks) solves
> almost all of the problems that I think most people have with explicit
> asyncio (namely the callback hell) while still getting the benefits.

Sure. Note that OpenStack *was* using inlineCallbacks.

> Glyph wrote a good post that mirrors my opinions on implicit vs explicit
> here: https://glyph.twistedmatrix.com/2014/02/unyielding.html.

That is, we chose

"4. and finally, implicit coroutines: Java’s “green threads”, Twisted’s Corotwine, eventlet, gevent, where any function may switch the entire stack of the current thread of control by calling a function which suspends it."

- the option that Glyph (and I too) would say to never ever choose.

My concern isn't that asyncio is bad - it's not. It's that we spent an awful lot of time and effort rewriting nova etc to be 'option 4', and we've no reason to believe that whatever it was that made that not work /for us/ has been fixed.

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
> On Nov 23, 2014, at 6:30 PM, Monty Taylor wrote: > > On 11/23/2014 06:13 PM, Robert Collins wrote: >> On 24 November 2014 at 11:01, victor stinner >> wrote: >>> Hi, >>> >>> I'm happy to announce you that I just finished the last piece of the puzzle >>> to add support for trollius coroutines in Oslo Messaging! See my two >>> changes: >>> >>> * Add a new aiogreen executor: >>> https://review.openstack.org/#/c/136653/ >>> * Add an optional executor callback to dispatcher: >>> https://review.openstack.org/#/c/136652/ >>> >>> Related projects: >>> >>> * asyncio is an event loop which is now part of Python 3.4: >>> http://docs.python.org/dev/library/asyncio.html >>> * trollius is the port of the new asyncio module to Python 2: >>> http://trollius.readthedocs.org/ >>> * aiogreen implements the asyncio API on top of eventlet: >>> http://aiogreen.readthedocs.org/ >>> >>> For the long story and the full history of my work on asyncio in OpenStack >>> since one year, read: >>> http://aiogreen.readthedocs.org/openstack.html >>> >>> The last piece of the puzzle is the new aiogreen project that I released a >>> few days ago. aiogreen is well integrated and fully compatible with >>> eventlet, it can be used in OpenStack without having to modify code. It is >>> almost fully based on trollius, it just has a small glue to reuse eventlet >>> event loop (get read/write notifications of file descriptors). >>> >>> In the past, I tried to use the greenio project, which also implements the >>> asyncio API, but it didn't fit well with eventlet. That's why I wrote a new >>> project. >>> >>> Supporting trollius coroutines in Oslo Messaging is just the first part of >>> the global project. Here is my full plan to replace eventlet with asyncio. >> >> ... >> >> So - the technical bits of the plan sound fine. >> >> On WSGI - if we're in an asyncio world, I don't think WSGI has any >> relevance today - it has no async programming model. 
While is has >> incremental apis and supports generators, thats not close enough to >> the same thing: so we're going to have to port our glue code to >> whatever container we end up with. As you know I'm pushing on a revamp >> of WSGI right now, and I'd be delighted to help put together a >> WSGI-for-asyncio PEP, but I think its best thought of as a separate >> thing to WSGI per se. It might be a profile of WSGI2 though, since >> there is quite some interest in truely async models. >> >> However I've a bigger picture concern. OpenStack only relatively >> recently switched away from an explicit async model (Twisted) to >> eventlet. >> >> I'm worried that this is switching back to something we switched away >> from (in that Twisted and asyncio have much more in common than either >> Twisted and eventlet w/magic, or asyncio and eventlet w/magic). >> >> If Twisted was unacceptable to the community, what makes asyncio >> acceptable? [Note, I don't really understand why Twisted was moved >> away from, since our problem domain is such a great fit for reactor >> style programming - lots of networking, lots of calling of processes >> that may take some time to complete their work, and occasional DB >> calls [which are equally problematic in eventlet and in >> asyncio/Twisted]. So I'm not arguing against the move, I'm just >> concerned that doing it without addressing whatever the underlying >> thing was, will fail - and I'm also concerned that it will surprise >> folk - since there doesn't seem to be a cross project blueprint >> talking about this fairly fundamental shift in programming model. > > I'm not going to comment on the pros and cons - I think we all know I'm > a fan of threads. But I have been around a while, so - for those who > haven't been: > > When we started the project, nova used twisted and swift used eventlet. > As we've consistently endeavored to not have multiple frameworks, we > entered in to the project's first big flame war: > > "twisted vs. 
eventlet"
>
> It was _real_ fun, I promise. But at the heart was a question of whether
> we were going to rewrite swift in twisted or rewrite nova in eventlet.
>
> The main 'winning' answer came down to twisted being very opaque for new
> devs - while it's very powerful for experienced devs, we decided to opt
> for eventlet which does not scare new devs with a completely different
> programming model. (reactors and deferreds and whatnot)
>
> Now, I wouldn't say we _just_ ported from Twisted, I think we finished
> that work about 4 years ago. :)
>

For whatever it’s worth, I find explicit async io to be _way_ easier to understand for the same reason I find threaded code to be a rats nest.

The co-routine style of asyncio (or Twisted’s inlineCallbacks) solves almost all of the problems that I think most people have with explicit asyncio (namely the callback hell) while still getting the benefits.

Glyph wrote a good post that mirrors my opinions on implicit vs explicit here: https://glyph.twistedmatrix.com/2014/02/u
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
On 11/23/2014 06:13 PM, Robert Collins wrote: > On 24 November 2014 at 11:01, victor stinner > wrote: >> Hi, >> >> I'm happy to announce you that I just finished the last piece of the puzzle >> to add support for trollius coroutines in Oslo Messaging! See my two changes: >> >> * Add a new aiogreen executor: >> https://review.openstack.org/#/c/136653/ >> * Add an optional executor callback to dispatcher: >> https://review.openstack.org/#/c/136652/ >> >> Related projects: >> >> * asyncio is an event loop which is now part of Python 3.4: >> http://docs.python.org/dev/library/asyncio.html >> * trollius is the port of the new asyncio module to Python 2: >> http://trollius.readthedocs.org/ >> * aiogreen implements the asyncio API on top of eventlet: >> http://aiogreen.readthedocs.org/ >> >> For the long story and the full history of my work on asyncio in OpenStack >> since one year, read: >> http://aiogreen.readthedocs.org/openstack.html >> >> The last piece of the puzzle is the new aiogreen project that I released a >> few days ago. aiogreen is well integrated and fully compatible with >> eventlet, it can be used in OpenStack without having to modify code. It is >> almost fully based on trollius, it just has a small glue to reuse eventlet >> event loop (get read/write notifications of file descriptors). >> >> In the past, I tried to use the greenio project, which also implements the >> asyncio API, but it didn't fit well with eventlet. That's why I wrote a new >> project. >> >> Supporting trollius coroutines in Oslo Messaging is just the first part of >> the global project. Here is my full plan to replace eventlet with asyncio. > > ... > > So - the technical bits of the plan sound fine. > > On WSGI - if we're in an asyncio world, I don't think WSGI has any > relevance today - it has no async programming model. 
While is has > incremental apis and supports generators, thats not close enough to > the same thing: so we're going to have to port our glue code to > whatever container we end up with. As you know I'm pushing on a revamp > of WSGI right now, and I'd be delighted to help put together a > WSGI-for-asyncio PEP, but I think its best thought of as a separate > thing to WSGI per se. It might be a profile of WSGI2 though, since > there is quite some interest in truely async models. > > However I've a bigger picture concern. OpenStack only relatively > recently switched away from an explicit async model (Twisted) to > eventlet. > > I'm worried that this is switching back to something we switched away > from (in that Twisted and asyncio have much more in common than either > Twisted and eventlet w/magic, or asyncio and eventlet w/magic). > > If Twisted was unacceptable to the community, what makes asyncio > acceptable? [Note, I don't really understand why Twisted was moved > away from, since our problem domain is such a great fit for reactor > style programming - lots of networking, lots of calling of processes > that may take some time to complete their work, and occasional DB > calls [which are equally problematic in eventlet and in > asyncio/Twisted]. So I'm not arguing against the move, I'm just > concerned that doing it without addressing whatever the underlying > thing was, will fail - and I'm also concerned that it will surprise > folk - since there doesn't seem to be a cross project blueprint > talking about this fairly fundamental shift in programming model. I'm not going to comment on the pros and cons - I think we all know I'm a fan of threads. But I have been around a while, so - for those who haven't been: When we started the project, nova used twisted and swift used eventlet. As we've consistently endeavored to not have multiple frameworks, we entered in to the project's first big flame war: "twisted vs. eventlet" It was _real_ fun, I promise. 
But at the heart was a question of whether we were going to rewrite swift in twisted or rewrite nova in eventlet.

The main 'winning' answer came down to twisted being very opaque for new devs - while it's very powerful for experienced devs, we decided to opt for eventlet which does not scare new devs with a completely different programming model. (reactors and deferreds and whatnot)

Now, I wouldn't say we _just_ ported from Twisted, I think we finished that work about 4 years ago. :)

Monty
Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
On 24 November 2014 at 11:01, victor stinner wrote:
> Hi,
>
> I'm happy to announce you that I just finished the last piece of the puzzle
> to add support for trollius coroutines in Oslo Messaging! See my two changes:
>
> * Add a new aiogreen executor:
>   https://review.openstack.org/#/c/136653/
> * Add an optional executor callback to dispatcher:
>   https://review.openstack.org/#/c/136652/
>
> Related projects:
>
> * asyncio is an event loop which is now part of Python 3.4:
>   http://docs.python.org/dev/library/asyncio.html
> * trollius is the port of the new asyncio module to Python 2:
>   http://trollius.readthedocs.org/
> * aiogreen implements the asyncio API on top of eventlet:
>   http://aiogreen.readthedocs.org/
>
> For the long story and the full history of my work on asyncio in OpenStack
> since one year, read:
> http://aiogreen.readthedocs.org/openstack.html
>
> The last piece of the puzzle is the new aiogreen project that I released a
> few days ago. aiogreen is well integrated and fully compatible with eventlet,
> it can be used in OpenStack without having to modify code. It is almost fully
> based on trollius, it just has a small glue to reuse eventlet event loop (get
> read/write notifications of file descriptors).
>
> In the past, I tried to use the greenio project, which also implements the
> asyncio API, but it didn't fit well with eventlet. That's why I wrote a new
> project.
>
> Supporting trollius coroutines in Oslo Messaging is just the first part of
> the global project. Here is my full plan to replace eventlet with asyncio.

...

So - the technical bits of the plan sound fine.

On WSGI - if we're in an asyncio world, I don't think WSGI has any relevance today - it has no async programming model. While it has incremental apis and supports generators, that's not close enough to the same thing: so we're going to have to port our glue code to whatever container we end up with.
As you know I'm pushing on a revamp of WSGI right now, and I'd be delighted to help put together a WSGI-for-asyncio PEP, but I think it's best thought of as a separate thing to WSGI per se. It might be a profile of WSGI2 though, since there is quite some interest in truly async models.

However I've a bigger picture concern. OpenStack only relatively recently switched away from an explicit async model (Twisted) to eventlet.

I'm worried that this is switching back to something we switched away from (in that Twisted and asyncio have much more in common than either Twisted and eventlet w/magic, or asyncio and eventlet w/magic).

If Twisted was unacceptable to the community, what makes asyncio acceptable? [Note, I don't really understand why Twisted was moved away from, since our problem domain is such a great fit for reactor style programming - lots of networking, lots of calling of processes that may take some time to complete their work, and occasional DB calls [which are equally problematic in eventlet and in asyncio/Twisted]. So I'm not arguing against the move, I'm just concerned that doing it without addressing whatever the underlying thing was, will fail - and I'm also concerned that it will surprise folk - since there doesn't seem to be a cross project blueprint talking about this fairly fundamental shift in programming model.

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud
[openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging
Hi,

I'm happy to announce that I just finished the last piece of the puzzle to add support for trollius coroutines in Oslo Messaging! See my two changes:

* Add a new aiogreen executor:
  https://review.openstack.org/#/c/136653/
* Add an optional executor callback to dispatcher:
  https://review.openstack.org/#/c/136652/

Related projects:

* asyncio is an event loop which is now part of Python 3.4:
  http://docs.python.org/dev/library/asyncio.html
* trollius is the port of the new asyncio module to Python 2:
  http://trollius.readthedocs.org/
* aiogreen implements the asyncio API on top of eventlet:
  http://aiogreen.readthedocs.org/

For the long story and the full history of my work on asyncio in OpenStack over the past year, read:
http://aiogreen.readthedocs.org/openstack.html

The last piece of the puzzle is the new aiogreen project that I released a few days ago. aiogreen is well integrated and fully compatible with eventlet, so it can be used in OpenStack without having to modify code. It is almost fully based on trollius; it just has a small amount of glue to reuse the eventlet event loop (to get read/write notifications of file descriptors).

In the past, I tried to use the greenio project, which also implements the asyncio API, but it didn't fit well with eventlet. That's why I wrote a new project.

Supporting trollius coroutines in Oslo Messaging is just the first part of the global project. Here is my full plan to replace eventlet with asyncio.

First part (in progress): add support for trollius coroutines

Prepare OpenStack (Oslo Messaging) to support trollius coroutines using ``yield``: explicit asynchronous programming. Eventlet is still supported and used by default, and applications and libraries don't need to be modified at this point.
Already done:

* Write the trollius project: port asyncio to Python 2
* Stabilize trollius API
* Add trollius dependency to OpenStack
* Write the aiogreen project to provide the asyncio API on top of eventlet

To do:

* Stabilize aiogreen API
* Add aiogreen dependency to OpenStack
* Write an aiogreen executor for Oslo Messaging: rewrite greenio executor to replace greenio with aiogreen

Second part (to do): rewrite code as trollius coroutines

Switch from implicit asynchronous programming (eventlet using greenthreads) to explicit asynchronous programming (trollius coroutines using ``yield``). This requires modifying the OpenStack Common Libraries and applications. Modifications can be done step by step; the switch will take more than 6 months.

The first application candidate is Ceilometer. The Ceilometer project is young, its developers are aware of eventlet issues and like Python 3, and Ceilometer doesn't rely much on asynchronous programming: most time is spent waiting on the database anyway.

The goal is to port Ceilometer to explicit asynchronous programming during the OpenStack Kilo cycle.

Some applications may continue to use implicit asynchronous programming. For example, nova is probably the most complex part because it is an old project with a lot of legacy code, it has many drivers, and the code base is large.

To do:

* Ceilometer: add trollius dependency and set the trollius event loop policy to aiogreen
* Ceilometer: change Oslo Messaging executor from "eventlet" to "aiogreen"
* Redesign the service class of Oslo Incubator to support aiogreen and/or trollius. Currently, the class is designed for eventlet. The service class is instantiated before forking, which requires hacks on eventlet to update file descriptors.
* In Ceilometer and its OpenStack dependencies: add new functions which are written with explicit asynchronous programming in mind (ex: trollius coroutines written with ``yield``).
* Rewrite Ceilometer endpoints (RPC methods) as trollius coroutines.
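As an illustration of what "rewrite endpoints as coroutines" means in practice, here is a minimal sketch. It uses the standard-library asyncio with `async`/`await` so that it runs on current Python; with trollius on Python 2 the same shape would be written with ``yield From(...)`` and ``raise Return(...)``. The `SampleEndpoint` class and the `dispatch` helper are hypothetical and are not the Ceilometer or Oslo Messaging API.

```python
import asyncio


class SampleEndpoint:
    """Hypothetical RPC endpoint; not taken from Ceilometer."""

    def __init__(self):
        self.samples = []

    async def record_sample(self, ctxt, sample):
        # Explicit switch point: stands in for waiting on the database.
        await asyncio.sleep(0)
        self.samples.append(sample)
        return len(self.samples)


async def dispatch(endpoint, method, ctxt, **kwargs):
    # Hypothetical dispatcher: look up the endpoint method and await it,
    # instead of calling it and relying on an implicit greenthread switch.
    return await getattr(endpoint, method)(ctxt, **kwargs)


def run_demo():
    endpoint = SampleEndpoint()

    async def main():
        # Two incoming "messages" handled concurrently on one event loop.
        return await asyncio.gather(
            dispatch(endpoint, "record_sample", {}, sample="cpu"),
            dispatch(endpoint, "record_sample", {}, sample="disk"),
        )

    return asyncio.run(main()), endpoint.samples
```

The point of the sketch is only that every place where the endpoint may yield control is written out in the source, which is the property the second part of the plan is after.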
Questions:

* What about WSGI? aiohttp is not compatible with trollius yet.
* The quantity of code which needs to be ported to asynchronous programming is unknown right now.
* We should be prepared to see deadlocks. OpenStack was designed for eventlet, which implicitly switches on blocking operations. Critical sections may not be protected with locks, or not with the right kind of lock.
* For performance, blocking operations can be executed in threads. OpenStack code is probably not thread-safe, which means new kinds of race conditions. But the code executed in threads will be explicitly scheduled to be executed in a thread (with ``loop.run_in_executor()``), so regressions can be easily identified.
* This part will take a lot of time. We may need to split it into subparts to have milestones, which is more attractive for developers.

Last part (to do): drop eventlet

Replace aiogreen event loop with trollius event loop, drop aiogreen and drop eventlet at the end.

This change will be done on applications one by one. This is no need t