Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-24 Thread Joshua Harlow

Doug Hellmann wrote:

On Nov 24, 2014, at 12:57 PM, Mike Bayer  wrote:


On Nov 24, 2014, at 12:40 PM, Doug Hellmann  wrote:


This is a good point. I’m not sure we can say “we’ll only use explicit/implicit 
async in certain cases" because most of our apps actually mix the cases. We 
have WSGI apps that send RPC messages and we have other apps that receive RPC 
messages and operate on the database. Can we mix explicit and implicit operating 
models, or are we going to have to pick one way? If we have to pick one, the 
implicit model we’re currently using seems more compatible with all of the various 
libraries and services we depend on, but maybe I’m wrong?

IMHO, in the ideal case, a single method shouldn’t be mixing calls to a set of 
database objects as well as calls to RPC APIs at the same time, there should be 
some kind of method boundary to cross.   There’s a lot of ways to achieve that.


The database calls are inside the method invoked through RPC. System 1 sends an RPC 
message (call or cast) to system 2 which receives that message and then does 
something with the database. Frequently “system 1” is an API layer service (mixing 
WSGI and RPC) and "system 2” is something like the conductor (mixing RPC and DB 
access).


What is really needed is some way that code can switch between explicit yields 
and implicit IO on a per-function basis.   Like a decorator for one or the 
other.

The approach that Twisted takes of just using thread pools for those IO-bound 
elements that aren’t compatible with explicit yields is one way to do this. 
This might be the best way to go, if there are in fact issues with mixing in 
implicit async systems like eventlet.  I can imagine, vaguely, that the 
eventlet approach of monkey patching might get in the way of things in this 
more complicated setup.

Part of what makes this confusing for me is that there’s a lack of clarity over 
what benefits we’re trying to get from the async work.  If the idea is, the GIL 
is evil so we need to ban the use of all threads, and therefore must use defer 
for all IO, then that includes database IO which means we theoretically benefit 
from eventlet monkeypatching  - in the absence of truly async DBAPIs, this is 
the only way to have deferrable database IO.

If the idea instead is, the code we write that deals with messaging would be 
easier to produce, organize, and understand given an asyncio style approach, 
but otherwise we aren’t terribly concerned what highly sequential code like 
database code has to do, then a thread pool may be fine.



A lot of the motivation behind the explicit async changes started as a way to 
drop our dependency on eventlet because we saw it as blocking our move to 
Python 3. It is also true that a lot of people don’t like that eventlet 
monkeypatches system libraries, frequently inconsistently or incorrectly.

Apparently the state of python 3 support for eventlet is a little better than 
it was when we started talking about this a few years ago, but the 
monkeypatching is somewhat broken. lifeless suggested trying to fix the 
monkeypatching, which makes sense. At the summit I think we agreed to continue 
down the path of supporting both approaches. The issues you’ve raised with 
using ORMs (or indeed, any IO-based libraries that don’t support explicit 
async) make me think we should reconsider that discussion with the additional 
information that didn’t come up in the summit conversation.



I think victor has been proposing fixes for this recently,

https://lists.secondlife.com/pipermail/eventletdev/2014-November/001195.html

So work seems to be ongoing to fix up that support (the eventlet 
community is smaller and takes more time to accept pull requests and 
such, from what I've seen, but this is just how it works).



Doug






Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-24 Thread Doug Hellmann

On Nov 24, 2014, at 12:57 PM, Mike Bayer  wrote:

> 
>> On Nov 24, 2014, at 12:40 PM, Doug Hellmann  wrote:
>> 
>> 
>> This is a good point. I’m not sure we can say “we’ll only use 
>> explicit/implicit async in certain cases" because most of our apps actually 
>> mix the cases. We have WSGI apps that send RPC messages and we have other 
>> apps that receive RPC messages and operate on the database. Can we mix 
>> explicit and implicit operating models, or are we going to have to pick one 
>> way? If we have to pick one, the implicit model we’re currently using seems 
>> more compatible with all of the various libraries and services we depend on, 
>> but maybe I’m wrong?
> 
> IMHO, in the ideal case, a single method shouldn’t be mixing calls to a set 
> of database objects as well as calls to RPC APIs at the same time, there 
> should be some kind of method boundary to cross.   There’s a lot of ways to 
> achieve that.

The database calls are inside the method invoked through RPC. System 1 sends an 
RPC message (call or cast) to system 2 which receives that message and then 
does something with the database. Frequently “system 1” is an API layer service 
(mixing WSGI and RPC) and “system 2” is something like the conductor (mixing 
RPC and DB access).

> 
> What is really needed is some way that code can switch between explicit 
> yields and implicit IO on a per-function basis.   Like a decorator for one or 
> the other.
> 
> The approach that Twisted takes of just using thread pools for those IO-bound 
> elements that aren’t compatible with explicit yields is one way to do this.   
>   This might be the best way to go, if there are in fact issues with mixing 
> in implicit async systems like eventlet.  I can imagine, vaguely, that the 
> eventlet approach of monkey patching might get in the way of things in this 
> more complicated setup.
> 
> Part of what makes this confusing for me is that there’s a lack of clarity 
> over what benefits we’re trying to get from the async work.  If the idea is, 
> the GIL is evil so we need to ban the use of all threads, and therefore must 
> use defer for all IO, then that includes database IO which means we 
> theoretically benefit from eventlet monkeypatching  - in the absence of truly 
> async DBAPIs, this is the only way to have deferrable database IO.
> 
> If the idea instead is, the code we write that deals with messaging would be 
> easier to produce, organize, and understand given an asyncio style approach, 
> but otherwise we aren’t terribly concerned what highly sequential code like 
> database code has to do, then a thread pool may be fine.


A lot of the motivation behind the explicit async changes started as a way to 
drop our dependency on eventlet because we saw it as blocking our move to 
Python 3. It is also true that a lot of people don’t like that eventlet 
monkeypatches system libraries, frequently inconsistently or incorrectly.

Apparently the state of python 3 support for eventlet is a little better than 
it was when we started talking about this a few years ago, but the 
monkeypatching is somewhat broken. lifeless suggested trying to fix the 
monkeypatching, which makes sense. At the summit I think we agreed to continue 
down the path of supporting both approaches. The issues you’ve raised with 
using ORMs (or indeed, any IO-based libraries that don’t support explicit 
async) make me think we should reconsider that discussion with the additional 
information that didn’t come up in the summit conversation.

Doug



Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-24 Thread Mike Bayer

> On Nov 24, 2014, at 12:40 PM, Doug Hellmann  wrote:
> 
> 
> This is a good point. I’m not sure we can say “we’ll only use 
> explicit/implicit async in certain cases" because most of our apps actually 
> mix the cases. We have WSGI apps that send RPC messages and we have other 
> apps that receive RPC messages and operate on the database. Can we mix 
> explicit and implicit operating models, or are we going to have to pick one 
> way? If we have to pick one, the implicit model we’re currently using seems 
> more compatible with all of the various libraries and services we depend on, 
> but maybe I’m wrong?

IMHO, in the ideal case, a single method shouldn’t be mixing calls to a set of 
database objects as well as calls to RPC APIs at the same time, there should be 
some kind of method boundary to cross.   There’s a lot of ways to achieve that.

What is really needed is some way that code can switch between explicit yields 
and implicit IO on a per-function basis.   Like a decorator for one or the 
other.

The approach that Twisted takes of just using thread pools for those IO-bound 
elements that aren’t compatible with explicit yields is one way to do this. 
This might be the best way to go, if there are in fact issues with mixing in 
implicit async systems like eventlet.  I can imagine, vaguely, that the 
eventlet approach of monkey patching might get in the way of things in this 
more complicated setup.
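
To sketch the kind of thing I mean (illustrative only; the decorator name is 
made up and the "database" call is a stand-in), a blocking function can be 
pushed into the event loop's thread pool so that explicit-async code crosses 
the boundary at one visible point:

    import asyncio
    import functools
    import time

    def run_blocking(func):
        # hypothetical decorator: hand the blocking function to the event
        # loop's default thread pool and return a Future the caller can
        # 'yield from'
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            loop = asyncio.get_event_loop()
            return loop.run_in_executor(
                None, functools.partial(func, *args, **kwargs))
        return wrapper

    @run_blocking
    def load_instance(instance_id):
        # stand-in for ordinary blocking ORM / database code; note there
        # are no yields anywhere inside the function body
        time.sleep(0.1)
        return {'id': instance_id, 'state': 'ACTIVE'}

    @asyncio.coroutine
    def handle_rpc_request(instance_id):
        # the explicit-async side waits at one well-defined boundary
        instance = yield from load_instance(instance_id)
        return instance['state']

    print(asyncio.get_event_loop().run_until_complete(handle_rpc_request(42)))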

Part of what makes this confusing for me is that there’s a lack of clarity over 
what benefits we’re trying to get from the async work.  If the idea is, the GIL 
is evil so we need to ban the use of all threads, and therefore must use defer 
for all IO, then that includes database IO which means we theoretically benefit 
from eventlet monkeypatching  - in the absence of truly async DBAPIs, this is 
the only way to have deferrable database IO.

If the idea instead is, the code we write that deals with messaging would be 
easier to produce, organize, and understand given an asyncio style approach, 
but otherwise we aren’t terribly concerned what highly sequential code like 
database code has to do, then a thread pool may be fine.





Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-24 Thread Doug Hellmann

On Nov 24, 2014, at 11:30 AM, Jay Pipes  wrote:

> On 11/24/2014 10:43 AM, Mike Bayer wrote:
>>> On Nov 24, 2014, at 9:23 AM, Adam Young  wrote:
>>> For pieces such as the Nova compute that talk almost exclusively on
>>> the Queue, we should work to remove Monkey patching and use a clear
>>> programming model.  If we can do that within the context of
>>> Eventlet, great.  If we need to replace Eventlet with a different
>>> model, it will be painful, but should be done.  What is most
>>> important is that we avoid doing hacks like we've had to do with
>>> calls to Memcached and monkeypatching threading.
>> 
>> Nova compute does a lot of relational database access and I’ve yet to
>> see an explicit-async-compatible DBAPI other than psycopg2’s and
>> Twisted abdbapi.   Twisted adbapi appears just to throw regular
>> DBAPIs into a thread pool in any case (see
>> http://twistedmatrix.com/trac/browser/trunk/twisted/enterprise/adbapi.py),
>> so given that awkwardness and lack of real async, if eventlet is
>> dropped it would be best to use a thread pool for database-related
>> methods directly.
> 
> Hi Mike,
> 
> Note that nova-compute does not do any direct database queries. All database 
> reads and writes actually occur over RPC APIs, via the conductor, either 
> directly over the conductor RPC API or indirectly via nova.objects.
> 
> For the nova-api and nova-conductor services, however, yes, there is 
> direct-to-database communication that occurs, though the goal is to have only 
> the nova-conductor service eventually be the only service that directly 
> communicates with the database.

This is a good point. I’m not sure we can say “we’ll only use explicit/implicit 
async in certain cases” because most of our apps actually mix the cases. We 
have WSGI apps that send RPC messages and we have other apps that receive RPC 
messages and operate on the database. Can we mix explicit and implicit 
operating models, or are we going to have to pick one way? If we have to pick 
one, the implicit model we’re currently using seems more compatible with all of 
the various libraries and services we depend on, but maybe I’m wrong?

Doug

> 
> Best,
> -jay


Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-24 Thread Jay Pipes

On 11/24/2014 10:43 AM, Mike Bayer wrote:

On Nov 24, 2014, at 9:23 AM, Adam Young  wrote:
For pieces such as the Nova compute that talk almost exclusively on
the Queue, we should work to remove Monkey patching and use a clear
programming model.  If we can do that within the context of
Eventlet, great.  If we need to replace Eventlet with a different
model, it will be painful, but should be done.  What is most
important is that we avoid doing hacks like we've had to do with
calls to Memcached and monkeypatching threading.


Nova compute does a lot of relational database access and I’ve yet to
see an explicit-async-compatible DBAPI other than psycopg2’s and
Twisted adbapi.   Twisted adbapi appears just to throw regular
DBAPIs into a thread pool in any case (see
http://twistedmatrix.com/trac/browser/trunk/twisted/enterprise/adbapi.py),
so given that awkwardness and lack of real async, if eventlet is
dropped it would be best to use a thread pool for database-related
methods directly.


Hi Mike,

Note that nova-compute does not do any direct database queries. All 
database reads and writes actually occur over RPC APIs, via the 
conductor, either directly over the conductor RPC API or indirectly via 
nova.objects.


For the nova-api and nova-conductor services, however, yes, there is 
direct-to-database communication that occurs, though the goal is to have 
only the nova-conductor service eventually be the only service that 
directly communicates with the database.


Best,
-jay





Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-24 Thread Mike Bayer

> On Nov 24, 2014, at 9:23 AM, Adam Young  wrote:
> 
> 
> 
> For pieces such as the Nova compute that talk almost exclusively on the 
> Queue, we should work to remove Monkey patching and use a clear programming 
> model.  If we can do that within the context of Eventlet, great.  If we need 
> to replace Eventlet with a different model, it will be painful, but should be 
> done.  What is most important is that we avoid doing hacks like we've had to 
> do with calls to Memcached and monkeypatching threading.

Nova compute does a lot of relational database access and I’ve yet to see an 
explicit-async-compatible DBAPI other than psycopg2’s and Twisted adbapi.   
Twisted adbapi appears just to throw regular DBAPIs into a thread pool in any 
case (see 
http://twistedmatrix.com/trac/browser/trunk/twisted/enterprise/adbapi.py), so 
given that awkwardness and lack of real async, if eventlet is dropped it would 
be best to use a thread pool for database-related methods directly.
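
For anyone who hasn't looked at adbapi, it is roughly this (a minimal sketch; 
the table and database are made up) - runQuery() just hands the blocking 
cursor work to a worker thread and fires a Deferred with the rows:

    from twisted.enterprise import adbapi
    from twisted.internet import reactor

    # ConnectionPool wraps an ordinary blocking DBAPI module in a thread
    # pool; nothing about the driver itself is asynchronous.
    dbpool = adbapi.ConnectionPool('sqlite3', 'example.db',
                                   check_same_thread=False)

    def on_rows(rows):
        print('fetched %d rows' % len(rows))
        reactor.stop()

    def on_error(failure):
        failure.printTraceback()
        reactor.stop()

    # the actual cursor.execute() happens in a worker thread, not in the
    # reactor thread; the caller only ever sees a Deferred
    d = dbpool.runQuery('SELECT name FROM instances')
    d.addCallbacks(on_rows, on_error)

    reactor.run()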


Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-24 Thread Mike Bayer

> On Nov 23, 2014, at 9:24 PM, Donald Stufft  wrote:
> 
> 
> There’s a long history of implicit context switches causing buggy software 
> that breaks. As far as I can tell the only downsides to explicit context 
> switches that don’t stem from an inferior interpreter seem to be “some 
> particular API in my head isn’t as easy with it” and “I have to type more 
> letters”. The first one I’d just say that constraints make the system and 
> that there are lots of APIs which aren’t really possible or easy in Python 
> because of one design decision or another. For the second one I’d say that 
> Python isn’t a language which attempts to make code shorter, just easier to 
> understand what is going to happen when.
> 
> Throwing out hyperboles like “mathematically proven” isn’t a particular 
> valuable statement. It is *easier* to reason about what’s going to happen 
> with explicit context switches. Maybe you’re a better programmer than I am 
> and you’re able to keep in your head every place that might do an implicit 
> context switch in an implicit setup and you can look at a function and go “ah 
> yup, things are going to switch here and here”. I certainly can’t. I like my 
> software to maximize the ability to locally reason about a particular chunk 
> of code.

But this is a false choice.  There is a third way.  It is, use explicit async 
for those parts of an application where it is appropriate; when dealing with 
message queues and things where jobs and messages are sent off for any amount 
of time to come back at some indeterminate point later, all of us would 
absolutely benefit from an explicit model w/ coroutines.  If I was trying to 
write code that had to send off messages and then had to wait, but still has 
many more messages to send off, so that without async I’d need to be writing 
thread pools and all that, absolutely, async is a great programming model.

But when the code digs into functions that are oriented around business logic, 
functions that within themselves are doing nothing concurrency-wise against 
anything else within them, and merely need to run step 1, 2, and 3, 
that don’t deal with messaging and instead talk to a single relational database 
connection, where explicit async would mean that a single business logic method 
would need to be exploded with literally many dozens of yields in it (with a 
real async DBAPI; every connection, every execute, every cursor close, every 
transaction start, every transaction end, etc.), it is completely cumbersome 
and unnecessary.  These methods should run in an implicit async context. 
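
To put some shape on that (a sketch only - the async DBAPI here is imaginary, 
since none exists for MySQL, and 'conn' is a stand-in connection object on 
both sides):

    import asyncio

    @asyncio.coroutine
    def resize_instance_explicit(conn, instance_id, new_flavor):
        # with a (hypothetical) truly async DBAPI, every touch of the
        # connection grows a yield, even though nothing in this method is
        # concurrent with anything else inside it
        yield from conn.begin()
        cursor = yield from conn.execute(
            'SELECT flavor FROM instances WHERE id = %s', (instance_id,))
        row = yield from cursor.fetchone()
        yield from cursor.close()
        yield from conn.execute(
            'UPDATE instances SET flavor = %s WHERE id = %s',
            (new_flavor, instance_id))
        yield from conn.commit()
        return row

    def resize_instance_implicit(conn, instance_id, new_flavor):
        # the same logic under implicit async (eventlet/gevent): the IO
        # still cooperatively yields, it just isn't spelled out in the code
        with conn.begin():
            row = conn.execute(
                'SELECT flavor FROM instances WHERE id = %s',
                (instance_id,)).fetchone()
            conn.execute(
                'UPDATE instances SET flavor = %s WHERE id = %s',
                (new_flavor, instance_id))
        return row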

To that degree, the resistance that explicit async advocates have to the 
concept that both approaches should be switchable, and that one may be more 
appropriate than the other in different cases, remains confusing to me.   We 
from the threading camp are asked to accept that *all* of our programming 
models must change completely, but our suggestion that both models be 
integrated is met with, “well that’s wrong, because in my experience (doing 
this specific kind of programming), your model *never* works”.   




> 
> ---
> Donald Stufft
> PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-24 Thread Adam Young

On 11/23/2014 06:13 PM, Robert Collins wrote:

On WSGI - if we're in an asyncio world, I don't think WSGI has any
relevance today - it has no async programming model. While it has
incremental apis and supports generators, that's not close enough to
the same thing: so we're going to have to port our glue code to
whatever container we end up with. As you know I'm pushing on a revamp
of WSGI right now, and I'd be delighted to help put together a
WSGI-for-asyncio PEP, but I think its best thought of as a separate
thing to WSGI per se. It might be a profile of WSGI2 though, since
there is quite some interest in truly async models.

However I've a bigger picture concern. OpenStack only relatively
recently switched away from an explicit async model (Twisted) to
eventlet.

I'm worried that this is switching back to something we switched away
from (in that Twisted and asyncio have much more in common than either
Twisted and eventlet w/magic, or asyncio and eventlet w/magic).



We don't need to use this for WSGI applications.  We need to use this 
for the non-api, message driven portions. WSGI applications should not 
be accepting events/messages.  They already have a messaging model with 
HTTP, and we should use that and only that.


We need to get the Web based services off Eventlet and into Web servers 
where we can make use of Native code for security reasons.


Referencing the fine, if somewhat overused model from Ken Pepple:

http://cdn2.hubspot.net/hub/344789/file-448028030-jpg/images/openstack-arch-grizzly-logical-v2.jpg?t=1414604346389

Only the Nova and Quantum (now Neutron, yes it is dated) API servers 
show arrows coming out of the message queue.  Those arrows should be 
broken.  If we need to write a micro-service as a listener that receives 
an event off the queue and makes an HTTP call to an API server, let us 
do that.
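
Something along these lines, say (a rough sketch assuming oslo.messaging's 
notification listener and the requests library; the topic, event type and URL 
are invented):

    import json

    from oslo.config import cfg
    from oslo import messaging
    import requests

    class ComputeEventForwarder(object):
        # the notification dispatcher calls priority-named methods like this
        def info(self, ctxt, publisher_id, event_type, payload, metadata=None):
            if event_type == 'compute.instance.create.end':
                # forward the event to the API server over plain HTTP
                requests.post('http://api.example.org/v2/events',
                              data=json.dumps({'publisher': publisher_id,
                                               'payload': payload}),
                              headers={'Content-Type': 'application/json'})

    transport = messaging.get_transport(cfg.CONF)
    targets = [messaging.Target(topic='notifications')]
    listener = messaging.get_notification_listener(
        transport, targets, [ComputeEventForwarder()], executor='eventlet')
    listener.start()
    listener.wait()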



For pieces such as the Nova compute that talk almost exclusively on the 
Queue, we should work to remove Monkey patching and use a clear 
programming model.  If we can do that within the context of Eventlet, 
great.  If we need to replace Eventlet with a different model, it will 
be painful, but should be done.  What is most important is that we avoid 
doing hacks like we've had to do with calls to Memcached and 
monkeypatching threading.


Having a clear programming model around Messaging calls that scales 
should not compromise system integrity; it should complement it.





Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Donald Stufft

> On Nov 23, 2014, at 9:09 PM, Mike Bayer  wrote:
> 
> 
>> On Nov 23, 2014, at 8:23 PM, Donald Stufft  wrote:
>> 
>> I don’t really take performance issues that seriously for CPython. If you 
>> care about performance you should be using PyPy. I like that argument though 
>> because the same argument is used against the GCs which you like to use as 
>> an example too.
>> 
>> The verbosity isn’t really pointless, you have to be verbose in either 
>> situation, either explicit locks or explicit context switches. If you don’t 
>> have explicit locks you just have buggy software instead.
> 
> Funny thing is that relational databases will lock on things whether or not 
> the calling code is using an async system.  Locks are a necessary thing in 
> many cases.  That lock-based concurrency code can’t be mathematically proven 
> bug free doesn’t detract from its vast usefulness in situations that are not 
> aeronautics or medical devices.

Sure, databases will do it regardless so they aren’t a very useful topic of 
discussion here since their operation is external to the system being developed 
and they will operate the same regardless.

There’s a long history of implicit context switches causing buggy software that 
breaks. As far as I can tell the only downsides to explicit context switches 
that don’t stem from an inferior interpreter seem to be “some particular API in 
my head isn’t as easy with it” and “I have to type more letters”. The first one 
I’d just say that constraints make the system and that there are lots of APIs 
which aren’t really possible or easy in Python because of one design decision 
or another. For the second one I’d say that Python isn’t a language which 
attempts to make code shorter, just easier to understand what is going to 
happen when.

Throwing out hyperboles like “mathematically proven” isn’t a particularly 
valuable statement. It is *easier* to reason about what’s going to happen with 
explicit context switches. Maybe you’re a better programmer than I am and 
you’re able to keep in your head every place that might do an implicit context 
switch in an implicit setup and you can look at a function and go “ah yup, 
things are going to switch here and here”. I certainly can’t. I like my 
software to maximize the ability to locally reason about a particular chunk of 
code.
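
A contrived example of what I mean (the cache client here is a stand-in, not a 
real API): both versions have the same race, but only one of them shows you 
where the suspension points are.

    import asyncio

    def bump_quota_implicit(cache, tenant_id):
        # under eventlet/gevent the get() and set() calls are silent switch
        # points; another green thread can run between them and an increment
        # can be lost, and nothing in this code tells you so
        current = cache.get(tenant_id)
        cache.set(tenant_id, current + 1)

    @asyncio.coroutine
    def bump_quota_explicit(cache, tenant_id):
        # the same read-modify-write race exists, but every place this
        # coroutine can be suspended is marked in the source
        current = yield from cache.get(tenant_id)
        yield from cache.set(tenant_id, current + 1)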

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA




Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Mike Bayer

> On Nov 23, 2014, at 8:23 PM, Donald Stufft  wrote:
> 
> I don’t really take performance issues that seriously for CPython. If you 
> care about performance you should be using PyPy. I like that argument though 
> because the same argument is used against the GCs which you like to use as an 
> example too.
> 
> The verbosity isn’t really pointless, you have to be verbose in either 
> situation, either explicit locks or explicit context switches. If you don’t 
> have explicit locks you just have buggy software instead.

Funny thing is that relational databases will lock on things whether or not the 
calling code is using an async system.  Locks are a necessary thing in many 
cases.  That lock-based concurrency code can’t be mathematically proven bug 
free doesn’t detract from its vast usefulness in situations that are not 
aeronautics or medical devices.





Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Donald Stufft

> On Nov 23, 2014, at 7:55 PM, Mike Bayer  wrote:
> 
>> 
>> On Nov 23, 2014, at 7:30 PM, Donald Stufft  wrote:
>> 
>> 
>>> On Nov 23, 2014, at 7:21 PM, Mike Bayer  wrote:
>>> 
>>> Given that, I’ve yet to understand why a system that implicitly defers CPU 
>>> use when a routine encounters IO, deferring to other routines, is relegated 
>>> to the realm of “magic”.   Is Python reference counting and garbage 
>>> collection “magic”?How can I be sure that my program is only declaring 
>>> memory, only as much as I expect, and then freeing it only when I 
>>> absolutely say so, the way async advocates seem to be about IO?   Why would 
>>> a high level scripting language enforce this level of low-level bookkeeping 
>>> of IO calls as explicit, when it is 100% predictable and automatable ?
>> 
>> The difference is that in the many years of Python programming I’ve had to 
>> think about garbage collection all of once. I’ve yet to write a non trivial 
>> implicit IO application where the implicit context switch didn’t break 
>> something and I had to think about adding explicit locks around things.
> 
> that’s your personal experience, how is that an argument?  I deal with the 
> Python garbage collector, memory management, etc. *all the time*.   I have a 
> whole test suite dedicated to ensuring that SQLAlchemy constructs tear 
> themselves down appropriately in the face of gc and such: 
> https://github.com/zzzeek/sqlalchemy/blob/master/test/aaa_profiling/test_memusage.py
>  .   This is the product of tons of different observed and reported issues 
> about this operation or that operation forming constructs that would take up 
> too much memory, wouldn’t be garbage collected when expected, etc.  
> 
> Yet somehow I still value very much the work that implicit GC does for me and 
> I understand well when it is going to happen.  I don’t decide that that whole 
> world should be forced to never have GC again.  I’m sure you wouldn’t be 
> happy if I got Guido to drop garbage collection from Python because I showed 
> how sometimes it makes my life more difficult, therefore we should all be 
> managing memory explicitly.

Eh, Maybe you need to do that, that’s fine I suppose. Though the option isn’t 
between something with a very clear failure condition and something with a 
“weird things start happening” error condition. It’s between “weird things 
start happening” and “weird things start happening, just they are less likely 
to happen”. Implicit context switches introduce a new, harder-to-debug 
failure mode over blocking code that explicit context switches do not.

> 
> I’m sure my agenda here is pretty transparent.  If explicit async becomes the 
> only way to go, SQLAlchemy basically closes down.   I’d have to rewrite it 
> completely (after waiting for all the DBAPIs that don’t exist to be written, 
> why doesn’t anyone ever seem to be concerned about that?) , and it would run 
> much less efficiently due to the massive amount of additional function call 
> overhead incurred by the explicit coroutines.   It’s a pointless amount of 
> verbosity within a scripting language.  

I don’t really take performance issues that seriously for CPython. If you care 
about performance you should be using PyPy. I like that argument though because 
the same argument is used against the GCs which you like to use as an example 
too.

The verbosity isn’t really pointless, you have to be verbose in either 
situation, either explicit locks or explicit context switches. If you don’t 
have explicit locks you just have buggy software instead.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA




Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Mike Bayer

> On Nov 23, 2014, at 7:30 PM, Donald Stufft  wrote:
> 
> 
>> On Nov 23, 2014, at 7:21 PM, Mike Bayer  wrote:
>> 
>> Given that, I’ve yet to understand why a system that implicitly defers CPU 
>> use when a routine encounters IO, deferring to other routines, is relegated 
>> to the realm of “magic”.   Is Python reference counting and garbage 
>> collection “magic”?How can I be sure that my program is only declaring 
>> memory, only as much as I expect, and then freeing it only when I absolutely 
>> say so, the way async advocates seem to be about IO?   Why would a high 
>> level scripting language enforce this level of low-level bookkeeping of IO 
>> calls as explicit, when it is 100% predictable and automatable ?
> 
> The difference is that in the many years of Python programming I’ve had to 
> think about garbage collection all of once. I’ve yet to write a non trivial 
> implicit IO application where the implicit context switch didn’t break 
> something and I had to think about adding explicit locks around things.

that’s your personal experience, how is that an argument?  I deal with the 
Python garbage collector, memory management, etc. *all the time*.   I have a 
whole test suite dedicated to ensuring that SQLAlchemy constructs tear 
themselves down appropriately in the face of gc and such: 
https://github.com/zzzeek/sqlalchemy/blob/master/test/aaa_profiling/test_memusage.py
 .   This is the product of tons of different observed and reported issues 
about this operation or that operation forming constructs that would take up 
too much memory, wouldn’t be garbage collected when expected, etc.  

Yet somehow I still value very much the work that implicit GC does for me and I 
understand well when it is going to happen.  I don’t decide that that whole 
world should be forced to never have GC again.  I’m sure you wouldn’t be happy 
if I got Guido to drop garbage collection from Python because I showed how 
sometimes it makes my life more difficult, therefore we should all be managing 
memory explicitly.

I’m sure my agenda here is pretty transparent.  If explicit async becomes the 
only way to go, SQLAlchemy basically closes down.   I’d have to rewrite it 
completely (after waiting for all the DBAPIs that don’t exist to be written, 
why doesn’t anyone ever seem to be concerned about that?) , and it would run 
much less efficiently due to the massive amount of additional function call 
overhead incurred by the explicit coroutines.   It’s a pointless amount of 
verbosity within a scripting language.  

> 
> Really that’s what it comes down to. Either you need to enable explicit 
> context switches (via callbacks or yielding, or whatever) or you need to add 
> explicit locks. Neither solution allows you to pretend that context switching 
> isn’t going to happen nor prevents you from having to deal with it. The 
> reason I prefer explicit async is because the failure mode is better (if I 
> forget to yield I don’t get the actual value so my thing blows up in 
> development) and it ironically works more like blocking programming because I 
> won’t get an implicit context switch in the middle of a function. Compare 
> that to the implicit async where the failure mode is that at runtime 
> something weird happens.
> ---
> Donald Stufft
> PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Donald Stufft

> On Nov 23, 2014, at 7:29 PM, Mike Bayer  wrote:
> 
>> 
>> Glyph wrote a good post that mirrors my opinions on implicit vs explicit
>> here: https://glyph.twistedmatrix.com/2014/02/unyielding.html.
> 
> this is the post that most makes me think about the garbage collector 
> analogy, re: “gevent works perfectly fine, but sorry, it just isn’t 
> “correct”.  It should be feared! ”.   Unfortunately Glyph has orders of 
> magnitude more intellectual capabilities than I do, so I am ultimately not an 
> effective advocate for my position; hence I have my fallback career as a 
> cheese maker lined up for when the async agenda finally takes over all 
> computer programming.

Like I said, I’ve had to think about garbage collecting all of once in my 
entire Python career. Implicit might be theoretically nicer but until it can 
actually live up to the “gets out of my way-ness” of the abstractions you’re 
citing I’d personally much rather pass on it.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA




Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Donald Stufft

> On Nov 23, 2014, at 7:21 PM, Mike Bayer  wrote:
> 
> Given that, I’ve yet to understand why a system that implicitly defers CPU 
> use when a routine encounters IO, deferring to other routines, is relegated 
> to the realm of “magic”.   Is Python reference counting and garbage 
> collection “magic”?How can I be sure that my program is only declaring 
> memory, only as much as I expect, and then freeing it only when I absolutely 
> say so, the way async advocates seem to be about IO?   Why would a high level 
> scripting language enforce this level of low-level bookkeeping of IO calls as 
> explicit, when it is 100% predictable and automatable ?

The difference is that in the many years of Python programming I’ve had to 
think about garbage collection all of once. I’ve yet to write a non trivial 
implicit IO application where the implicit context switch didn’t break 
something and I had to think about adding explicit locks around things.

Really that’s what it comes down to. Either you need to enable explicit context 
switches (via callbacks or yielding, or whatever) or you need to add explicit 
locks. Neither solution allows you to pretend that context switching isn’t 
going to happen nor prevents you from having to deal with it. The reason I 
prefer explicit async is because the failure mode is better (if I forget to 
yield I don’t get the actual value so my thing blows up in development) and it 
ironically works more like blocking programming because I won’t get an implicit 
context switch in the middle of a function. Compare that to the implicit async 
where the failure mode is that at runtime something weird happens.
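
A tiny illustration of that failure mode (a sketch; the functions are made up):

    import asyncio

    @asyncio.coroutine
    def fetch_flavor(instance_id):
        # pretend this does some IO and eventually produces a dict
        yield from asyncio.sleep(0.01)
        return {'vcpus': 4}

    @asyncio.coroutine
    def resize(instance_id):
        flavor = fetch_flavor(instance_id)   # oops, forgot 'yield from'
        # 'flavor' is a generator/coroutine object rather than a dict, so
        # the next line blows up with a TypeError the first time this runs
        # in development -- not as some subtle corruption later at runtime
        return flavor['vcpus']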

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA




Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Mike Bayer

> On Nov 23, 2014, at 6:35 PM, Donald Stufft  wrote:
> 
> 
> For whatever it’s worth, I find explicit async io to be _way_ easier to
> understand for the same reason I find threaded code to be a rats nest.

web applications aren’t explicitly “threaded”.   You get a request, load some 
data, manipulate it, and return a response.   There are no threads to reason 
about, nothing is explicitly shared in any way.

> 
> The co-routine style of asyncio (or Twisted’s inlineCallbacks) solves
> almost all of the problems that I think most people have with explicit
> asyncio (namely the callback hell) while still getting the benefits.

coroutines are still “inside out” and still have all the issues discussed in 
http://python-notes.curiousefficiency.org/en/latest/pep_ideas/async_programming.html
 which I also refer to in 
http://stackoverflow.com/questions/16491564/how-to-make-sqlalchemy-in-tornado-to-be-async/16503103#16503103.

> 
> Glyph wrote a good post that mirrors my opinions on implicit vs explicit
> here: https://glyph.twistedmatrix.com/2014/02/unyielding.html.

this is the post that most makes me think about the garbage collector analogy, 
re: “gevent works perfectly fine, but sorry, it just isn’t “correct”.  It 
should be feared! ”.   Unfortunately Glyph has orders of magnitude more 
intellectual capabilities than I do, so I am ultimately not an effective 
advocate for my position; hence I have my fallback career as a cheese maker 
lined up for when the async agenda finally takes over all computer programming.





Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Mike Bayer

> On Nov 23, 2014, at 6:13 PM, Robert Collins  wrote:
> 
> 
> So - the technical bits of the plan sound fine.

> 
> On WSGI - if we're in an asyncio world,

*looks around*, we are?   when did that happen?   Assuming we’re talking 
explicit async. Rewriting all our code as verbose, “inside out” code, vast 
library incompatibility, and…some notion of “correctness” that somehow is 
supposed to be appropriate for a high level scripting language and can’t be 
achieved through simple, automated means such as gevent.

> I don't think WSGI has any
> relevance today -

if you want async + wsgi, use gevent.wsgi.   It is of course not explicit 
async but if the whole world decides that we all have to explicitly turn all of 
our code inside out to appease the concept of “oh no, IO IS ABOUT TO HAPPEN! 
ARE WE READY! ”,  I am definitely quitting programming to become a cheese 
maker.   If you’re writing some high performance TCP server thing, fine 
(…but... why are you writing a high performance server in Python and not 
something more appropriate like Go?).  If we’re dealing with message queues as 
I know this thread is about, fine.
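
i.e. something as small as this keeps the plain WSGI programming model and 
still gets cooperative IO (a sketch using gevent.pywsgi; the app is obviously 
a placeholder for whatever WSGI callable you already have):

    from gevent import monkey
    monkey.patch_all()

    from gevent.pywsgi import WSGIServer

    def app(environ, start_response):
        # ordinary, sequential WSGI code; any socket or (patched) database
        # IO done in here yields implicitly to other greenlets
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'hello\n']

    WSGIServer(('0.0.0.0', 8000), app).serve_forever()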

But if you’re writing “receive a request, load some data, change some of it 
around, store it again, and return a result”, I don’t see why this has to be 
intentionally complicated.   Use implicit async that can interact with the 
explicit async messaging stuff appropriately.   That’s purportedly one of the 
goals of asyncIO (which Nick Coghlan had to lobby pretty hard for; source: 
http://python-notes.curiousefficiency.org/en/latest/pep_ideas/async_programming.html#gevent-and-pep-3156
  ).

> it has no async programming model.

neither do a *lot* of things, including all traditional ORMs.   I’m fine with 
Ceilometer dropping SQLAlchemy support as they prefer MongoDB and their 
relational database code is fairly wanting.   Per 
http://aiogreen.readthedocs.org/openstack.html, I’m not sure how else they will 
drop eventlet support throughout the entire app.   


> While is has
> incremental apis and supports generators, thats not close enough to
> the same thing: so we're going to have to port our glue code to
> whatever container we end up with. As you know I'm pushing on a revamp
> of WSGI right now, and I'd be delighted to help put together a
> WSGI-for-asyncio PEP, but I think its best thought of as a separate
> thing to WSGI per se.

given the push for explicit async, seems like lots of effort will need to be 
spent on this. 

> It might be a profile of WSGI2 though, since
> there is quite some interest in truely async models.
> 
> However I've a bigger picture concern. OpenStack only relatively
> recently switched away from an explicit async model (Twisted) to
> eventlet.

hooray.   efficient database access for explicit async code would be impossible 
otherwise as there are no explicit async APIs to MySQL, and only one for 
Postgresql which is extremely difficult to support.
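
“extremely difficult” meaning roughly this, even before you get to 
transactions (more or less the wait pattern from the psycopg2 async 
documentation; the DSN is made up):

    import select

    import psycopg2
    from psycopg2 import extensions

    def wait(conn):
        # the application, not the driver, has to drive the protocol by
        # polling the socket until the current operation completes
        while True:
            state = conn.poll()
            if state == extensions.POLL_OK:
                return
            elif state == extensions.POLL_READ:
                select.select([conn.fileno()], [], [])
            elif state == extensions.POLL_WRITE:
                select.select([], [conn.fileno()], [])
            else:
                raise psycopg2.OperationalError('bad poll state: %r' % state)

    # async=1 puts the connection in asynchronous mode: no transaction
    # control, one query at a time, every step explicitly polled
    conn = psycopg2.connect('dbname=test', async=1)
    wait(conn)
    cur = conn.cursor()
    cur.execute('SELECT 42')
    wait(conn)
    print(cur.fetchone())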

> 
> I'm worried that this is switching back to something we switched away
> from (in that Twisted and asyncio have much more in common than either
> Twisted and eventlet w/magic, or asyncio and eventlet w/magic).

In the C programming world, when you want to do something as simple as create a 
list of records, it’s not so simple: you have to explicitly declare memory 
using malloc(), and organize your program skillfully and carefully such that 
this memory is ultimately freed using free().   It’s tedious and error prone.   
So in the scripting language world, these tedious, low level and entirely 
predictable steps are automated away for us; memory is declared automatically, 
and freed automatically.  Even reference cycles are cleaned out for us without 
us even being aware.  This is why we use “scripting languages” - they are 
intentionally automated to speed the pace of development and produce code that 
is far less verbose than low-level C code and much less prone to low-level 
errors, albeit considerably less efficient.   It’s the payoff we make; 
predictable bookkeeping of the system’s resources is automated away.
There’s a price; the Python interpreter uses a ton of memory and tends to not 
free memory once large chunks of it have been used by the application.   The 
implicit allocation and freeing of memory has a huge tradeoff, in that the 
Python interpreter uses lots of memory pretty quickly.  However, this tradeoff, 
Python’s clearly inefficient use of memory because it’s automating the 
management of it away for us, is one which nobody seems to mind at all.   

But when it comes to IO, the implicit allocation of IO and deferment of 
execution done by gevent has no side effect anywhere near as harmful as the 
Python interpreter’s huge memory consumption.  Yet we are so afraid of it, so 
frightened that our code…written in a *high level scripting language*, might 
not be “correct”.  We might not know that IO is about to happen!   How is this 
different from the much more tangible and day-

Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Robert Collins
On 24 November 2014 at 12:30, Monty Taylor  wrote:

> I'm not going to comment on the pros and cons - I think we all know I'm
> a fan of threads. But I have been around a while, so - for those who
> haven't been:

FWIW we have *threads* today as a programming model. The
implementation is green, but the concepts we work with in the code are
threads, threadpools and so forth.

eventlet is an optimisation around some [minor] inefficiencies in
Python, but it doesn't change the programming model - see dstufft's
excellent link for details on that.
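
i.e. today's code reads like thread code (a sketch; handle_request is just a
placeholder):

    import eventlet
    eventlet.monkey_patch()

    def handle_request(request_id):
        # looks and reads like threaded code; the "thread" is a greenlet
        # and the blocking calls inside it are monkeypatched to yield
        eventlet.sleep(0.1)
        return request_id

    pool = eventlet.GreenPool(size=100)
    for i in range(10):
        pool.spawn_n(handle_request, i)
    pool.waitall()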

I too will hold off from commenting on the pros and cons today; this
isn't about good or bad, it's about making sure this revisiting of a
huge discussion and effort gets the right visibility.



> The main 'winning' answer came down to twisted being very opaque for new
> devs - while it's very powerful for experienced devs, we decided to opt
> for eventlet which does not scare new devs with a completely different
> programming model. (reactors and deferreds and whatnot)
>
> Now, I wouldn't say we _just_ ported from Twisted, I think we finished
> that work about 4 years ago. :)

Nova managed it in Jan 2011, so 3.5 mumblemumble. Near enough to 'just' :)

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud



Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Robert Collins
On 24 November 2014 at 12:35, Donald Stufft  wrote:
>
> For whatever it’s worth, I find explicit async io to be _way_ easier to
> understand for the same reason I find threaded code to be a rats nest.
>
> The co-routine style of asyncio (or Twisted’s inlineCallbacks) solves
> almost all of the problems that I think most people have with explicit
> asyncio (namely the callback hell) while still getting the benefits.

Sure. Note that OpenStack *was* using inlineCallbacks.

> Glyph wrote a good post that mirrors my opinions on implicit vs explicit
> here: https://glyph.twistedmatrix.com/2014/02/unyielding.html.

That is, we chose
"
4. and finally, implicit coroutines: Java’s “green threads”, Twisted’s
Corotwine, eventlet, gevent, where any function may switch the entire
stack of the current thread of control by calling a function which
suspends it.
"

- the option that Glyph (and I too) would say to never ever choose.

My concern isn't that asyncio is bad - it's not. It's that we spent an
awful lot of time and effort rewriting nova etc to be 'option 4', and
we've no reason to believe that whatever it was that made that not
work /for us/ has been fixed.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud



Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Donald Stufft

> On Nov 23, 2014, at 6:30 PM, Monty Taylor  wrote:
> 
> On 11/23/2014 06:13 PM, Robert Collins wrote:
>> On 24 November 2014 at 11:01, victor stinner
>>  wrote:
>>> Hi,
>>> 
>>> I'm happy to announce you that I just finished the last piece of the puzzle 
>>> to add support for trollius coroutines in Oslo Messaging! See my two 
>>> changes:
>>> 
>>> * Add a new aiogreen executor:
>>>  https://review.openstack.org/#/c/136653/
>>> * Add an optional executor callback to dispatcher:
>>>  https://review.openstack.org/#/c/136652/
>>> 
>>> Related projects:
>>> 
>>> * asyncio is an event loop which is now part of Python 3.4:
>>>  http://docs.python.org/dev/library/asyncio.html
>>> * trollius is the port of the new asyncio module to Python 2:
>>>  http://trollius.readthedocs.org/
>>> * aiogreen implements the asyncio API on top of eventlet:
>>>  http://aiogreen.readthedocs.org/
>>> 
>>> For the long story and the full history of my work on asyncio in OpenStack 
>>> since one year, read:
>>> http://aiogreen.readthedocs.org/openstack.html
>>> 
>>> The last piece of the puzzle is the new aiogreen project that I released a 
>>> few days ago. aiogreen is well integrated and fully compatible with 
>>> eventlet, it can be used in OpenStack without having to modify code. It is 
>>> almost fully based on trollius, it just has a small glue to reuse eventlet 
>>> event loop (get read/write notifications of file descriptors).
>>> 
>>> In the past, I tried to use the greenio project, which also implements the 
>>> asyncio API, but it didn't fit well with eventlet. That's why I wrote a new 
>>> project.
>>> 
>>> Supporting trollius coroutines in Oslo Messaging is just the first part of 
>>> the global project. Here is my full plan to replace eventlet with asyncio.
>> 
>> ...
>> 
>> So - the technical bits of the plan sound fine.
>> 
>> On WSGI - if we're in an asyncio world, I don't think WSGI has any
>> relevance today - it has no async programming model. While is has
>> incremental apis and supports generators, thats not close enough to
>> the same thing: so we're going to have to port our glue code to
>> whatever container we end up with. As you know I'm pushing on a revamp
>> of WSGI right now, and I'd be delighted to help put together a
>> WSGI-for-asyncio PEP, but I think its best thought of as a separate
>> thing to WSGI per se. It might be a profile of WSGI2 though, since
>> there is quite some interest in truely async models.
>> 
>> However I've a bigger picture concern. OpenStack only relatively
>> recently switched away from an explicit async model (Twisted) to
>> eventlet.
>> 
>> I'm worried that this is switching back to something we switched away
>> from (in that Twisted and asyncio have much more in common than either
>> Twisted and eventlet w/magic, or asyncio and eventlet w/magic).
>> 
>> If Twisted was unacceptable to the community, what makes asyncio
>> acceptable? [Note, I don't really understand why Twisted was moved
>> away from, since our problem domain is such a great fit for reactor
>> style programming - lots of networking, lots of calling of processes
>> that may take some time to complete their work, and occasional DB
>> calls [which are equally problematic in eventlet and in
>> asyncio/Twisted]. So I'm not arguing against the move, I'm just
>> concerned that doing it without addressing whatever the underlying
>> thing was, will fail - and I'm also concerned that it will surprise
>> folk - since there doesn't seem to be a cross project blueprint
>> talking about this fairly fundamental shift in programming model.
> 
> I'm not going to comment on the pros and cons - I think we all know I'm
> a fan of threads. But I have been around a while, so - for those who
> haven't been:
> 
> When we started the project, nova used twisted and swift used eventlet.
> As we've consistently endeavored to not have multiple frameworks, we
> entered in to the project's first big flame war:
> 
> "twisted vs. eventlet"
> 
> It was _real_ fun, I promise. But a the heart was a question of whether
> we were going to rewrite swift in twisted or rewrite nova in eventlet.
> 
> The main 'winning' answer came down to twisted being very opaque for new
> devs - while it's very powerful for experienced devs, we decided to opt
> for eventlet which does not scare new devs with a completely different
> programming model. (reactors and deferreds and whatnot)
> 
> Now, I wouldn't say we _just_ ported from Twisted, I think we finished
> that work about 4 years ago. :)
> 

For whatever it’s worth, I find explicit async io to be _way_ easier to
understand for the same reason I find threaded code to be a rat's nest.

The co-routine style of asyncio (or Twisted’s inlineCallbacks) solves
almost all of the problems that I think most people have with explicit
asyncio (namely the callback hell) while still getting the benefits.
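
i.e. roughly the difference between these two (a sketch; dbpool.runQuery here
is just a stand-in for any Deferred-returning call):

    from twisted.internet import defer

    # the callback style people usually object to
    def get_flavor_callbacks(dbpool, instance_id):
        d = dbpool.runQuery('SELECT flavor FROM instances WHERE id = ?',
                            (instance_id,))
        def _first_column(rows):
            return rows[0][0]
        d.addCallback(_first_column)
        return d

    # the coroutine style: the same Deferreds underneath, but the control
    # flow reads top to bottom and the suspension point is still explicit
    @defer.inlineCallbacks
    def get_flavor_inline(dbpool, instance_id):
        rows = yield dbpool.runQuery(
            'SELECT flavor FROM instances WHERE id = ?', (instance_id,))
        defer.returnValue(rows[0][0])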

Glyph wrote a good post that mirrors my opinions on implicit vs explicit
here: https://glyph.twistedmatrix.com/2014/02/unyielding.html.

Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Monty Taylor
On 11/23/2014 06:13 PM, Robert Collins wrote:
> On 24 November 2014 at 11:01, victor stinner
>  wrote:
>> Hi,
>>
>> I'm happy to announce you that I just finished the last piece of the puzzle 
>> to add support for trollius coroutines in Oslo Messaging! See my two changes:
>>
>> * Add a new aiogreen executor:
>>   https://review.openstack.org/#/c/136653/
>> * Add an optional executor callback to dispatcher:
>>   https://review.openstack.org/#/c/136652/
>>
>> Related projects:
>>
>> * asyncio is an event loop which is now part of Python 3.4:
>>   http://docs.python.org/dev/library/asyncio.html
>> * trollius is the port of the new asyncio module to Python 2:
>>   http://trollius.readthedocs.org/
>> * aiogreen implements the asyncio API on top of eventlet:
>>   http://aiogreen.readthedocs.org/
>>
>> For the long story and the full history of my work on asyncio in OpenStack 
>> since one year, read:
>> http://aiogreen.readthedocs.org/openstack.html
>>
>> The last piece of the puzzle is the new aiogreen project that I released a 
>> few days ago. aiogreen is well integrated and fully compatible with 
>> eventlet, it can be used in OpenStack without having to modify code. It is 
>> almost fully based on trollius, it just has a small glue to reuse eventlet 
>> event loop (get read/write notifications of file descriptors).
>>
>> In the past, I tried to use the greenio project, which also implements the 
>> asyncio API, but it didn't fit well with eventlet. That's why I wrote a new 
>> project.
>>
>> Supporting trollius coroutines in Oslo Messaging is just the first part of 
>> the global project. Here is my full plan to replace eventlet with asyncio.
> 
> ...
> 
> So - the technical bits of the plan sound fine.
> 
> On WSGI - if we're in an asyncio world, I don't think WSGI has any
> relevance today - it has no async programming model. While is has
> incremental apis and supports generators, thats not close enough to
> the same thing: so we're going to have to port our glue code to
> whatever container we end up with. As you know I'm pushing on a revamp
> of WSGI right now, and I'd be delighted to help put together a
> WSGI-for-asyncio PEP, but I think its best thought of as a separate
> thing to WSGI per se. It might be a profile of WSGI2 though, since
> there is quite some interest in truely async models.
> 
> However I've a bigger picture concern. OpenStack only relatively
> recently switched away from an explicit async model (Twisted) to
> eventlet.
> 
> I'm worried that this is switching back to something we switched away
> from (in that Twisted and asyncio have much more in common than either
> Twisted and eventlet w/magic, or asyncio and eventlet w/magic).
> 
> If Twisted was unacceptable to the community, what makes asyncio
> acceptable? [Note, I don't really understand why Twisted was moved
> away from, since our problem domain is such a great fit for reactor
> style programming - lots of networking, lots of calling of processes
> that may take some time to complete their work, and occasional DB
> calls [which are equally problematic in eventlet and in
> asyncio/Twisted]. So I'm not arguing against the move, I'm just
> concerned that doing it without addressing whatever the underlying
> thing was, will fail - and I'm also concerned that it will surprise
> folk - since there doesn't seem to be a cross project blueprint
> talking about this fairly fundamental shift in programming model.

I'm not going to comment on the pros and cons - I think we all know I'm
a fan of threads. But I have been around a while, so - for those who
haven't been:

When we started the project, nova used twisted and swift used eventlet.
As we've consistently endeavored to not have multiple frameworks, we
entered in to the project's first big flame war:

"twisted vs. eventlet"

It was _real_ fun, I promise. But at the heart was a question of whether
we were going to rewrite swift in twisted or rewrite nova in eventlet.

The main 'winning' answer came down to twisted being very opaque for new
devs - while it's very powerful for experienced devs, we decided to opt
for eventlet which does not scare new devs with a completely different
programming model. (reactors and deferreds and whatnot)

Now, I wouldn't say we _just_ ported from Twisted, I think we finished
that work about 4 years ago. :)

Monty



Re: [openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread Robert Collins
On 24 November 2014 at 11:01, victor stinner
 wrote:
> Hi,
>
> I'm happy to announce you that I just finished the last piece of the puzzle 
> to add support for trollius coroutines in Oslo Messaging! See my two changes:
>
> * Add a new aiogreen executor:
>   https://review.openstack.org/#/c/136653/
> * Add an optional executor callback to dispatcher:
>   https://review.openstack.org/#/c/136652/
>
> Related projects:
>
> * asyncio is an event loop which is now part of Python 3.4:
>   http://docs.python.org/dev/library/asyncio.html
> * trollius is the port of the new asyncio module to Python 2:
>   http://trollius.readthedocs.org/
> * aiogreen implements the asyncio API on top of eventlet:
>   http://aiogreen.readthedocs.org/
>
> For the long story and the full history of my work on asyncio in OpenStack
> over the past year, read:
> http://aiogreen.readthedocs.org/openstack.html
>
> The last piece of the puzzle is the new aiogreen project that I released a
> few days ago. aiogreen is well integrated and fully compatible with eventlet;
> it can be used in OpenStack without having to modify code. It is almost fully
> based on trollius, with just a small glue layer to reuse the eventlet event
> loop (to get read/write notifications of file descriptors).
>
> In the past, I tried to use the greenio project, which also implements the 
> asyncio API, but it didn't fit well with eventlet. That's why I wrote a new 
> project.
>
> Supporting trollius coroutines in Oslo Messaging is just the first part of 
> the global project. Here is my full plan to replace eventlet with asyncio.

...

So - the technical bits of the plan sound fine.

On WSGI - if we're in an asyncio world, I don't think WSGI has any
relevance today - it has no async programming model. While it has
incremental APIs and supports generators, that's not close enough to
the same thing: so we're going to have to port our glue code to
whatever container we end up with. As you know I'm pushing on a revamp
of WSGI right now, and I'd be delighted to help put together a
WSGI-for-asyncio PEP, but I think it's best thought of as a separate
thing to WSGI per se. It might be a profile of WSGI2 though, since
there is quite some interest in truly async models.

However I've a bigger picture concern. OpenStack only relatively
recently switched away from an explicit async model (Twisted) to
eventlet.

I'm worried that this is switching back to something we switched away
from (in that Twisted and asyncio have much more in common than either
Twisted and eventlet w/magic, or asyncio and eventlet w/magic).

If Twisted was unacceptable to the community, what makes asyncio
acceptable? [Note, I don't really understand why Twisted was moved
away from, since our problem domain is such a great fit for reactor
style programming - lots of networking, lots of calling of processes
that may take some time to complete their work, and occasional DB
calls [which are equally problematic in eventlet and in
asyncio/Twisted]. So I'm not arguing against the move, I'm just
concerned that doing it without addressing whatever the underlying
thing was, will fail - and I'm also concerned that it will surprise
folk - since there doesn't seem to be a cross project blueprint
talking about this fairly fundamental shift in programming model.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud



[openstack-dev] [oslo] Add a new aiogreen executor for Oslo Messaging

2014-11-23 Thread victor stinner
Hi,

I'm happy to announce that I just finished the last piece of the puzzle to
add support for trollius coroutines in Oslo Messaging! See my two changes:

* Add a new aiogreen executor:
  https://review.openstack.org/#/c/136653/
* Add an optional executor callback to dispatcher:
  https://review.openstack.org/#/c/136652/

Related projects:

* asyncio is an event loop which is now part of Python 3.4:
  http://docs.python.org/dev/library/asyncio.html
* trollius is the port of the new asyncio module to Python 2:
  http://trollius.readthedocs.org/
* aiogreen implements the asyncio API on top of eventlet:
  http://aiogreen.readthedocs.org/

For the long story and the full history of my work on asyncio in OpenStack
over the past year, read:
http://aiogreen.readthedocs.org/openstack.html

The last piece of the puzzle is the new aiogreen project that I released a few
days ago. aiogreen is well integrated and fully compatible with eventlet; it
can be used in OpenStack without having to modify code. It is almost fully
based on trollius, with just a small glue layer to reuse the eventlet event
loop (to get read/write notifications of file descriptors).

In the past, I tried to use the greenio project, which also implements the 
asyncio API, but it didn't fit well with eventlet. That's why I wrote a new 
project.

Supporting trollius coroutines in Oslo Messaging is just the first part of the 
global project. Here is my full plan to replace eventlet with asyncio.


First part (in progress): add support for trollius coroutines
--------------------------------------------------------------

Prepare OpenStack (Oslo Messaging) to support trollius coroutines using
``yield``: explicit asynchronous programming. Eventlet is still supported and
used by default, and applications and libraries don't need to be modified at
this point.
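
For readers who have not seen trollius code yet, here is a minimal
illustrative sketch (not code from the Oslo Messaging changes) of an explicit
coroutine on Python 2, where ``yield From(...)`` plays the role of Python 3's
``yield from`` and ``raise Return(...)`` replaces ``return value``::

    import trollius
    from trollius import From, Return

    @trollius.coroutine
    def get_status(delay):
        # Explicitly hand control back to the event loop while waiting;
        # the suspension point is visible at the call site.
        yield From(trollius.sleep(delay))
        # Python 2 generators cannot "return value", hence raise Return().
        raise Return('ok')

    loop = trollius.get_event_loop()
    print(loop.run_until_complete(get_status(0.1)))
    loop.close()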

Already done:

* Write the trollius project: port asyncio to Python 2
* Stabilize trollius API
* Add trollius dependency to OpenStack
* Write the aiogreen project to provide the asyncio API on top of eventlet

To do:

* Stabilize aiogreen API
* Add aiogreen dependency to OpenStack
* Write an aiogreen executor for Oslo Messaging: rewrite greenio executor
  to replace greenio with aiogreen


Second part (to do): rewrite code as trollius coroutines
---------------------------------------------------------

Switch from implicit asynchronous programming (eventlet using greenthreads) to
explicit asynchronous programming (trollius coroutines using ``yield``). This
requires modifying the OpenStack Common Libraries and applications. The
modifications can be done step by step; the switch will take more than six
months.
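
To illustrate the difference with a hypothetical helper (``connection.get()``
is a made-up call, not an existing API), the implicit and explicit styles look
roughly like this::

    import trollius
    from trollius import From, Return

    # Implicit style (eventlet): the call blocks this greenthread and the
    # eventlet hub switches to another greenthread behind the scenes.
    def fetch_sample_implicit(connection):
        return connection.get()

    # Explicit style (trollius): the suspension point is spelled out; here
    # connection.get() is assumed to return a coroutine or a future.
    @trollius.coroutine
    def fetch_sample_explicit(connection):
        sample = yield From(connection.get())
        raise Return(sample)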

The first application candidate is Ceilometer. The Ceilometer project is
young, its developers are aware of eventlet issues and like Python 3, and
Ceilometer doesn't rely heavily on asynchronous programming: most of its time
is spent waiting on the database anyway.

The goal is to port Ceilometer to explicit asynchronous programming during the
OpenStack Kilo cycle.

Some applications may continue to use implicit asynchronous programming. For
example, nova is probably the most complex case because it is an old project
with a lot of legacy code, many drivers and a large code base.

To do:

* Ceilometer: add the trollius dependency and set the trollius event loop
  policy to aiogreen (see the sketch after this list)
* Ceilometer: change Oslo Messaging executor from "eventlet" to "aiogreen"
* Redesign the service class of Oslo Incubator to support aiogreen and/or
  trollius. Currently, the class is designed for eventlet. The service class
  is instantiated before forking, which requires hacks on eventlet to update
  file descriptors.
* In Ceilometer and its OpenStack dependencies: add new functions which are
  written with explicit asynchronous programming in mind (e.g. trollius
  coroutines written with ``yield``).
* Rewrite Ceilometer endpoints (RPC methods) as trollius coroutines.
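
As a rough sketch of how the first two to-do items could be wired together
(not an actual patch; names may differ slightly in the final changes)::

    import aiogreen
    import trollius

    # Route every trollius event loop through eventlet, so that existing
    # greenthread-based code and new coroutines share the same hub.
    trollius.set_event_loop_policy(aiogreen.EventLoopPolicy())

    # The RPC server is then created with the new executor instead of
    # "eventlet", e.g. something like:
    #   server = oslo.messaging.get_rpc_server(transport, target, endpoints,
    #                                          executor='aiogreen')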

Questions:

* What about WSGI? aiohttp is not compatible with trollius yet.
* The quantity of code which needs to be ported to asynchronous programming is
  unknown right now.
* We should be prepared to see deadlocks. OpenStack was designed for eventlet,
  which implicitly switches on blocking operations. Critical sections may not
  be protected with locks, or not with the right kind of lock.
* For performance, blocking operations can be executed in threads. OpenStack
  code is probably not thread-safe, which means new kinds of race conditions.
  But code that runs in a thread is explicitly scheduled there (with
  ``loop.run_in_executor()``), so regressions can be identified easily (see
  the sketch after this list).
* This part will take a lot of time. We may need to split it into subparts
  to have milestones, which is more attractive for developers.
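
For the thread pool point above, the pattern would look something like this
(``query_samples`` is a made-up blocking function, only for illustration)::

    import trollius
    from trollius import From, Return

    def query_samples(db, meter):
        # Plain blocking code: unchanged, but executed in a worker thread.
        return db.query(meter)

    @trollius.coroutine
    def get_samples(db, meter):
        loop = trollius.get_event_loop()
        # None selects the loop's default thread pool executor; the blocking
        # call is clearly marked as "runs in a thread" at the call site.
        samples = yield From(loop.run_in_executor(None, query_samples,
                                                  db, meter))
        raise Return(samples)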


Last part (to do): drop eventlet
--------------------------------

Replace the aiogreen event loop with the trollius event loop, then drop
aiogreen and finally drop eventlet.
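
To visualise the end state: the aiogreen event loop policy line simply goes
away and applications drive a plain trollius (or, on Python 3, asyncio) event
loop directly, for example::

    import trollius
    from trollius import From

    @trollius.coroutine
    def main():
        # Stand-in for the application's real top-level coroutine.
        yield From(trollius.sleep(0))

    loop = trollius.get_event_loop()  # no eventlet involved anymore
    try:
        loop.run_until_complete(main())
    finally:
        loop.close()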

This change will be done on applications one by one. There is no need t