> On Nov 23, 2014, at 6:13 PM, Robert Collins <[email protected]> wrote:
> 
> 
> So - the technical bits of the plan sound fine.

> 
> On WSGI - if we're in an asyncio world,

*looks around* We are? When did that happen? I assume we’re talking about 
explicit async: rewriting all our code as verbose, “inside out” code, vast 
library incompatibility, and... some notion of “correctness” that is somehow 
supposed to be appropriate for a high level scripting language and can’t be 
achieved through simple, automated means such as gevent.

> I don't think WSGI has any
> relevance today -

If you want async + WSGI, use gevent.wsgi. It is of course not explicit 
async, but if the whole world decides that we all have to explicitly turn all 
of our code inside out to appease the concept of “oh no, IO IS ABOUT TO 
HAPPEN! ARE WE READY!”, I am definitely quitting programming to become a 
cheese maker. If you’re writing some high performance TCP server thing, fine 
(...but why are you writing a high performance server in Python and not 
something more appropriate like Go?). If we’re dealing with message queues, as 
I know this thread is about, fine.

But if you’re writing “receive a request, load some data, change some of it 
around, store it again, and return a result”, I don’t see why this has to be 
intentionally complicated. Use implicit async that can interact with the 
explicit async messaging stuff appropriately. That’s purportedly one of the 
goals of asyncio (which Nick Coghlan had to lobby pretty hard for; source: 
http://python-notes.curiousefficiency.org/en/latest/pep_ideas/async_programming.html#gevent-and-pep-3156).
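
A minimal sketch of that contrast, using only the standard library. The 
in-memory DB dict and all function names are hypothetical, time.sleep() and 
asyncio.sleep() merely stand in for real database IO, and the explicit version 
uses the later async/await syntax rather than the "yield from" coroutines 
available when this was written. The first handler is plain sequential code of 
the kind gevent's monkey patching is meant to make cooperative without 
changes; the second marks every IO point by hand.

```python
import asyncio
import time

# Hypothetical in-memory "database"; every name here is illustrative only.
DB = {"widget": {"count": 1}}


def load(key):
    time.sleep(0.01)  # stand-in for a blocking database read
    return dict(DB[key])


def store(key, record):
    time.sleep(0.01)  # stand-in for a blocking database write
    DB[key] = record


def handle_request(key):
    """Implicit style: ordinary sequential code with no IO markers."""
    record = load(key)
    record["count"] += 1
    store(key, record)
    return record["count"]


async def aload(key):
    await asyncio.sleep(0.01)  # stand-in for an async database read
    return dict(DB[key])


async def astore(key, record):
    await asyncio.sleep(0.01)  # stand-in for an async database write
    DB[key] = record


async def ahandle_request(key):
    """Explicit style: the same logic, with each IO point awaited."""
    record = await aload(key)
    record["count"] += 1
    await astore(key, record)
    return record["count"]
```

Calling handle_request("widget") or asyncio.run(ahandle_request("widget")) 
increments the stored count by one either way; the business logic is 
identical, and only the second form requires every caller up the stack to be 
a coroutine as well.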

> it has no async programming model.

Neither do a *lot* of things, including all traditional ORMs. I’m fine with 
Ceilometer dropping SQLAlchemy support, as they prefer MongoDB and their 
relational database code is fairly wanting. Per 
http://aiogreen.readthedocs.org/openstack.html, I’m not sure how else they will 
drop eventlet support throughout the entire app.


> While it has
> incremental apis and supports generators, that's not close enough to
> the same thing: so we're going to have to port our glue code to
> whatever container we end up with. As you know I'm pushing on a revamp
> of WSGI right now, and I'd be delighted to help put together a
> WSGI-for-asyncio PEP, but I think it's best thought of as a separate
> thing to WSGI per se.

Given the push for explicit async, it seems like a lot of effort will need to 
be spent on this.

> It might be a profile of WSGI2 though, since
> there is quite some interest in truly async models.
> 
> However I've a bigger picture concern. OpenStack only relatively
> recently switched away from an explicit async model (Twisted) to
> eventlet.

Hooray. Efficient database access for explicit async code would be impossible 
otherwise, as there are no explicit async APIs to MySQL, and only one for 
PostgreSQL, which is extremely difficult to support.

> 
> I'm worried that this is switching back to something we switched away
> from (in that Twisted and asyncio have much more in common than either
> Twisted and eventlet w/magic, or asyncio and eventlet w/magic).

In the C programming world, when you want to do something as simple as create a 
list of records, it’s not so simple: you have to explicitly allocate memory 
using malloc(), and organize your program skillfully and carefully such that 
this memory is ultimately freed using free(). It’s tedious and error prone. 
So in the scripting language world, these tedious, low level and entirely 
predictable steps are automated away for us; memory is allocated automatically, 
and freed automatically. Even reference cycles are cleaned out for us without 
us even being aware. This is why we use “scripting languages”: they are 
intentionally automated to speed the pace of development and produce code that 
is far less verbose than low-level C code and much less prone to low-level 
errors, albeit considerably less efficient. It’s the tradeoff we make; 
predictable bookkeeping of the system’s resources is automated away. 
There’s a price: the Python interpreter uses a ton of memory pretty quickly, 
and tends not to free it once large chunks have been used by the application. 
However, this tradeoff, Python’s clearly inefficient use of memory because 
it’s automating the management of it away for us, is one which nobody seems to 
mind at all.
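
To make the "reference cycles are cleaned out for us" point concrete, here is 
a small sketch assuming CPython and its standard gc and weakref modules. The 
Node class and make_cycle() helper are hypothetical, and the collector is 
disabled and invoked by hand only so that the moment of cleanup is visible; in 
normal use it runs on its own.

```python
import gc
import weakref


class Node:
    """Hypothetical record type used only for this illustration."""

    def __init__(self, name):
        self.name = name
        self.peer = None


def make_cycle():
    a, b = Node("a"), Node("b")
    a.peer, b.peer = b, a   # a and b now reference each other
    return weakref.ref(a)   # a weak reference does not keep `a` alive


gc.disable()                # only to make the collection point deterministic
probe = make_cycle()

# Reference counting alone cannot free the pair: each keeps the other alive.
alive_before = probe() is not None

gc.collect()                # the cycle detector finds and frees both objects
alive_after = probe() is not None
gc.enable()
```

Nothing in make_cycle() says when the two objects should be released, yet the 
interpreter works it out; the program never calls anything like free().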

But when it comes to IO, the implicit allocation of IO and deferment of 
execution done by gevent has no side effect anywhere near as harmful as the 
Python interpreter’s huge memory consumption. Yet we are so afraid of it, so 
frightened that our code, written in a *high level scripting language*, might 
not be “correct”. We might not know that IO is about to happen! How is this 
different from the much more tangible and day-to-day issue of: we might not 
know this data structure is taking up a crapload of memory, and taking a ton of 
time to allocate and free?
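
As a rough illustration of that hidden cost, the sketch below compares a plain 
list of integers with the same values packed into an array of raw 64-bit 
integers, roughly what a C program would allocate. It assumes CPython, where 
sys.getsizeof() reports per-object sizes; the variable names are illustrative 
and the exact numbers vary by platform and version.

```python
import sys
from array import array

values = list(range(10_000, 11_000))

# Shallow size of the list plus the size of every int object it points to.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# The same numbers stored as packed 64-bit integers.
packed_bytes = sys.getsizeof(array("q", values))

ratio = list_bytes / packed_bytes
```

On a typical 64-bit CPython build the ratio is expected to come out at several 
times the packed size, and nothing in the code that built the list hints at 
it.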

Given that, I’ve yet to understand why a system that implicitly defers CPU use 
when a routine encounters IO, deferring to other routines, is relegated to the 
realm of “magic”. Is Python reference counting and garbage collection 
“magic”? How can I be sure that my program is allocating only as much memory 
as I expect, and freeing it only when I absolutely say so, the way async 
advocates seem to want to be sure about IO? Why would a high level scripting 
language enforce this level of low-level bookkeeping of IO calls as explicit, 
when it is 100% predictable and automatable?

I often wonder if the appeal of explicit async IO is partially driven by 
misunderstandings of how computers work. Here’s a reddit commenter just today, 
who thinks that because Flask doesn’t use explicit async, it therefore “cannot 
use multiple cores” (clearly incorrect) and therefore “database access will be 
4-8 times slower”: 
http://www.reddit.com/r/Python/comments/2n4tes/does_sqlalchemy_scale_well_with_increased_web/cmalj3q
Every time I look for arguments in favor of explicit async, this is what I 
find. I’ve yet to find an actual argument, other than “gevent is terrible 
magic!!”, for what is so terrible about the “incorrectness” of using a tool 
like gevent to manage IO/task deferment automatically, or for why other forms 
of “magic!” like automatic garbage collection are so taken for granted, when 
their real-world effects are actually much worse.


> If Twisted was unacceptable to the community, what makes asyncio
> acceptable?

It’s new and modern, and is pushed in a PEP that Guido is very interested in. 
The rumor mill also grumbles that the sudden popularity of node.js was a factor 
in this change of direction. As long as it integrates with code that is 
fundamentally reliant upon implicit IO (e.g. gevent / eventlet / other tie-in), 
I am fine with it.

> [Note, I don't really understand why Twisted was moved
> away from, since our problem domain is such a great fit for reactor
> style programming - lots of networking, lots of calling of processes
> that may take some time to complete their work, and occasional DB
> calls [which are equally problematic in eventlet and in
> asyncio/Twisted]. So I'm not arguing against the move, I'm just
> concerned that doing it without addressing whatever the underlying
> thing was, will fail - and I'm also concerned that it will surprise
> folk - since there doesn't seem to be a cross project blueprint
> talking about this fairly fundamental shift in programming model.
> 
> -Rob
> 
> -- 
> Robert Collins <[email protected]>
> Distinguished Technologist
> HP Converged Cloud
> 
> _______________________________________________
> OpenStack-dev mailing list
> [email protected]
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

