Re: [Web-SIG] OT: dotted names (Was: Re: A Python Web Application Package and Format)

2011-04-15 Thread P.J. Eby

At 04:11 PM 4/15/2011 -0400, Fred Drake wrote:

These end users don't really care if the object identified is a class or
function in module, a nested attribute on a class, or anything else, so
long as it does what it's advertised to do.  By not pushing implementation
details into the identifier, the package maintainer is free to change the
implementation in more ways, without creating backward incompatibility.


That would be one advantage of using entry points 
instead.  ;-)  (i.e., the user doesn't specify the object location, 
the package author does.)


Note, however, that one must perform considerably more work to 
resolve a name, when you don't know whether each part of the name is 
a module or an attribute.


Either you have to get an AttributeError first, and then fall back to 
importing, or get an ImportError first, and fall back to getattr.
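
A minimal sketch of the getattr-first variant of that guessing game, in present-day Python. The name `resolve_dotted` is hypothetical and this is not taken from distutils2 or any other tool; it only illustrates the fall-back pattern described above.

```python
import importlib


def resolve_dotted(name):
    """Resolve 'a.b.c' without knowing which parts are modules."""
    parts = name.split(".")
    obj = importlib.import_module(parts[0])
    path = parts[0]
    for part in parts[1:]:
        path = path + "." + part
        try:
            # First guess: the part is an attribute of what we have so far.
            obj = getattr(obj, part)
        except AttributeError:
            # Second guess: it is a submodule that has not been imported yet.
            # This raises ImportError if the guess is wrong as well.
            importlib.import_module(path)
            obj = getattr(obj, part)
    return obj


print(resolve_dotted("json.decoder.JSONDecoder"))
```

Every part after the first may cost one failed lookup plus one import attempt, which is the extra work being objected to.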


If the syntax is explicit, OTOH, then you don't have to guess, 
thereby saving lots of work and wasteful exceptions.


___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] OT: dotted names (Was: Re: A Python Web Application Package and Format)

2011-04-15 Thread P.J. Eby

At 02:02 PM 4/15/2011 -0400, Jim Fulton wrote:

On Fri, Apr 15, 2011 at 1:32 PM, Éric Araujo  wrote:
> As an aside, I wonder why people use dot+colon notation instead of just
> dots to reference callables.  In distutils2 for example we resolve
> dotted names to find command classes, command hooks and compilers.  So
> what's the benefit, marginally easier parsing?

An opportunity of using a colon is that it allows::

   dotted.module.name:expression

where expression may be more than just a name::


  foo.bar:Bar()



The reason setuptools uses ':' is that it allows you to unambiguously 
reference object attributes, e.g.:


   some.module:SomeClass.some_method_or_attribute

(It doesn't allow expressions, just dotted "paths".)
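
For comparison, a minimal sketch of resolving the explicit `module:attr.path` form. The function name `resolve_entry` is hypothetical, and this is not setuptools' own code; it only shows that with the colon there is exactly one import and then plain attribute lookups, with nothing to guess.

```python
import importlib


def resolve_entry(spec):
    """Resolve 'pkg.module:Obj.attr'; the part before ':' is always a module."""
    module_name, sep, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    if sep and attr_path:
        for attr in attr_path.split("."):
            obj = getattr(obj, attr)
    return obj


print(resolve_entry("json.decoder:JSONDecoder.decode"))
```

Python 3.9 and later also ship `pkgutil.resolve_name`, which accepts the colon form.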



Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-20 Thread P.J. Eby

At 09:45 AM 1/20/2011 -0500, James Y Knight wrote:
But I agree, a clarification could be added to the statement '''all 
objects referred to in this specification as "strings" must be of 
type str or StringType''' and '''For values referred to in this 
specification as "bytestrings" [...] the value must be of type bytes 
under Python 3, and str in earlier versions of Python'''. It's not 
100% obvious that it really does mean "type(obj) is str/bytes".


Feel free to write said clarification, check it in, and add glowing 
praise for your efforts to the acknowledgements section.  I will 
indeed appreciate it, so you won't even be lying.   ;-)


Or, you can just send me a patch, and receive slightly less 
praise.  Either way works for me, though.  ;-)




Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-12 Thread P.J. Eby

At 02:52 PM 1/12/2011 -0800, Guido van Rossum wrote:

On Wed, Jan 12, 2011 at 2:34 PM, Alice Bevan-McGregor wrote:
> On 2011-01-10 13:12:57 -0800, Guido van Rossum said:
>>
>> Ok, now that we've had a week of back and forth about this, let me repeat
>> my "threat". Unless more concerns are brought up in the next 24 hours, can
>> PEP  be accepted? It seems a lot of people are waiting for a decision
>> that enables implementers to go ahead and claim PEP 333[3] compatibility.
>> PEP 444 can take longer.
>
> With the lack of responses, can I assume this has been or will be shortly
> marked as "accepted"?

Yep. Phillip, can you do the honors?


Apparently not -- I went to check it in and found Raymond had already 
marked it "Final".  ;-)


(I'm not clear on whether there's a difference between "Final" and 
"Accepted" here, but I assume that if we find some sort of 
actual error we can still fix it.)





Re: [Web-SIG] PEP 444 feature request - Futures executor

2011-01-11 Thread P.J. Eby

At 09:11 PM 1/10/2011 -0600, Timothy Farrell wrote:

PJ,

You seem to be old-hat at this so I'm looking for a little advice as 
I draft this spec.  It seems a bad idea to me to just say 
environ['wsgi.executor'] will be a wrapped futures executor because 
the api handled by the executor can and likely will change over 
time.  Am I right in thinking that a spec should be more specific in 
saying that the executor object will have "these specific methods", 
so that as futures change, the spec is not in danger of invalidation 
due to the changes?


I'd actually just suggest something like:

future = environ['wsgiorg.future'](func, *args, **kw)

(You need to use the wsgiorg.* namespace for extension proposals like 
this, btw.)
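
A minimal sketch of how a server might supply such a callable, assuming a thread pool behind it. The helper name `add_future_support` and the pool size are hypothetical, and 'wsgiorg.future' is only the key suggested above, not a ratified extension.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical server-owned pool; a real server would pick its own executor.
_executor = ThreadPoolExecutor(max_workers=4)


def add_future_support(environ):
    """Expose a single submit-style callable instead of a whole executor."""
    def submit(func, *args, **kw):
        return _executor.submit(func, *args, **kw)
    environ['wsgiorg.future'] = submit
    return environ


environ = add_future_support({})
future = environ['wsgiorg.future'](pow, 2, 10)
print(future.result())
```

Because the application only ever sees one callable, later changes to the executor API do not leak into the spec.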




Re: [Web-SIG] Server-side async API implementation sketches

2011-01-10 Thread P.J. Eby

At 05:06 PM 1/9/2011 -0800, Alice Bevan-McGregor wrote:

On 2011-01-09 09:03:38 -0800, P.J. Eby said:
Hm.  I'm not sure if I like that.  The typical app developer really 
shouldn't be yielding multiple body strings in the first place.


Wait; what?  So you want the app developer to load a 40MB talkcast 
MP3 into memory before sending it?


Statistically speaking, the "typical app" is producing a web page, 
made of HTML and severely limited in size by the short attention span 
of the human user reading it.  ;-)


Obviously, the spec should allow and support streaming.


  You want to completely eliminate the ability to stream an HTML 
page to the client in chunks (e.g.  block, headers + search 
box, search results, advertisements, footer -- the exact thing 
Google does with every search result)?  That sounds like 
artificially restricting application developers, to me.


First, I don't want to eliminate it.   Second, Google is hardly the 
"typical app developer".  If you need the capability, it'll still be there.





In your approach, the above samples have to be rewritten as:
 return app(environ)
[snip]


My code does not use return.  At all.  Only yield.


If you return the calling of a generator, then you pass the original 
generator through to the caller, and it is the equivalent of writing 
a loop in place that iterates over the subgenerator, only without the 
additional complexity of needing to send/throw.



The above middleware pattern works with the sketches I gave on the 
PEAK wiki, and I've now updated the wiki to include an example app 
and middleware for clarity.


I'll need to re-read the code on your wiki; I find it incredibly 
difficult to grok, however, you can help me out a bit by answering a 
few questions about it: How does middleware trap exceptions raised 
by the application.


With try/except around the "yield app(environ)" call (main app run), 
or with try/except around the "yield body_iter" call (body iterator run).



 (Specifically how does the server pass the buck with 
exceptions?  And how does the exception get to the application to 
bubble out towards the server, through middleware, as it does now?)


All that is in the Coroutine class, which is a generator-based "green 
thread" implementation.


Remember how you were saying that your sketch would benefit from PEP 380?

The Coroutine class is a pure-Python implementation of PEP 380, minus 
the syntactic sugar.  It turns "yield" into "yield from" whenever the 
value you yield is itself a geniter.


So, if you pretend that "yield app(environ)" and "yield body_iter" 
are actually "yield from"s instead, then the mechanics should become clearer.


Coroutine runs a generator by sending or throwing into it.  It then 
takes the result (either a value or an exception) and decides where 
to send that.  If it's an object with send/throw methods, it pushes 
it on the stack, and passes None into it to start it running, thereby 
"calling" the subgenerator.  If it's an exception or a return value 
(e.g. StopIteration(value=None)), it pops the stack and propagates 
the exception or return value to the calling generator.


If it's a future or some other object the server cares about, then 
the server can pause the coroutine (by returning 'routine.PAUSE' when 
the coroutine asks it what to do).


Coroutine accepts a trampoline function and a completion callback as 
parameters: the trampoline function inspects a value yielded by a 
generator and then tells the coroutine whether it should PAUSE, CALL, 
RETURN, RESUME, or RAISE in response to that particular 
yield.  RESUME is used for synchronous replies, where the yield 
returns immediately.  RETURN means pop the current generator off the 
stack and return a value to the calling generator.  RAISE raises an 
error immediately in the top-of-stack generator.  CALL pushes a 
geniter on the stack.
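
The control flow just described can be sketched in a few dozen lines. This is not the code from the PEAK wiki: the action names come from the description above, but the `(action, payload)` return convention, the `run()` method, and `simple_trampoline` are assumptions made for illustration.

```python
PAUSE, CALL, RETURN, RESUME, RAISE = "PAUSE", "CALL", "RETURN", "RESUME", "RAISE"


def simple_trampoline(value):
    """Classify a yielded value (hypothetical policy for this sketch)."""
    if hasattr(value, "send") and hasattr(value, "throw"):
        return CALL, value      # a geniter: push it, like "yield from"
    return RETURN, value        # anything else: hand it to the caller


class Coroutine:
    def __init__(self, geniter, trampoline, callback):
        self.stack = [geniter]
        self.trampoline = trampoline
        self.callback = callback

    def run(self, value=None, error=None):
        """Run until the stack empties or the trampoline asks to PAUSE."""
        while self.stack:
            top = self.stack[-1]
            try:
                if error is not None:
                    exc, error = error, None
                    yielded = top.throw(exc)
                else:
                    yielded = top.send(value)
            except StopIteration:
                self.stack.pop()            # finished without a value
                value = None
                continue
            except Exception as exc:
                self.stack.pop()
                if not self.stack:
                    raise                   # nobody left to handle it
                error = exc                 # propagate to the caller
                continue
            action, payload = self.trampoline(yielded)
            if action == CALL:
                self.stack.append(payload)
                value = None
            elif action == RETURN:
                self.stack.pop().close()
                value = payload
            elif action == RESUME:
                value = payload
            elif action == RAISE:
                error = payload
            elif action == PAUSE:
                return                      # resume later via run(value)
        self.callback(value)


def app(environ):
    yield "200 OK", [], [b"hello"]


def middleware(environ):
    status, headers, body = yield app(environ)   # behaves like "yield from"
    yield status, headers + [("X-Seen", "1")], body


results = []
Coroutine(middleware({}), simple_trampoline, results.append).run()
print(results)
```

A server-specific trampoline would add branches for futures (returning PAUSE) ahead of the generic cases.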


IOW, the Coroutine class lets you write servers with just a little 
glue code to tell it how you want the control to flow.  It's actually 
entirely independent of WSGI or any particular WSGI protocol...  I'm 
thinking that I should probably wrap it up into a PyPI package with 
some docs and tests, though I'm not sure when I'd get around to it.


(Heck, it's the sort of thing that probably ought to be in the stdlib 
-- certainly PEP 380 can be implemented in terms of it.)


Anyway, both the sync and async server examples have trampolines that 
detect futures and process them accordingly.  If you yield to a 
future, you get back its result -- either a value or an exception at 
the point where you yielded it.  You don't have to explicitly call 
.result() (in fact, you *can't*); it's already been called before 
control gets back to the place that yielded it.
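
A minimal sketch of that behaviour for the synchronous case, assuming PEP 3148 futures from `concurrent.futures`. The driver name `run_sync` is hypothetical, and it handles a single generator only, with no stack of subgenerators.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def run_sync(geniter):
    """Drive geniter, resolving yielded futures; return the first other value."""
    value, error = None, None
    while True:
        try:
            if error is not None:
                exc, error = error, None
                yielded = geniter.throw(exc)
            else:
                yielded = geniter.send(value)
        except StopIteration:
            return None
        if isinstance(yielded, Future):
            try:
                value = yielded.result()    # the driver calls .result(), not the app
            except Exception as exc:
                error, value = exc, None    # re-raised at the app's yield
        else:
            geniter.close()
            return yielded


def app(executor):
    total = yield executor.submit(sum, [1, 2, 3])      # receives 6 directly
    try:
        yield executor.submit(int, "not a number")     # raises at the yield
    except ValueError:
        yield ("recovered", total)


with ThreadPoolExecutor(max_workers=1) as pool:
    outcome = run_sync(app(pool))
print(outcome)
```

The application never touches the future after yielding it; it sees either the value or the exception at the yield expression.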


IOW, in my sketch, yielding to a future

Re: [Web-SIG] Server-side async API implementation sketches

2011-01-10 Thread P.J. Eby

At 04:39 PM 1/9/2011 -0800, Alice Bevan-McGregor wrote:

On 2011-01-09 09:26:19 -0800, P.J. Eby said:

If wsgi.input offers any synchronous methods...


Regardless of whether or not wsgi.input is implemented in an async 
way, wrap it in a future and eventually get around to yielding 
it.  Problem /solved/.


Not the API problem.  If I'm accustomed to writing synchronous code, 
the async version looks ridiculous.  Also, an existing WSGI web 
framework isn't going to be able to be ported to this API without 
putting it in a future.


My hope was for an API that would be a simple enough translation that 
*everybody* could be persuaded to use it, but having to use futures 
just to write a "normal" application simply isn't going to work for 
the core WSGI API.  As a separate "WSGI-A" profile, sure, it works fine.



If it offers only asynchronous methods, OTOH, then you can't pass 
wsgi.input to any existing libraries (e.g. the cgi module).


Describe to me how a function can be suspended (other than magical 
greenthreads) if it does not yield; if I knew this, maybe I wouldn't 
be so confused.


I'm not sure what you're confused about.  I'm the one who forgot you 
have to read from wsgi.input in a blocking way to write a normal app.  ;-)


(Mainly, because I was so excited about the potential in your 
sketched API, and I got sucked into the process of implementing/improving it.)



I've deviated from your sketch, obviously, and any semblance of 
yielding a 3-tuple.  Stop thinking of my example code as conforming 
to your ideas; it's a new idea, or, worst case, a narrowing of an 
idea into its simplest form.


What I'm trying to point out is that you've missed two important API 
enhancements in my sketch, that make it so that app and middleware 
authors don't have to explicitly manage any generator methods or even 
future methods.



 The mechanics of yielding futures instances allows you to (in your 
server) implement the necessary async code however you wish while 
providing a uniform interface to both sync and async applications 
running on sync and async servers.  In fact, you would be able to 
safely run a sync application on an async server and 
vice-versa.  You can, on an async server:


:: Add a callback to the yielded future to re-schedule the 
application generator.


:: If using greenthreads, just block on future.result() then 
immediately wake up the application generator.


:: Do other things I can't think of because I'm still waking up.


I am not sure why you're reiterating these things.  The sample code I 
posted shows precisely where you'd *do* them in a sync or async 
server.  That's not where the problem lies.



That is not optimum, because now you have an optional API that 
applications who want to be compatible will need to detect and choose between.


It wasn't supposed to be optional, but it's beside the point since 
the presence of a blocking API means the application can block.


The issue might be addressable by having an environment key like, 
'wsgi.canblock' (indicating whether the application is already in a 
separate thread/process), and a piece of middleware that simply 
spawns its child app to a future if wsgi.canblock isn't set.  Then 
people who write blocking applications could use the decorator.
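
One way that decorator might look, as a rough sketch. It assumes a blocking app is a plain function returning a (status, headers, body) tuple and that an executor is handed to the decorator; both are assumptions for illustration, as is the name `blocking`. Only the 'wsgi.canblock' key comes from the text above.

```python
from concurrent.futures import ThreadPoolExecutor


def blocking(executor):
    """Decorator factory for apps written as ordinary blocking functions."""
    def decorate(app):
        def wrapper(environ):
            if environ.get('wsgi.canblock'):
                # Already in a separate thread/process: just call it.
                response = app(environ)
            else:
                # Push the whole app into a future and yield that to the server.
                response = yield executor.submit(app, environ)
            yield response
        return wrapper
    return decorate


pool = ThreadPoolExecutor(max_workers=2)


@blocking(pool)
def slow_app(environ):
    return '200 OK', [], [b'done']


# Playing the server's role by hand for the non-blocking case:
gen = slow_app({'wsgi.canblock': False})
future = next(gen)
offloaded = gen.send(future.result())

# When the server says blocking is fine, the response comes out directly.
direct = next(slow_app({'wsgi.canblock': True}))
pool.shutdown()
print(offloaded == direct)
```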




Mostly, though, it seems to me that the need to be able to write 
blocking code does away with most of the benefit of trying to have 
a single API in the first place.


You have artificially created this need, ignoring the semantics of 
using the server-specific executor to detect async-capable requests 
and the yield mechanics I suggested; which happens to be a single, 
coherent API across sync and async servers and applications.


I haven't ignored them.  I'm simply representing the POV of existing 
WSGI apps and frameworks, which currently block, and are unlikely to 
be rewritten so as not to block.  I thought, briefly, that it was 
possible to make an API with a low-enough conceptual overhead to 
allow that porting to occur, and let my enthusiasm carry me away.


I was wrong, though: even the extremely minimalist version isn't 
going to be usable for ported code, which relegates the async version 
to a niche role.


I would note, though, that this is *still* better than my previous 
position, which was that there was no point making an async API *at all*.  ;-)




Re: [Web-SIG] Server-side async API implementation sketches

2011-01-09 Thread P.J. Eby

At 08:09 PM 1/9/2011 +0200, Alex Grönholm wrote:
Asynchronous applications may not be ready to send the status line 
as the first thing coming out of the generator.


So?  In the sketches that are the subject of this thread, it doesn't 
have to be the first thing.  If the application yields a future 
first, it will be paused...  and so will the middleware.  When this 
line is executed in the middleware:


status, headers, body = yield app(environ)

...the middleware is paused until the application actually yields its 
response tuple.


Specifically, this yield causes the app iterator to be pushed on the 
Coroutine object's .stack attribute, then iterated.  If the 
application yields a future, the server suspends the whole thing 
until it gets called back, at which point it .send()s the result back 
into the app iterator.


The app iterator then yields its response, which is tagged as a 
return value, so the app is popped off the .stack, and the response 
is sent via .send() into the middleware, which then proceeds as if 
nothing happened in the meantime.  It then yields *its* response, and 
whatever body iterator is given gets put into a second coroutine that 
proceeds similarly.


When the process_response() part of the middleware does a "yield 
body_iter", the body iterator is pushed, and the middleware is paused 
until the body iterator yields a chunk.  If the body yields a future, 
the whole process is suspended and resumed.  The middleware won't be 
resumed until the body yields another chunk, at which point it is 
resumed.  If it yields a chunk of its own, then that's passed up to 
any response-processing middleware further up the stack.


In contrast, middleware based on the 2+body protocol cannot process a 
body without embedding coroutine management into the middleware 
itself.   For example, you can't write a standalone body processor 
function, and reuse it inside of two pieces of middleware, without 
doing a bunch of send()/throw() logic to make it work.



Outside of the application/middleware you mean? I hope there isn't 
any more confusion left about what a future is. The fact is that you 
cannot use synchronous API calls directly from an async app no 
matter what. Some workaround is always necessary.


Which pretty much kills the whole idea as being a single, universal 
WSGI protocol, since most people don't care about async.




Re: [Web-SIG] Server-side async API implementation sketches

2011-01-09 Thread P.J. Eby

At 04:25 AM 1/9/2011 -0800, Alice Bevan-McGregor wrote:

On 2011-01-08 13:16:52 -0800, P.J. Eby said:

In the limit case, it appears that any WSGI 1 server could provide 
an (emulated) async WSGI2 implementation, simply by wrapping WSGI2 
apps with a finished version of the decorator in my sketch.
Or, since users could do it themselves, this would mean that WSGI2 
deployment wouldn't be dependent on all server implementers 
immediately turning out their own WSGI2 implementations.


This, if you'll pardon my language, is bloody awesome.  :D  That 
would strongly drive adoption of WSGI2.  Note that adapting a WSGI1 
application to WSGI2 server would likewise be very handy, and I 
suspect, even easier to implement.


I very much doubt that.  You'd need greenlets or a thread with a 
communication channel in order to support WSGI 1 apps that use write() calls.


By the way, I don't really see the point of the new sketches you're 
doing, as they aren't nearly as general as the one I've already done, 
but still have the same fundamental limitation: wsgi.input.


If wsgi.input offers any synchronous methods, then they must be used 
from a future and must somehow raise an error when called from within 
the application -- otherwise it would block, nullifying the point of 
having a generator-based API.


If it offers only asynchronous methods, OTOH, then you can't pass 
wsgi.input to any existing libraries (e.g. the cgi module).


The latter problem is the worse one, because it means that the 
translation of an app between my original WSGI2 API and the current 
sketch is no longer just "replace 'return' with 'yield'".


The only way this would work is if WSGI applications are still 
allowed to be written in a blocking style.  Greenlet-based frameworks 
would have no problem with this, of course, but servers like Twisted 
would still have to run WSGI apps in a worker thread pool, just 
because they *might* block.


If we're okay with this as a limitation, then adding _async method 
variants that return futures might work, and we can proceed from there.


Mostly, though, it seems to me that the need to be able to write 
blocking code does away with most of the benefit of trying to have a 
single API in the first place.  Either everyone ends up putting their 
whole app into a future, or else the server has to accept that the 
app could block... and put it into a future for them.  ;-)


So, the former case will be unacceptable to app developers who don't 
feel a need for async code, and the latter doesn't seem to offer 
anything to the developers of non-blocking servers.


(The exception to these conditions, of course, are greenlet-based 
servers, but they can run WSGI *1* apps in a non-blocking way, and so 
have no need for a new protocol.)




Re: [Web-SIG] Server-side async API implementation sketches

2011-01-09 Thread P.J. Eby

At 06:06 AM 1/9/2011 +0200, Alex Grönholm wrote:
A new feature here is that the application itself yields a (status, 
headers) tuple and then chunks of the body (or futures).


Hm.  I'm not sure if I like that.  The typical app developer really 
shouldn't be yielding multiple body strings in the first place.  I 
much prefer that the canonical example of a WSGI app just return a 
list with a single bytestring -- preferably in a single statement for 
the entire return operation, whether it's a yield or a return.


IOW, I want it to look like the normal way to do things is to just 
return the whole request at once, and use the additional difficulty 
of creating a second iterator to discourage people writing iterated 
bodies when they should just write everything to a BytesIO and be done with it.


Also, it makes middleware simpler: the last line can just yield the 
result of calling the app, or a modified version, i.e.:


yield app(environ)

or:

s, h, b = app(environ)
# ... modify or replace s, h, b
yield s, h, b

In your approach, the above samples have to be rewritten as:

return app(environ)

or:

result = app(environ)
s, h = yield result
# ... modify or replace s, h
yield s, h

for data in result:
    # modify data as we go
    yield data

Only that last bit doesn't actually work, because you have to be able 
to send future results back *into* the result.  Try actually making 
some code that runs on this protocol and yields to futures during the 
body iteration.


Really, this modified protocol can't work with a full async API the 
way my coroutine-based version does, AND the middleware is much more 
complicated.  In my version, your do-nothing middleware looks like this:



class NullMiddleware(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ):
        # ACTION: pre-application environ mangling

        s, h, body = yield self.app(environ)

        # modify or replace s, h, body here

        yield s, h, body


If you want to actually process the body in some way, it looks like:

class NullMiddleware(object):

    def __init__(self, app):
        self.app = app

    def __call__(self, environ):
        # ACTION: pre-application environ mangling

        s, h, body = yield self.app(environ)

        # modify or replace s, h, body here

        yield s, h, self.process(body)

    def process(self, body_iter):
        while True:
            chunk = yield body_iter
            if chunk is None:
                break
            # process/modify chunk here
            yield chunk

And that's still a lot simpler than your sketch.

Personally, I would write both of the above as:

def null_middleware(app):

    def wrapped(environ):
        # ACTION: pre-application environ mangling
        s, h, body = yield app(environ)

        # modify or replace s, h, body here
        yield s, h, process(body)

    def process(body_iter):
        while True:
            chunk = yield body_iter
            if chunk is None:
                break
            # process/modify chunk here
            yield chunk

    return wrapped

But that's just personal taste.  Even as a class, it's much easier to 
write.  The above middleware pattern works with the sketches I gave 
on the PEAK wiki, and I've now updated the wiki to include an example 
app and middleware for clarity.


Really, the only hole in this approach is dealing with applications 
that block.  The elephant in the room here is that while it's easy to 
write these example applications so they don't block, in practice 
people read files and do database queries and whatnot in their 
requests, and those APIs are generally synchronous.  So, unless they 
somehow fold their entire application into a future, it doesn't work.



I liked the idea of having a separate async_read() method in 
wsgi.input, which would set the underlying socket in nonblocking 
mode and return a future. The event loop would watch the socket and 
read data into a buffer and trigger the callback when the given 
amount of data has been read. Conversely, .read() would set the 
socket in blocking mode. What kinds of problems would this cause?


That you could never *call* the .read() method outside of a future, 
or else you would block the server, thereby obliterating the point of 
having the async API in the first place.




Re: [Web-SIG] Server-side async API implementation sketches

2011-01-08 Thread P.J. Eby

At 06:15 PM 1/8/2011 -0800, Alice Bevan-McGregor wrote:

On 2011-01-08 17:22:44 -0800, Alex Grönholm said:

On 2011-01-08 13:16:52 -0800, P.J. Eby said:
I've written the sketches dealing only with PEP 3148 futures, but 
sockets were also proposed, and IMO there should be simple support 
for obtaining data from wsgi.input.
I'm a bit unclear as to how this will work with async. How do you 
propose that an asynchronous application receives the request body?


In my example https://gist.github.com/770743 (which has been 
simplified greatly by P.J. Eby in the "Future- and Generator-Based 
Async Idea" thread) for dealing with wsgi.input, I have:


   future = environ['wsgi.executor'].submit(environ['wsgi.input'].read, 4096)
   yield future

While ugly, if you were doing this, you'd likely:

   submit = environ['wsgi.executor'].submit
   input_ = environ['wsgi.input']

   future = yield submit(input_.read, 4096)
   data = future.


I don't quite understand the above -- in my sketch, the above would be:

data = yield submit(input_.read, 4096)

It looks like your original sketch wants to call .result() on the 
future, whereas in my version, the return value of yielding a future 
is the result (or an error is thrown if the result was an error).


Is there some reason I'm missing, for why you'd want to explicitly 
fetch the result in a separate step?


Meanwhile, thinking about Alex's question, ISTM that if WSGI 2 is 
asynchronous, then the wsgi.input object should probably just have 
read(), readline() etc. methods that simply return (possibly-mock) 
futures.  That's *much* better than having to do all that submit() 
crud just to read data from wsgi.input().
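
A minimal sketch of what "possibly-mock" futures could mean here: a wrapper whose read methods return futures that are already completed. The names `completed` and `FutureInput` are hypothetical, and the underlying stream is assumed to be an ordinary blocking file-like object.

```python
import io
from concurrent.futures import Future


def completed(value):
    """Return a future that already holds its result (a 'mock' future)."""
    future = Future()
    future.set_result(value)
    return future


class FutureInput:
    """Wrap a blocking stream so read()/readline() hand back futures."""

    def __init__(self, stream):
        self._stream = stream

    def read(self, size=-1):
        return completed(self._stream.read(size))

    def readline(self, size=-1):
        return completed(self._stream.readline(size))


wsgi_input = FutureInput(io.BytesIO(b"a=1\nb=2\n"))
print(wsgi_input.readline().result())
```

A truly asynchronous server would return pending futures instead and complete them from its event loop; the application code that yields them would look the same either way.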


OTOH, if you want to use the cgi module to parse a form POST from the 
input, you're going to need to write an async version of it in that 
case, or else feed the entire operation to an executor...  but then 
the methods would need to be synchronous...  *argh*.


I'm starting to not like this idea at all.  Alex has actually 
pinpointed a very weak spot in the scheme, which is that if 
wsgi.input is synchronous, you destroy the asynchrony, but if it's 
asynchronous, you can't use it with any normal code that operates on a stream.


I don't see any immediate fixes for this problem, so I'll let it 
marinate in the back of my mind for a while.  This might be the 
Achilles heel for the whole idea of a low-rent async WSGI.




Re: [Web-SIG] Server-side async API implementation sketches

2011-01-08 Thread P.J. Eby

At 04:40 AM 1/9/2011 +0200, Alex Grönholm wrote:

09.01.2011 04:15, Alice Bevan-McGregor wrote:
I hope that clearly identifies my idea on the subject. Since async 
servers will /already/ be implementing their own executors, I don't 
see this as too crazy.
-1 on this. Those executors are meant for executing code in a thread 
pool. Mandating a magical socket operation filter here would 
considerably complicate server implementation.


Actually, the *reverse* is true.  If you do it the way Alice 
proposes, my sketches don't get any more complex, because the 
filtering goes in the executor facade or submit function.


Truthfully, I don't really see the point of exposing the map() method 
(which is the only other executor method we'd expose), so it probably 
makes more sense to just offer a 'wsgi.submit' key...  which can be a 
function as follows:


  def submit(callable, *args, **kw):
      ob = getattr(callable, '__self__', None)
      if isinstance(ob, ServerProvidedSocket):  # could be an ABC
          future = MockFuture()
          if callable == ob.read:
              pass  # set up read callback to fire future
          elif callable == ob.write:
              pass  # set up write callback to fire future
          return future
      else:
          return real_executor.submit(callable, *args, **kw)

Granted, this might be a rather long function.  However, since it's 
essentially an optimization, a given server can decide how many 
functions can be shortcut in this way.  The spec may wish to offer a 
guarantee or recommendation for specific methods of certain 
stdlib-provided types (sockets in particular) and wsgi.input.


Personally, I do think it might be *better* to offer extended 
operations on wsgi.input that could be used via yield, e.g. "yield 
input.nb_read()".  But of course then the trampoline code has to 
recognize those values instead of futures.  Either way works, but 
somewhere there is going to be some type-testing (explicit or 
implicit) taking place to determine how to suspend and resume the app.


Note, too, that this complexity also only affects servers that want 
to offer a truly async API.  A synchronous server has no reason to 
pay particular attention to what's in a future, since it can't offer 
any performance improvement.


I do think that this sort of API discussion, though, is the most 
dangerous part of trying to do an async spec.  That is, I don't 
expect that everyone will spontaneously agree on the exact same 
API.  Alice's proposal (simply submitting object methods) has the 
advantage of severely limiting the scope of API discussions.  ;-)




[Web-SIG] Server-side async API implementation sketches

2011-01-08 Thread P.J. Eby

As a semi-proof-of-concept, I whipped these up:

  http://peak.telecommunity.com/DevCenter/AsyncWSGISketch

It's an expanded version of my Coroutine concept, updated with sample 
server code for both a synchronous server and an asynchronous 
one.  The synchronous "server" is really just a decorator that wraps 
a WSGI2 async app with futures support, and handles pauses by simply 
waiting for the future to finish.
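
The shape of such a decorator can be sketched as follows. This is not the code on the wiki page: the name `as_wsgi1` and the 'demo.future' environ key are hypothetical, and the sketch only waits on futures yielded before the response tuple, leaving body iterators that yield futures unhandled.

```python
from concurrent.futures import Future


def as_wsgi1(app2):
    """Wrap a generator-style app as a WSGI 1 callable (sketch only)."""
    def wsgi1_app(environ, start_response):
        gen = app2(environ)
        yielded = next(gen)
        while isinstance(yielded, Future):
            try:
                result = yielded.result()      # "pause" = block right here
            except Exception as exc:
                yielded = gen.throw(exc)
            else:
                yielded = gen.send(result)
        gen.close()
        status, headers, body = yielded
        start_response(status, headers)
        return body
    return wsgi1_app


def hello(environ):
    name = yield environ['demo.future']        # hypothetical environ key
    yield '200 OK', [('Content-Type', 'text/plain')], [b'Hello, ' + name]


ready = Future()
ready.set_result(b'World')
seen = []
body = as_wsgi1(hello)({'demo.future': ready},
                       lambda status, headers: seen.append((status, headers)))
print(body)
```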


The asynchronous server is a bit more hand-wavy, in that there are 
some bits (clearly marked) that will be server/framework 
dependent.  However, they should be straightforward for a specialist 
in any given async framework to implement.


What is *most* handwavy at the moment, however, is in the details of 
precisely what one is allowed to "yield to".  I've written the 
sketches dealing only with PEP 3148 futures, but sockets were also 
proposed, and IMO there should be simple support for obtaining data 
from wsgi.input.


However, even this part is pretty easy to extrapolate: both server 
examples just add more type-testing branches in their 
"base_trampoline()" function, copying and modifying the existing 
branches that deal with futures.


The entire result is surprisingly compact -- each server weighed in 
at about 40 lines, and the common Coroutine class used by both adds 
another 60-something lines.


In the limit case, it appears that any WSGI 1 server could provide an 
(emulated) async WSGI2 implementation, simply by wrapping WSGI2 apps 
with a finished version of the decorator in my sketch.


Or, since users could do it themselves, this would mean that WSGI2 
deployment wouldn't be dependent on all server implementers 
immediately turning out their own WSGI2 implementations.
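
As a rough illustration of the decorator idea (a sketch only, and not
the code from the linked page): assume an app is a generator that
yields PEP 3148 futures while it waits and finally yields a
(status, headers, body) 3-tuple, and assume the body is a plain
iterable of byte strings.  The name sync_wsgi1_adapter is made up for
this example, and middleware nesting (which needs the Coroutine
stack) is left out.

```python
from concurrent.futures import Future


def sync_wsgi1_adapter(app2):
    """Wrap a generator-style app as an ordinary WSGI 1 callable (sketch)."""
    def wsgi1_app(environ, start_response):
        gen = app2(environ)
        value, exc = None, None
        while True:
            # Resume the app with the last result, or throw the last error in.
            if exc is not None:
                yielded = gen.throw(exc)
            else:
                yielded = gen.send(value)
            value, exc = None, None
            if isinstance(yielded, Future):
                # Synchronous server: "pause" by simply blocking on the future.
                try:
                    value = yielded.result()
                except Exception as err:
                    exc = err
            else:
                # Assumption: anything that is not a future is the response.
                status, headers, body = yielded
                start_response(status, headers)
                return body
    return wsgi1_app
```

A real version would also have to drive a generator-based body the
same way; here the body is handed to the WSGI 1 server untouched.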


True async API implementations would be more involved, of course -- 
using a WSGI2 decorator on say, Twisted's WSGI1 implementation, would 
give you no performance advantages vs. using Twisted's APIs 
directly.  But, as soon as someone wrote a Twisted-specific 
translation of my async-server sketch, such an app would be portable.


More discussion is still needed, but at this point I'm convinced the 
concept is *technically* feasible.  (Whether there's enough need in 
the "market" to make it worthwhile, is a separate question.)




Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea

2011-01-08 Thread P.J. Eby

At 01:24 PM 1/8/2011 -0500, Paul Davis wrote:

For contrast, I thought it might be beneficial to have a comparison
with an implementation that didn't use async might look like:

http://friendpaste.com/4lFbZsTpPGA9N9niyOt9PF


Compare your version with this one, that uses my revision of Alice's proposal:

def my_awesome_application(environ):
    # do stuff
    yield b'200 OK', [], ["Hello, World!"]

def my_middleware(app):
    def wrapper(environ):
        # maybe edit environ
        try:
            status, headers, body = yield app(environ)
            # maybe edit response:
            # body = latinize(body)
            yield status, headers, body
        except:
            # maybe handle error
            raise  # placeholder: re-raise unless handled
        finally:
            # maybe release resources
            pass
    return wrapper

def my_server(app, httpreq):
    environ = wsgi.make_environ(httpreq)

    def process_response(result):
        status, headers, body = result
        write_headers(httpreq, status, headers)
        Coroutine(body, body_trampoline, finish_response)

    def finish_response(result):
        # cleanup, if any
        pass

    Coroutine(app(environ), app_trampoline, process_response)


The primary differences are that the server needs to split some of 
its processing into separate routines, and response-processing done 
by middleware has to happen in a while loop rather than a for loop.




If your implementation requires that people change source code (yield
vs return) when they move code between sync and async servers, doesn't
that pretty much violate the main WSGI goal of portability?


The idea here would be to have WSGI 2 use this protocol exclusively, 
not to have two different protocols.




IMO, the async middleware is quite more complex than the current state
of things with start_response.


Under the above proposal, it isn't.  In WSGI 1 you can't (only) do a 
for loop over the response body; you have to write a loop and a 
push-based handler as well.  Here, it is reduced to just 
writing one loop.


I'm still not entirely convinced of the viability of the approach, 
but I'm no longer in the "that's just crazy talk" category regarding 
an async WSGI.  The cost is no longer crazy, but there's still some 
cost involved, and the use case rationale hasn't improved much.


OTOH, I can now conceive of actually *using* such an async API for 
something, and that's no small feat.  Before now, the idea held 
virtually zero interest for me.




Either way this proposal reminds me quite a bit of Duff's device [1].
On its own Duff's device is quite amusing and could even be employed
in some situations to great effect. On the other hand, any WSGI spec
has to be understandable and implementable by people from all skill
ranges. If its a spec that only a handful of people comprehend, then I
fear its adoption would be significantly slowed in practice.


Under my modification of Alice's proposal, nearly all of the 
complexity involved migrates to the server, mostly in the (shareable) 
Coroutine implementation.


For an async server, the "arrange for coroutine(result) to be called" 
operations are generally native to async APIs, so I'd expect them to 
find that simple to implement.  Synchronous servers just need to 
invoke the waited-on operation synchronously, then pass the value 
back into the coroutine.  (e.g. by returning "pause" from the 
trampoline, then calling coroutine(value, exc_info) to resume 
processing after the result is obtained.)





Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea

2011-01-08 Thread P.J. Eby

At 05:39 AM 1/8/2011 -0800, Alice Bevan-McGregor wrote:
As a quick note, this proposal would significantly benefit from the 
simplified syntax offered by PEP 380 (Syntax for Delegating to a 
Subgenerator) [1] and possibly PEP 3152 (Cofunctions) [2].  The 
former simplifies delegation and exception passing, and the latter 
simplifies the async side of this.


Unfortunately, AFAIK, both are affected by PEP 3003 (Python Language 
Moratorium) [3], which kinda sucks.


Luckily, neither PEP is necessary, since we do not need to support 
arbitrary protocols for the "subgenerators" being called.  This makes 
it possible to simply "yield" instead of "yield from", and the 
trampoline functions take care of distinguishing a terminal 
("return") result from an intermediate one.


The Coroutine class I suggested, however, *does* accept explicit 
returns via "raise StopIteration(value)", so it is actually fully 
equivalent to supporting "yield from", as long as it's used with an 
appropriate trampoline function.


(In fact, the structure of the Coroutine class I proposed was stolen 
from an earlier Python-Dev post I did in an attempt to show why PEP 
380 was unnecessary for doing coroutines.  ;-) )


In effect, the only thing that PEP 380 would add here is the syntax 
sugar for 'raise StopIteration(value)', but you can do that with:


def return_(value):
    raise StopIteration(value)

In any case, my suggestion doesn't need this for either apps or 
response bodies, since the type of data yielded suffices to indicate 
whether the value is a "return" or not.  You only need a subgenerator 
to raise StopIteration if you want to return something to your caller 
that *isn't* a response or body chunk.
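
For readers on current Python: PEP 380 was accepted for Python 3.3,
so a generator can now say "return value" and the caller finds it on
StopIteration.value.  Since Python 3.7 (PEP 479), a StopIteration
raised explicitly inside a generator body turns into RuntimeError, so
a helper like return_() only works on older interpreters.  A minimal
sketch of how a trampoline-style caller would pick up such a return
value (lookup and run_to_return are hypothetical names):

```python
def lookup(key):
    """Subgenerator that 'returns' a non-response value to its caller."""
    yield "waiting for %s" % key   # stand-in for a future or other waitable
    return key.upper()             # ends up on StopIteration.value


def run_to_return(gen):
    """Drive gen to completion, collecting its yields and its return value."""
    yielded = []
    try:
        while True:
            yielded.append(next(gen))
    except StopIteration as stop:
        return yielded, stop.value
```

Calling run_to_return(lookup("abc")) gives back both the intermediate
yields and the final value, which is the distinction a trampoline has
to make.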




Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea

2011-01-08 Thread P.J. Eby

I made a few errors in that massive post...

At 12:00 PM 1/8/2011 -0500, P.J. Eby wrote:
My major concern about the approach is still that it requires a fair 
amount of overhead on the part of both app developers and middleware 
developers, even if that overhead mostly consists of importing and 
decorating.  (More below.)


The above turned out to be happily wrong by the end of the post, 
since no decorators or imports are actually required for app and 
middleware developers.




You can then implement response-processing middleware like this:

def latinize_body(body_iter):
    while True:
        chunk = yield body_iter
        if chunk is None:
            break
        else:
            yield piglatin(yield body_iter)


The last line above is incorrect; it should've been "yield 
piglatin(chunk)", i.e.:


def latinize_body(body_iter):
    while True:
        chunk = yield body_iter
        if chunk is None:
            break
        else:
            yield piglatin(chunk)

It's still rather unintuitive, though.  There are also plenty of 
topics left to discuss, both of the substantial and bikeshedding varieties.


One big open question still in my mind is, are these middleware 
idioms any easier to get right than the WSGI 1 ones?  For things that 
don't process response bodies, the answer seems to be yes: you just 
stick in a "yield" and you're done.


For things that DO process response bodies, however, you have to have 
ugly loops like the one above.


I suppose it could be argued that, as unintuitive as that 
body-processing loop is, it's still orders of magnitude more 
intuitive than a piece of WSGI 1 middleware that has to handle both 
application yields and write()s!


I suppose my hesitance is due to the fact that it's not as simple as:

    return (piglatin(chunk) for chunk in body_iter)

Which is really the level of simplicity that I was looking 
for.  (IOW, all response-processing middleware pays in this 
slightly-added complexity to support the subset of apps and 
response-processing middleware that need to wait for events during 
body output.)
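
To see how the corrected loop behaves when something drives it, here
is a minimal synchronous stand-in for a body trampoline.  It is a
sketch under assumptions: chunks are str, the upstream body is a
plain iterator (not itself a generator that may pause), and piglatin
and drive_body are toy names invented for the example.

```python
def piglatin(chunk):
    """Toy transformation: move each word's first letter to the end."""
    return " ".join(word[1:] + word[:1] + "ay" for word in chunk.split())


def latinize_body(body_iter):
    while True:
        chunk = yield body_iter       # ask the trampoline for the next chunk
        if chunk is None:
            break
        else:
            yield piglatin(chunk)     # hand an output chunk to the trampoline


def drive_body(gen):
    """Run a body generator to completion and collect its output chunks."""
    output = []
    to_send = None
    while True:
        try:
            yielded = gen.send(to_send)
        except StopIteration:
            return output
        if isinstance(yielded, str):
            output.append(yielded)    # an output chunk
            to_send = None
        else:
            # A request for upstream data; None signals end of input.
            to_send = next(yielded, None)
```

For example, drive_body(latinize_body(iter(["hello world"]))) yields
the single chunk "ellohay orldway".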




Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea

2011-01-08 Thread P.J. Eby

At 03:26 AM 1/8/2011 -0800, Alice Bevan-McGregor wrote:
Warning: this assumes we're running on bizarro-world PEP 444 that 
mandates applications are generators.  Please do not dismiss this 
idea out of hand but give it a good look and maybe some feedback.  ;)


First-glance feedback: I'm impressed.  You may have something going 
here after all.  I just wish you'd sent this sooner.  ;-)


I can easily see why I didn't think of this myself: I hadn't shifted 
my thinking to accommodate two important changes in the Python 
environment since the first WSGI spec, circa 2003-04:


1. Coroutines and decorators are ubiquitous and non-intrusive
2. WSGI has stdlib support, and in any event it is much easier to 
rely on non-stdlib packages


My major concern about the approach is still that it requires a fair 
amount of overhead on the part of both app developers and middleware 
developers, even if that overhead mostly consists of importing and 
decorating.  (More below.)



The second middleware demonstration (using a decorator) makes 
middleware look a lot more like an application: yielding futures, or 
a response, with the addition of yielding an application callable 
not explored in the first (long, but trivial) example.  I believe 
this should cover 99% of middleware use cases, including interactive 
debugging, request routing, etc. and the syntax isn't too bad, if 
you don't mind standardized decorators.


If we assume that the implementation would be in a wsgi2ref for 
Python 3.3 and distributed standalone for 2.x, I think we can make 
something work.  (In the sense of practical to implement, not 
necessarily *desirable*.)


One of my goals is that it should be possible to write "async-naive" 
applications and middleware, so that people who don't care about 
async can ignore it.


On the application side, this is easy: a trivial decorator suffices 
to translate a return into a yield.
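
A minimal sketch of such a decorator (the name async_naive is
invented here, and it assumes the protocol treats a yielded 3-tuple
as the response):

```python
import functools


def async_naive(app):
    """Let a plain 'return (status, headers, body)' app act as a generator app."""
    @functools.wraps(app)
    def wrapper(environ):
        # The single yield is the app's "return" as far as the server cares.
        yield app(environ)
    return wrapper


@async_naive
def hello(environ):
    return "200 OK", [("Content-Type", "text/plain")], [b"Hello, World!"]
```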


For middleware, it's not quite as simple, unless you have a pure 
ingress or egress filter, since you can't simply "call" the 
application.  However, a "context manager"-like pattern applies, 
wherein you could simply yield to calling a wrapped version of the application.


Hm.  This seems to pretty much generalize to a standard 
coroutine/trampoline pattern, where the server provides the 
trampoline, and can provide APIs in the environ to create waitable 
objects that can be yielded upward.


Actually, this is kind of like what I really wanted the futures PEP 
to be about.  And it also preserves composability nicely.


In fact, it doesn't actually need any middleware decorators, if the 
server provides the trampoline.


We would leave your "my_awesome_application" example intact (possibly 
apart from having a friendlier API for reading from wsgi.input), but 
change my_middleware as follows:


   def my_middleware(app):
       def wrapper(environ):
           # pre-response code here
           response = yield app(environ)
           # post-response code here
           yield altered_response
       return wrapper

That's it.  No decorators, no nothing.

The server-level trampoline is then just a function that looks 
something like this:


def app_trampoline(coroutine, yielded):
    if [yielded is a future of some sort]:
        [arrange to invoke 'coroutine(result)' upon completion]
        [arrange to invoke 'coroutine(None, exc_info)' upon error]
        return "pause"
    elif [yielded is a response]:
        return "return"
    elif [yielded has send/throw methods]:
        return "call"  # tell the coroutine to call it
    else:
        raise TypeError
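
One way the bracketed placeholders might be filled in, as a sketch
only: it assumes "a future of some sort" means a
concurrent.futures.Future and that "a response" is any 3-tuple.  A
real server would substitute its own framework's types and
scheduling.

```python
from concurrent.futures import Future


def app_trampoline(coroutine, yielded):
    if isinstance(yielded, Future):
        def resume(future):
            # Runs in whichever thread completes the future, or at once
            # if the future is already done.
            try:
                result = future.result()
            except BaseException as err:
                coroutine(None, (type(err), err, err.__traceback__))
            else:
                coroutine(result)
        yielded.add_done_callback(resume)
        return "pause"
    elif isinstance(yielded, tuple) and len(yielded) == 3:
        return "return"
    elif hasattr(yielded, "send") and hasattr(yielded, "throw"):
        return "call"  # tell the coroutine to call it
    else:
        raise TypeError("cannot yield to %r" % (yielded,))
```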

The trampoline function is used with a coroutine class like this:

import sys

class Coroutine:

    def __init__(self, iterator, trampoline, callback):
        self.stack = [iterator]
        self.trampoline = trampoline
        self.callback = callback
        self()

    def __call__(self, value=None, exc_info=()):
        stack = self.stack
        while stack:
            try:
                it = stack[-1]
                if exc_info:
                    try:
                        rv = it.throw(*exc_info)
                    finally:
                        exc_info = ()
                else:
                    rv = it.send(value)
            except BaseException:
                value = None
                exc_info = sys.exc_info()
                if exc_info[0] is StopIteration:
                    # pass return value up the stack
                    value, = exc_info[1].args or (None,)
                    exc_info = ()   # but not the error
                stack.pop()
            else:
                switch = self.trampoline(self, rv)
                if switch=="pause":
                    return
                elif switch=="call":
                    stack.append(rv)  # Call subgenerator
                    value, exc_info = None, ()
                # (listing truncated here in the archive)
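
Since the listing above breaks off, here is a self-contained Python 3
sketch of how the pieces could fit together.  The "return" branch,
the final callback call, and the re-raise of unhandled errors are
assumptions about the missing part, not the original code;
simple_trampoline, app and middleware are hypothetical names, and
futures are left out to keep it short.

```python
import sys


class Coroutine:
    """Drive a stack of generators, asking a trampoline what each yield means."""

    def __init__(self, iterator, trampoline, callback):
        self.stack = [iterator]
        self.trampoline = trampoline
        self.callback = callback
        self()

    def __call__(self, value=None, exc_info=()):
        stack = self.stack
        while stack:
            it = stack[-1]
            try:
                if exc_info:
                    error, exc_info = exc_info[1], ()
                    rv = it.throw(error)
                else:
                    rv = it.send(value)
            except StopIteration as stop:
                # Explicit "return value" from a generator: pass it up.
                value = stop.value
                stack.pop()
            except BaseException:
                # Any other error propagates to the calling generator.
                value, exc_info = None, sys.exc_info()
                stack.pop()
            else:
                switch = self.trampoline(self, rv)
                if switch == "pause":
                    return            # someone calls self(...) again later
                elif switch == "call":
                    stack.append(rv)  # call subgenerator
                    value = None
                elif switch == "return":
                    stack.pop()       # the yield was this generator's result
                    value = rv
                else:
                    raise ValueError("unknown verdict: %r" % (switch,))
        if exc_info:
            raise exc_info[1]         # nobody handled the error
        self.callback(value)


def simple_trampoline(coroutine, yielded):
    if isinstance(yielded, tuple):
        return "return"
    if hasattr(yielded, "send") and hasattr(yielded, "throw"):
        return "call"
    raise TypeError("cannot yield to %r" % (yielded,))


def app(environ):
    yield "200 OK", [], [b"Hello, World!"]


def middleware(wrapped):
    def wrapper(environ):
        status, headers, body = yield wrapped(environ)
        yield status, headers + [("X-Seen", "1")], body
    return wrapper


results = []
Coroutine(middleware(app)({}), simple_trampoline, results.append)
```

After the last line runs, results holds the one response produced by
the middleware, with the extra header appended.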

Re: [Web-SIG] PEP 444 Goals

2011-01-07 Thread P.J. Eby

At 01:22 PM 1/7/2011 -0800, Alice Bevan-McGregor wrote:

On 2011-01-07 08:28:15 -0800, P.J. Eby said:

At 01:17 AM 1/7/2011 -0800, Alice Bevan-McGregor wrote:

On 2011-01-06 20:18:12 -0800, P.J. Eby said:
:: Reduction of re-implementation / NIH syndrome by incorporating 
the most common (1%) of features most often relegated to middleware 
or functional helpers.
Note that nearly every application-friendly feature you add will 
increase the burden on both server developers and middleware 
developers, which ironically means that application developers 
actually end up with fewer options.

Some things shouldn't have multiple options in the first place.  ;)

I meant that if a server doesn't implement the spec because of a 
required feature, then the app developer doesn't have the option 
of using that feature anyway -- meaning that adding the feature to 
the spec didn't really help.


I truly can not worry about non-conformant applications, middleware, 
or servers and still keep my hair.


I said "if a server doesn't implement the *spec*", meaning, they 
choose not to support PEP 444 *at all*, not that they skip providing 
the feature.



Easy enough to write quick, say, 10-line utility functions that 
are correct middleware -- so that you could actually build 
your application out of WSGI functions calling other WSGI-based functions.

The yielding thing wouldn't work for that at all.


Handling a possible generator isn't that difficult.


That it's difficult at all removes degree-of-difficulty as a 
strong motivation to switch.




So, in order to know what type each CGI variable is, you'll need a reference?


Reference?  Re-read what I wrote.  Only URI-specific values utilize 
an encoding reference variable in the environment; that's four 
values out of the entire environ.  There is one, clearly defined 
bytes value.  The rest are native strings, decoded using 
latin1/iso-8859-1/"str-in-unicode" where native strings are unicode.


IOW, there are six specific facts someone needs to remember in order 
to know the type of a given CGI variable, over and above the mere 
fact that it's a CGI variable.  Hence, "reference".




Re: [Web-SIG] PEP 444 / WSGI 2 Async

2011-01-07 Thread P.J. Eby

At 12:37 PM 1/7/2011 -0800, Alice Bevan-McGregor wrote:
But is there really any problem with providing a unified method for 
indicating a suspend point?


Yes: a complexity burden that is paid by the many to serve the few -- 
or possibly non-existent.


I still haven't seen anything that suggests there is a large enough 
group of people who want a "portable" async API to justify 
inconveniencing everyone else in order to suit their needs, vs. 
simply having a different calling interface for that need.


If I could go back and change only ONE thing about WSGI 1, it would 
be the calling convention.  It was messed up from the start, 
specifically because I wasn't adamant enough about weighing the needs 
of the many against the needs of the few.  Only a few needed a 
push protocol (write()), and only a few even remotely cared about our 
minor nod to asynchrony (yielding empty strings to pause output).


If I'd been smart (or more to the point, prescient), I'd have just 
done a 3-tuple return value from the get-go, and said to hell with 
those other use cases, because everybody else is paying to carry a 
few people who aren't even going to use these features for real.  (As 
it happens, I thought write() would be needed in order to drive 
adoption, and it may well have been at one time.)


Anyway, with a new spec we have the benefit of hindsight: we know 
that, historically, nobody has actually cared enough to propose a 
full-blown async API who wasn't also trying to make their async 
server implementation work without needing threads.  Never in the 
history of the web-sig, AFAIK, has anyone come in and said, "hey, I 
want to have an async app that can run on any async framework."


Nobody blogs or twitters about how terrible it is that the async 
frameworks all have different APIs and that this makes their apps 
non-portable.  We see lots of complaints about not having a Python 3 
WSGI spec, but virtually none about WSGI being essentially synchronous.


I'm not saying there's zero audience for such a thing...  but then, 
at some point there was a non-zero audience for write() and for 
yielding empty strings.  ;-)


The big problem is this: if, as an app developer, you want this 
hypothetical portable async API, you either already have an app that 
is async or you don't.  If you do, then you already got married to 
some particular API and are happy with your choice -- or else you'd 
have bit the bullet and ported.


What you would not do, is come to the Web-SIG and ask for a spec to 
help you port, because you'd then *still have to port* to the new 
API...  unless of course you wanted it to look like the API you're 
already using...  in which case, why are you porting again, exactly?


Oh, you don't have an app...  okay, so *hypothetically*, if you had 
this API -- which, because you're not actually *using* an async API 
right now, you probably don't even know quite what you need -- 
hypothetically if you had this API you would write an app and then 
run it on multiple async frameworks...


See?  It just gets all the way to silly.  The only way you can 
actually get this far in the process seems to be if you are on the 
server side, thinking it would be really cool to make this thing 
because then surely you'll get users.


In practice, I can't imagine how you could write an app with 
substantial async functionality that was sanely portable across the 
major async frameworks, with the possible exception of the two that 
at least share some common code, paradigms, and API.  And even if you 
could, I can't imagine someone wanting to.


So far, you have yet to give a concrete example of an application 
that you personally (or anyone you know of) want to be able to run on 
two different servers.  You've spoken of hypothetical apps and 
hypothetical portability...  but not one concrete, "I want to run 
this under both Twisted and Eventlet" (or some other two 
frameworks/servers), "because of [actual, non-hypothetical rationale here]".


I don't deny that [actual non-hypothetical rationale] may exist 
somewhere, but until somebody shows up with a concrete case, I don't 
see a proposal getting much traction.  (The alternative would be if 
you pull a rabbit out of your hat and propose something that doesn't 
cost anybody anything to implement... but the fact that you're 
tossing the 3-tuple out in favor of yielding indicates you've got no 
such proposal ready at the present time.)


On the plus side, the "run this in a future after the request" 
concept has some legs, and I hope Timothy (or anybody) takes it and 
runs with it.  That has plenty of concrete use cases for portability 
-- every sufficiently-powerful web framework will want to either 
provide that feature, build other features on top of it, or both.


It's the "make the request itself async" part that's the hard sell 
here, and in need of some truly spectacular rationale in order to 
justify the ubiquitous costs it imposes.



Re: [Web-SIG] PEP 444 feature request - Futures executor

2011-01-07 Thread P.J. Eby

At 11:47 AM 1/7/2011 -0600, Timothy Farrell wrote:
There has been much discussion about how to handle async in PEP 444 
and that discussion centers around the use of futures.  However, I'm 
requesting that servers _optionally_ provide 
environ['wsgi.executor'] as a futures executor that applications can 
use for the purpose of doing something after the response is fully 
sent to the client.  This feature request is designed to be 
concurrency methodology agnostic.


Some example use cases are:

- send an email that might block on a slow email server (Alice, I 
read what you said about Turbomail, but one product is not the 
solution to all situations)

- initiate a database vacuum
- clean a cache
- build a cache
- compile statistics

When serving pages of an application, these are all things that 
could be done after the response has been sent.  Ideally these 
things don't need to be done in a request thread and aren't 
incredibly time-sensitive.  It seems to me that futures would be an 
ideal way of handling this.


Thoughts?


This seems like a potentially good way to do it; I suggest making it 
a wsgi.org extension; see (and update) 
http://www.wsgi.org/wsgi/Specifications with your proposal.


I would suggest including a simple sample executor wrapper that 
servers could use to block all but the methods allowed by your 
proposal.  (i.e., presumably not shutdown(), for example.)
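
Something along these lines, as a sketch: it assumes submit() is the
only method the proposal would allow, and SubmitOnlyExecutor is a
name invented here.

```python
from concurrent.futures import ThreadPoolExecutor


class SubmitOnlyExecutor:
    """Expose only submit() of a wrapped executor (hypothetical wrapper)."""

    __slots__ = ("_submit",)

    def __init__(self, executor):
        # Keep just the bound method; shutdown() and map() are not exposed.
        self._submit = executor.submit

    def submit(self, fn, *args, **kwargs):
        return self._submit(fn, *args, **kwargs)


# A server would do something like this once per process:
pool = ThreadPoolExecutor(max_workers=2)
environ = {"wsgi.executor": SubmitOnlyExecutor(pool)}
```

This is a guard against accidents rather than a security boundary:
the underlying pool is still reachable through the bound method's
__self__ attribute.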


There are some other issues that might need to be addressed, like 
maybe adding an attribute or two for the level of reliability 
guaranteed by the executor, or allowing the app to request a given 
reliability level.  Specifically, it might be important to distinguish between:


* this will be run exactly once as long as the server doesn't crash
* this will eventually be run once, even if the server suffers a 
fatal error between now and then


IOW, to indicate whether the thing being done is "transactional", so to speak.

I mean, I can imagine building a transactional service on top of the 
basic service, by queuing task information externally, then just 
using executor calls to pump the queue.  But IMO it seems pretty 
intrinsic to want that kind of persistence guarantee for at least the 
email case, or, say, sending off a charge to a credit card or 
something like that.


One other relevant use case: sometimes you want a long-running 
process step that the user checks back in on periodically, so having 
a way to get a "handle" for a future that can be kept in a session or 
something might be important.  Like, say, you're preparing a report 
that will be viewed in the browser, and using meta-refresh or some 
such to poll.  The app needs to check on a previously queued future 
and get its results.
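
A minimal in-memory sketch of such a handle scheme, assuming a single
server process and no persistence across restarts (FutureRegistry and
its method names are invented for illustration):

```python
import uuid
from concurrent.futures import Future


class FutureRegistry:
    """Map opaque string handles to futures so a later request can poll them."""

    def __init__(self):
        self._futures = {}

    def register(self, future):
        handle = uuid.uuid4().hex      # small string, fine to keep in a session
        self._futures[handle] = future
        return handle

    def poll(self, handle):
        """Return (done, result); forget the future once its result is taken."""
        future = self._futures[handle]
        if not future.done():
            return False, None
        del self._futures[handle]
        return True, future.result()   # re-raises if the task failed
```

The request that queues the report would store the handle in the
session; the page reached via meta-refresh calls poll() with it until
the first element comes back true.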


I don't know how easy any of the above are to implement with the 
futures API or your proposal, but they seem like worthwhile things to 
have available, and actually would provide for some rich application 
use cases.  But if they're implementable over the futures API at all, 
it should be possible to implement them as WSGI 1.x middleware or as 
a server extension.


A spec like that definitely needs some thrashing out, but I don't 
think it need derail any PEPs in progress: the API of such an 
extension doesn't affect the basic WSGI protocol at all. 




Re: [Web-SIG] PEP 444 Goals

2011-01-07 Thread P.J. Eby

At 01:17 AM 1/7/2011 -0800, Alice Bevan-McGregor wrote:

On 2011-01-06 20:18:12 -0800, P.J. Eby said:
:: Reduction of re-implementation / NIH syndrome by 
incorporating the most common (1%) of features most often 
relegated to middleware or functional helpers.
Note that nearly every application-friendly feature you add will 
increase the burden on both server developers and middleware 
developers, which ironically means that application developers 
actually end up with fewer options.


Some things shouldn't have multiple options in the first place.  ;)


I meant that if a server doesn't implement the spec because of a 
required feature, then the app developer doesn't have the option of 
using that feature anyway -- meaning that adding the feature to the 
spec didn't really help.



  I definitely consider implementation overhead on server, 
middleware, and application authors to be important.


As an example, if yield syntax is allowable for application objects 
(as it is for response bodies) middleware will need to iterate over 
the application, yielding up-stream anything that isn't a 
3-tuple.  When it encounters a 3-tuple, the middleware can do its 
thing.  If the app yield semantics are required (which may be a good 
idea for consistency and simplicity sake if we head down this path) 
then async-aware middleware can be implemented as a generator 
regardless of the downstream (wrapped) application's implementation. 
That's not too much overhead, IMHO.


The reason I proposed the 3-tuple return in the first place (see 
http://dirtsimple.org/2007/02/wsgi-middleware-considered-harmful.html 
) was that I wanted to make middleware *easy* to write.


Easy enough to write quick, say, 10-line utility functions that are 
correct middleware -- so that you could actually build your 
application out of WSGI functions calling other WSGI-based functions.


The yielding thing wouldn't work for that at all.
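
To make the overhead concrete, this is roughly what the
iterate-and-pass-upstream middleware described above has to look like
even when all it does is append one header.  It is a sketch: add_header
is a made-up name, a 3-tuple is assumed to be the response, and
forwarding of exceptions thrown in by the server is omitted.

```python
def add_header(app, header):
    """Generator middleware that relays non-response yields upstream."""
    def wrapper(environ):
        gen = app(environ)
        to_send = None
        while True:
            try:
                item = gen.send(to_send)
            except StopIteration:
                return
            if isinstance(item, tuple) and len(item) == 3:
                status, headers, body = item
                yield status, headers + [header], body
                return
            # Anything else (e.g. a future) goes upstream; whatever the
            # server sends back is relayed to the wrapped app.
            to_send = yield item
    return wrapper
```

With a 3-tuple return value the same middleware is a call, an
unpacking, and a return.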


Unicode decoding of a small handful of values (CGI values that 
pull from the request URI) is the biggest example. [2, 3]
Does that mean you plan to make the other values bytes, then?  Or 
will they be unicode-y-bytes as well?


Specific CGI values are bytes (one, I believe), specific ones are 
true unicode (URI-related values) and decoded using a configurable 
encoding with a fallback to "bytes in unicode" (iso-8859-1/latin1), 
are kept internally consistent (if any one fails, treat as if they 
all failed), have the encoding used recorded in the environ, and all 
others are native strings ("bytes in unicode" where native strings 
are unicode).


So, in order to know what type each CGI variable is, you'll need a reference?



What happens for additional server-provided variables?


That is the domain of the server to document, though native strings 
would be nice.  (The PEP only covers CGI variables.)


I mean the ones required by the spec, not server-specific extensions.



The PEP  choice was for uniformity.  At one point, I advocated 
simply using surrogateescape coding, but this couldn't be made 
uniform across Python versions and maintain compatibility.


As an open question to anyone: is surrogateescape available in Python 
2.6?  Mandating that as a minimum version for PEP 444 has yielded 
benefits in terms of back-ported features and syntax, like b''.


No, otherwise I'd totally go for the surrogateescape approach.  Heck, 
I'd still go for it if it were possible to write a surrogateescape 
handler for 2.6, and require that a PEP 444 server register one with 
Python's codec system.  I don't know if it's *possible*, though, 
hopefully someone with more knowledge can weigh in on that.
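
For comparison, here is how the two schemes behave on Python 3, where
surrogateescape is built in (PEP 383, Python 3.1 and later).  The
sample path is made up; both approaches round-trip the original
bytes, they just park the undecodable byte in different places.

```python
raw = b"/caf\xe9"   # a request path in Latin-1; not valid UTF-8

# "Bytes in unicode": every byte maps to the code point of the same value.
as_latin1 = raw.decode("iso-8859-1")
assert as_latin1 == "/caf\u00e9"
assert as_latin1.encode("iso-8859-1") == raw

# surrogateescape: undecodable bytes become lone surrogates in the
# range U+DC80..U+DCFF and are turned back into bytes on encode.
escaped = raw.decode("utf-8", "surrogateescape")
assert escaped == "/caf\udce9"
assert escaped.encode("utf-8", "surrogateescape") == raw
```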



:: Cross-compatibility considerations.  The definition and use 
of native strings vs. byte strings is the biggest example of this 
in the rewrite.
I'm not sure what you mean here.  Do you mean "portability of WSGI 
2 code samples across Python versions (esp. 2.x vs. 3.x)?"


It should be possible (and currently is, as demonstrated by 
marrow.server.http) to create a polyglot server, polyglot 
middleware/filters (demonstrated by marrow.wsgi.egress.compression), 
and polyglot applications, though obviously polyglot code demands the 
"lowest common denominator" in terms of feature use.  Application / 
framework authors would likely create Python 3 specific WSGI 
applications to make use of the full Python 3 feature set, with 
cross-compatibility relegated to server and middleware authors.


I'm just asking whether, in your statement of goals and rationale, 
you would expand "cross compatibility" as meaning cross-python 
version portability, or whether you meant something else.




Re: [Web-SIG] PEP 444 / WSGI 2 Async

2011-01-07 Thread P.J. Eby

At 12:39 AM 1/7/2011 -0800, Alice Bevan-McGregor wrote:
Earlier in this post I illustrated a few that directly apply to a 
commercial application I am currently writing.  I'll elaborate:


:: Image scaling would benefit from multi-processing (spreading the 
load across cores). Also, only one scale is immediately required 
before returning the post-upload page: the thumbnail.  The other 
scales can be executed without halting the WSGI application's return.


:: Asset content extraction and indexing would benefit from 
threading, and would also not require pausing the WSGI application.


:: Since most templating engines aren't streaming (see my unanswered 
thread in the general mailing list re: this), pausing the 
application pending a particularly difficult render is a boon to 
single-threaded async servers, though true streaming templating 
(with flush semantics) would be the holy grail.  ;)


In all these cases, ISTM the benefit is the same if you future the 
WSGI apps themselves (which is essentially what most current async 
WSGI servers do, AFAIK).




:: Long-duration calls to non-async-aware libraries such as DB access.
The WSGI application could queue up a number of long DB queries, 
pass the futures instances to the template, and the template could 
then .result() (block) across them or yield them to be suspended and 
resumed when the result is available.


:: True async is useful for WebSockets, which seem a far superior 
solution to JSON/AJAX polling in addition to allowing real web-based 
socket access, of course.


The point as it relates to WSGI, though, is that there are plenty of 
mature async APIs that offer these benefits, and some of them (e.g. 
Eventlet and Gevent) do so while allowing blocking-style code to be 
written.  That is, you just make what looks like a blocking call, but 
the underlying framework silently suspends your code, without tying 
up the thread.


Or, if you can't use a greenlet-based framework, you can use a 
yield-based framework.  Or, if for some reason you really wanted to 
write continuation-passing style code, you could just use the raw Twisted API.


But in all of these cases you would be better off than if you used a 
half-implementation of the same thing using futures under WSGI, 
because all of those frameworks already have mature and sophisticated 
APIs for doing async communications and DB access.  If you try to do 
it with WSGI under the guise of "portability", all this means is that 
you are stuck rolling your own replacements for those existing APIs.


Even if you've already written a bunch of code using raw sockets and 
want to make it asynchronous, Eventlet and Gevent actually let you 
load a compatibility module that makes it all work, by replacing the 
socket API with an exact duplicate that secretly suspends your code 
whenever a socket operation would block.


IOW, if you are writing a truly async application, you'd almost have 
to be crazy to want to try to do it *portably*, vs. picking a 
full-featured async API and server suite to code against.  And if 
you're migrating an existing, previously-synchronous WSGI app to 
being asynchronous, the obvious thing to do would just be to grab a 
copy of Eventlet or Gevent and import the appropriate compatibility 
modules, not rewrite the whole thing to use futures.




Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-07 Thread P.J. Eby

At 05:27 PM 1/7/2011 +1100, Graham Dumpleton wrote:

Another thing though. For output changed to sys.stdout.buffer. For
input should we be using sys.stdin.buffer as well if want bytes?


%&$*()&%!!!  Sorry, still getting used to this whole Python 3 
thing.  (Honestly, I don't even use Python 2.6 for anything real yet.)




Good thing I tried running this. Did we all assume that someone else
was actually running it to check it? :-)


Well, I only recently started changing the examples to actual Python 
3, vs being the old Python 2 examples.  Though, I'm not sure anybody 
ever ran the Python 2 ones.  ;-)




Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-07 Thread P.J. Eby

At 05:13 PM 1/7/2011 +1100, Graham Dumpleton wrote:

The version at:

http://svn.python.org/projects/peps/trunk/pep-.txt

still shows:

    elif not headers_sent:
        # Before the first output, send the stored headers
        status, response_headers = headers_sent[:] = headers_set
        sys.stdout.write('Status: %s\r\n' % status)
        for header in response_headers:
            sys.stdout.write('%s: %s\r\n' % header)
        sys.stdout.write('\r\n')

so not using buffer there and also not converting strings written for
headers to bytes.


Fixed in SVN now.  The main issue now is that we need to fix the 
re-raises and error handling for Python 3, in the text and examples.


I also found some references for with_traceback() and I think I've 
got that sorted now, but if someone can check my work that'd be a good idea.




Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-07 Thread P.J. Eby

At 05:00 PM 1/7/2011 +1100, Graham Dumpleton wrote:

Stupid question first. When running 2to3 on the example CGI code,


Don't do that.  It's supposed to already be Python 3 code.  ;-)

It did, however, reveal a bug where I have not in fact done the 
correct Python 3 thing:



             if headers_sent:
                 # Re-raise original exception if headers sent
-                raise exc_info[0], exc_info[1], exc_info[2]
+                raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
         finally:
             exc_info = None # avoid dangling circular ref


Can somebody weigh in on what the correct translation here is?  The 
only real Python 3 coding I've done to date has been experiments to 
test changes to other aspects of WSGI.  ;-)
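
For reference, here is a minimal sketch of how this kind of re-raise is commonly spelled in Python 3. Since exc_info[1] is already the exception instance, it can be raised directly with the saved traceback attached, instead of building a new exception from it. The reraise() and capture() helpers are hypothetical names used only for this illustration.

```python
import sys

def reraise(exc_info):
    """Re-raise a saved sys.exc_info() triple, keeping its traceback."""
    try:
        # exc_info[1] is the exception instance itself, so there is no
        # need to call exc_info[0] to construct a new one.
        raise exc_info[1].with_traceback(exc_info[2])
    finally:
        exc_info = None  # avoid dangling circular ref

def capture():
    """Produce a real exc_info triple for the demo."""
    try:
        {}["missing"]
    except KeyError:
        return sys.exc_info()

saved = capture()
try:
    reraise(saved)
except KeyError as e:
    caught = e
```

Calling exc_info[0](exc_info[1]) would instead wrap the original exception as the argument of a new one, so the instance that propagates is no longer the one that was originally raised.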





Re: [Web-SIG] PEP 444 / WSGI 2 Async

2011-01-06 Thread P.J. Eby

At 05:47 PM 1/6/2011 -0800, Alice Bevan-McGregor wrote:
Tossing the idea around all day long will then, of course, be 
happening regardless.  Unfortunately for that particular discussion, 
PEP 3148 / Futures seems to have won out in the broader scope.


Do any established async frameworks or servers (e.g. Twisted, 
Eventlet, Gevent, Tornado, etc.) make use of futures?



  Having a ratified and incorporated language PEP (core in 3.2 w/ 
compatibility package for 2.5 or 2.6+ support) reduces the scope of 
async discussion down to: "how do we integrate futures into WSGI 2" 
instead of "how do we define an async API at all".


It would be helpful if you addressed the issue of scope, i.e., what 
features are you proposing to offer to the application developer.


While the idea of using futures presents some intriguing 
possibilities, it seems to me at first glance that all it will do is 
move the point where the work gets done.  That is, instead of simply 
running the app in a worker, the app will be farming out work to 
futures.  But if this is so, then why doesn't the server just farm 
the apps themselves out to workers?
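
To make the question concrete, here is a rough sketch of a server farming whole applications out to workers using the stdlib concurrent.futures module. The application stays an ordinary synchronous WSGI 1 callable, and only the server touches futures. The run_app() helper and the two sample paths are assumptions made for this illustration, and the write() callable that start_response should return is omitted for brevity.

```python
from concurrent.futures import ThreadPoolExecutor

def simple_app(environ, start_response):
    """An ordinary, synchronous WSGI 1 application."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello, ' + environ['PATH_INFO'].encode('iso-8859-1')]

def run_app(app, environ):
    """Call the app and collect its response (hypothetical server helper)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Simplified: a real server would also return a write() callable.
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body

# The server, not the application, decides to use futures.
with ThreadPoolExecutor(max_workers=4) as pool:
    pending = [pool.submit(run_app, simple_app, {'PATH_INFO': path})
               for path in ('/a', '/b')]
    results = [future.result() for future in pending]
```

Nothing in simple_app had to change to run this way, which is the point of the question above.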


I guess what I'm saying is, I haven't heard use cases for this from 
the application developer POV -- why should an app developer care 
about having their app run asynchronously?


So far, I believe you're the second major proponent (i.e. ones with 
concrete proposals and/or implementations to discuss) of an async 
protocol...  and what you have in common with the other proponent is 
that you happen to have written an async server that would benefit 
from having apps operating asynchronously.  ;-)


I find it hard to imagine an app developer wanting to do something 
asynchronously for which they would not want to use one of the 
big-dog asynchronous frameworks.  (Especially if their app involves 
database access, or other communications protocols.)


This doesn't mean I think having a futures API is a bad thing, but 
ISTM that a futures extension to WSGI 1 could be defined right now 
using an x-wsgi-org extension in that case...  and you could then 
find out how many people are actually interested in using it.


Mainly, though, what I see is people using the futures thing to 
shuffle off compute-intensive tasks...  but if they do that, then 
they're basically trying to make the server's life easier...  but 
under the existing spec, any truly async server implementing WSGI is 
going to run the *app* in a "future" of some sort already...


Which means that the net result is that putting in async is like 
saying to the app developer: "hey, you know this thing that you just 
could do in WSGI 1 and the server would take care of it for 
you?  Well, now you can manage that complexity by yourself!  Isn't 
that wonderful?"   ;-)


I could be wrong of course, but I'd like to see what concrete use 
cases people have for async.  We dropped the first discussion of 
async six years ago because someone (I think it might've been James) 
pointed out that, well, it isn't actually that useful.  And every 
subsequent call for use cases since has been answered with, "well, 
the use case is that you want it to be async."


Only, that's a *server* developer's use case, not an app developer's 
use case...  and only for a minority of server developers, at that.




Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-06 Thread P.J. Eby

At 09:51 AM 1/7/2011 +1100, Graham Dumpleton wrote:

Is that the last thing or do I need to go spend some time and write my
own CGI/WSGI bridge for Python 3 based on my own Python 2 one I have
lying around and just do some final validation checks with a parallel
implementation as a sanity check to make sure we got everything? This
might be a good idea anyway.


It would.  In the meantime, though, I've checked in the two-line 
change to add .buffer in.  ;-)




Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-06 Thread P.J. Eby

At 10:33 AM 1/4/2011 -0800, Guido van Rossum wrote:

But the flush() I was referring to is actually *before* either of
these, suggesting

sys.stdout.flush()
sys.stdout.buffer.write(data)
sys.stdout.buffer.flush()

However the first flush() is only necessary if there's a possibility
that previously something had been written to sys.stdout (not to
sys.stdout.buffer).


Yeah, that sort of error checking seems out of scope for a PEP 
example.  If something was written to sys.stdout at that point, your 
CGI was already broken.  ;-)
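
As an illustration of why the order of the flushes matters, here is a small sketch that writes text to a stream first and then bytes to its underlying buffer. The write_body() helper and the in-memory stand-in for sys.stdout are assumptions made so the example can run anywhere; they are not part of the PEP example.

```python
import io
import sys

def write_body(data, stream=None):
    """Write bytes to a text stream's underlying binary buffer."""
    stream = sys.stdout if stream is None else stream
    stream.flush()              # push out any text written earlier
    stream.buffer.write(data)   # bytes go to the binary layer
    stream.buffer.flush()

# Demo against an in-memory stand-in for sys.stdout.
raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(raw, encoding='iso-8859-1', newline='')
fake_stdout.write('Status: 200 OK\r\n\r\n')   # text written first
write_body(b'abc', fake_stdout)
output = raw.getvalue()
```

Without the first flush(), the text could still be sitting in the text layer's buffer when the bytes are written, so the body might land ahead of the headers.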




> For the CGI example in the PEP, I don't want to bother trying to make it
> fully production-usable; that's what we have wsgiref in the stdlib for.  So
> I won't worry about mixing strings and regular output in the example, even
> if perhaps wsgiref should add the StringIO's proposed by Graham.

Not sure what you mean. Surely copying and pasting the examples should
work? What are the details you'd like to leave out?


I meant that a production CGI gateway should handle various 
boundary/error conditions.  Graham was saying that a CGI gateway 
should replace sys.stdout to avoid stray print()s causing problems, 
and I consider that similar to saying, "what if somebody wrote text 
to sys.stdout?" -- i.e., an error handling case that would be a good 
idea in a production gateway, but which would obscure the common case 
that the example is meant to illustrate.




Re: [Web-SIG] PEP 444 Goals

2011-01-06 Thread P.J. Eby

At 12:52 PM 1/6/2011 -0800, Alice Bevan-McGregor wrote:

Ignoring async for the moment, the goals of the PEP 444 rewrite are:

:: Clear separation of "narrative" from "rules to be 
followed".  This allows developers of both servers and applications 
to easily run through a conformance "check list".


:: Isolation of examples and rationale to improve readability of the 
core rulesets.


:: Clarification of often mis-interpreted rules from PEP 333 (and 
those carried over in PEP 3333).


:: Elimination of unintentional non-conformance, esp. re: cgi.FieldStorage.

:: Massive simplification of call flow.  Replacing start_response 
with a returned 3-tuple immensely simplifies the task of middleware 
that needs to capture HTTP status or manipulate (or even examine) 
response headers. [1]


A big +1 to all the above as goals.
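
To illustrate the call-flow goal, here is a hedged sketch of middleware written against a returned 3-tuple. The exact types of the status and header values in the draft are an assumption here (bytes are used), and app() and add_header() are hypothetical names for this example only.

```python
def app(environ):
    """Draft-style application: returns (status, headers, body)."""
    return b'200 OK', [(b'Content-Type', b'text/plain')], [b'hello']

def add_header(app, name, value):
    """Middleware that examines the status and appends a header."""
    def wrapped(environ):
        status, headers, body = app(environ)
        if status.startswith(b'200'):
            # Build a new list rather than mutating the app's own.
            headers = headers + [(name, value)]
        return status, headers, body
    return wrapped

wrapped_app = add_header(app, b'X-Example', b'1')
status, headers, body = wrapped_app({})
```

Under WSGI 1, the same middleware would have to wrap start_response and defer its own call until it knew what the application had passed in.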


:: Reduction of re-implementation / NIH syndrome by incorporating 
the most common (1%) of features most often relegated to middleware 
or functional helpers.


Note that nearly every application-friendly feature you add will 
increase the burden on both server developers and middleware 
developers, which ironically means that application developers 
actually end up with fewer options.



  Unicode decoding of a small handful of values (CGI values that 
pull from the request URI) is the biggest example. [2, 3]


Does that mean you plan to make the other values bytes, then?  Or 
will they be unicode-y-bytes as well?  What happens for additional 
server-provided variables?


The PEP 3333 choice was for uniformity.  At one point, I advocated 
simply using surrogateescape coding, but this couldn't be made 
uniform across Python versions and maintain compatibility.


Unfortunately, even with the move to 2.6+, this problem remains, 
unless server providers are required to register a surrogateescape 
error handler -- which I'm not even sure can be done in Python 2.x.



:: Cross-compatibility considerations.  The definition and use of 
native strings vs. byte strings is the biggest example of this in the rewrite.


I'm not sure what you mean here.  Do you mean "portability of WSGI 2 
code samples across Python versions (esp. 2.x vs. 3.x)?"




Re: [Web-SIG] PEP 444 / WSGI 2 Async

2011-01-06 Thread P.J. Eby

At 01:03 PM 1/6/2011 +, chris.d...@gmail.com wrote:

Does that apply here? It seems you either allow unicode strings or you
don't, not a certain subsection.


That's why PEP 3333 requires bytes instead - only the application 
knows what it's sending, and the server and middleware shouldn't have to guess.




My general rule is unicode inside, UTF-8 at the boundaries.


Which would be easy to enforce if you can only yield bytes, as is the 
case with PEP 3333.


I worry a bit that right now, there may be Python 3.2 servers (other 
than the ones built on wsgiref.handlers) that may not be enforcing 
this rule yet. 
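
For anyone who wants to check their own stack, here is a rough sketch of what enforcing the rule could look like as middleware. The enforce_bytes() wrapper and the deliberately broken bad_app() are hypothetical. A real server would do the check in its own write path, and this simplified generator does not forward close() to the wrapped iterable.

```python
def enforce_bytes(app):
    """Middleware sketch: reject any non-bytes chunk the app yields."""
    def wrapped(environ, start_response):
        for chunk in app(environ, start_response):
            if not isinstance(chunk, bytes):
                raise TypeError(
                    'WSGI body chunks must be bytes, got %s'
                    % type(chunk).__name__)
            yield chunk
    return wrapped

def bad_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return ['not bytes']  # a str on Python 3, which the rule forbids

def noop_start_response(status, headers, exc_info=None):
    return None

try:
    list(enforce_bytes(bad_app)({}, noop_start_response))
    error = None
except TypeError as e:
    error = str(e)
```

The error surfaces at the point where the bad chunk is produced, rather than as garbled output somewhere downstream.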




Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-04 Thread P.J. Eby

At 08:53 PM 1/4/2011 +1100, Graham Dumpleton wrote:

BTW, to what extent are the examples in the PEP meant to be able to
work on both Python 2.X and Python 3.X as is.

Does it need to be clarified where examples will only work on Python
3.X, in particular the CGI gateway.


The intention is that PEP 3333 will have examples specific to Python 
3 in future.




Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-04 Thread P.J. Eby

At 09:51 PM 1/4/2011 +1100, Graham Dumpleton wrote:

Add another point. FWIW, these are coming up because of questions
being asked on python-dev IRC channel about PEP 3333.

The issue as it came down to was that the PEP may not be clear enough
in explaining that where str() is unicode and as such something like
PATH_INFO, although unicode, is actually bytes decoded as ISO-8859-1,
needed to be re encoded/decoded to get it back to Unicode in the
charset required before use.

They were thinking that because it was unicode already they could use
it as is and not need to do anything. Ie., didn't realise that need to
do:

  path_info = environ.get('PATH_INFO', '')
  path_info = path_info.encode('ISO-8859-1').decode('UTF-8')

for example to get it interpreted as UTF-8 first. They were simply
looking at concatenating new URL bits to the ISO-8859-1 variant from
other unicode strings that weren't bytes represented as ISO-8859-1.

In Python 2.X it was obvious that since it wasn't unicode that you had
to decode it, but confusion may arise for Python 3.X if this
requirement is not explicitly spelled out with a code example like
above.

We all may see it as obvious and yes perhaps it could be covered in
separate articles or commentaries by people, but given this person was
new to it, maybe it is deserving of more explanation in the PEP itself
if they were confused.


It would be really awesome if somebody would write separate 
Application Authors' Guide and Middleware Authors' Guides to 
WSGI.  They don't need to know absolutely everything in the PEP, 
unlike server authors.




It could also be that the PEP covers it adequately already. I am too
tired to read through it again right now.


It's pretty prominently stated early on that NO strings in the spec 
are really unicode, they're just bytes packed into unicode objects.


Obviously, no matter how prominently this is stated, some people will 
still make this mistake, but if desired, we could always put some 
additional info near the environ part of the spec for clarification.


(It occurs to me in retrospect that I should probably have updated 
wsgiref in the stdlib to check the bytesy-ness of strings used to 
create Header objects.  Too late for 3.2, though.)




Re: [Web-SIG] CGI in PEP 444

2011-01-04 Thread P.J. Eby

At 12:43 PM 1/4/2011 +, Antoine Pitrou wrote:

Alice Bevan-McGregor writes:
> > [1] http:://bit.ly/e7rtI6

So, while we are at it, could we get rid of the "CGI server example" in
this new WSGI spec? This is 2011, and we should promote modern idioms,
not encourage people to do 1995 Web programming. 10 years ago, CGI was
already frowned upon. (and even the idea that WSGI should provide some
kind of CGI compatibility sounds a bit ridiculous to me)

Regards

Antoine.


I still use CGI for the odd one-off, testing, prototyping, etc., and 
it's by far the easiest thing to deploy on a lot of web hosts.  Hell, 
even Google App Engine *emulates* CGI in its default deployment 
configuration, IIRC.  So it's not exactly obsolete.


Also, the main purpose of the example is to show what a web server 
developer needs to do to hook up their own piping to provide WSGI 
services...  and most web server developers have something like CGI 
code already lying around, or at least know what CGI looks like.







Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-04 Thread P.J. Eby

At 06:30 PM 1/3/2011 -0800, Guido van Rossum wrote:

Would

  sys.stdout.buffer.write(b'abc')

do?

(If you mix this with writing strings to sys.stdout directly, you may
have to call sys.stdout.flush() first.)


The current code is:

sys.stdout.write(data)  # TODO: this needs to be binary on Py3
sys.stdout.flush()

Should I be using sys.stdout.buffer for both, or just the write?

For the CGI example in the PEP, I don't want to bother trying to make 
it fully production-usable; that's what we have wsgiref in the stdlib 
for.  So I won't worry about mixing strings and regular output in the 
example, even if perhaps wsgiref should add the StringIO's proposed by Graham.




Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-03 Thread P.J. Eby

At 08:04 PM 1/3/2011 -0500, Randy Syring wrote:

In the server/gateway example, there is a comment in the code that says:

# TODO: this needs to be binary on Py3

The "TODO" part confuses me.  In other areas of the PEP, there are 
comments like:


# call must be byte-safe on Py3

which make sense.  But is the TODO meant to be a TODO for the PEP or 
is it meant to be a note to the person running the example on 
Py3.  If the latter, maybe "TODO" isn't the best prefix.


FWIW, don't consider this an objection, it is just a question I had 
as I read through the PEP.


Those are my TODO's for the PEP itself, and I've fixed a couple of 
them in SVN (probably around the time you were writing the 
above).  If somebody can point me to the proper Py3 incantation for 
writing bytes to stdout, I'll fix the one remaining TODO marker as well.




Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-03 Thread P.J. Eby

At 04:43 PM 1/3/2011 -0800, Guido van Rossum wrote:

On Mon, Jan 3, 2011 at 3:13 PM, Jacob Kaplan-Moss  wrote:
> On Sun, Jan 2, 2011 at 9:21 AM, Guido van Rossum  wrote:
>> Although [PEP 3333] is still marked as draft, I personally think of it
>> as accepted; [...]
>
> What does it take to get PEP 3333 formally marked as accepted? Is
> there anything I can do to push that process forward?
>
> The lack of a WSGI answer on Py3 is the main thing that's keeping me,
> personally, from feeling excited about the platform. Once that's done
> I can feel comfortable coding to it -- and browbeating those who don't
> support it.
>
> I understand that PEP 444/Web3/WSGI 2/whatever might be a better
> answer, but it's clearly got some way to go. In the meantime, what's
> next to get PEP 3333 officially endorsed and accepted?

I haven't heard anyone speak up against it, ever, since it was
submitted. If no-one speaks up in the next 24 hours consider it
accepted (and after that delay, anyone with SVN privileges can mark it
thus).


There are a few minor changes to the code samples needed to make them 
proper Python 3; I just checked in the hairiest of them, though.




Re: [Web-SIG] PEP 444 != WSGI 2.0

2011-01-02 Thread P.J. Eby

At 02:21 PM 1/2/2011 -0800, Alice Bevan-McGregor wrote:

On 2011-01-02 11:57:19 -0800, P.J. Eby said:
* -1 on the key-specific encoding schemes for the various CGI 
variables' values -- for continuity of coding (not to mention 
simplicity) PEP 3333's approach to environ encodings should be used.
(That is, the environ consists of bytes-in-unicode-form, rather 
than true unicode strings.)


Does ISO-8859-1 not accommodate this for all but a small number of 
the environment variables in PEP 444?


PEP 3333 requires that environment variables contain the bytes of the 
HTTP headers, decoded using ISO-8859-1.  The unicode strings, in 
other words, are restricted to code points in the 0-255 range, and are 
really just a representation of bytes, rather than being a unicode 
decoding of the contents of the bytes.


What I saw in your draft of PEP 444 (which I admittedly may be 
confused about) is language that simply loosely refers to unicode 
environment variables, which could easily be misconstrued as meaning 
that the values could actually contain other code points.


To be precise, in PEP 333, the "true" unicode value of an environment 
variable is:


environ[key].encode('iso-8859-1').decode(appropriate_encoding_for_key)

Whereas, my reading of your current draft implies that this has to 
already be done by the server.


As I understand it, the problem with this is that the server 
developer can't always provide such a decoding correctly, and would 
require that the server guess, in the absence of any information that 
it could use to do the guessing.  An application developer is in a 
better position to deal with this ambiguity than the server 
developer, and adding configuration to the server just makes 
deployment more complicated, and breaks application composability if 
two sub-applications within a larger application need different decodings.


That's the rationale for the PEP 3333 approach -- it essentially 
acknowledges that HTTP is bytes, and we're only using strings for the 
API conveniences they afford.
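
A short sketch of the round trip may help here. The server only wraps the raw bytes as an ISO-8859-1 string, and each application (or sub-application) recovers the bytes and applies whatever decoding it considers correct. The server_side() and app_side() helper names and the sample value are assumptions for illustration.

```python
def server_side(raw_bytes):
    """What a server does: wrap the raw bytes as a latin-1 string."""
    return raw_bytes.decode('iso-8859-1')

def app_side(value, charset):
    """What an application does: recover the bytes, then really decode."""
    return value.encode('iso-8859-1').decode(charset)

raw = 'caf\u00e9'.encode('utf-8')          # bytes as sent on the wire
environ = {'PATH_INFO': server_side(raw)}

# Two sub-applications may legitimately choose different decodings.
as_utf8 = app_side(environ['PATH_INFO'], 'utf-8')
as_latin1 = app_side(environ['PATH_INFO'], 'iso-8859-1')
```

Because the latin-1 wrapping is lossless, neither choice is forced on the other sub-application, and the server never has to guess.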





* Where is the PARAMETERS variable defined in the CGI spec?
What servers actually support it?


It's defined in the HTTP spec by way of 
http://www.ietf.org/rfc/rfc2396.txt URI Syntax.  These values are 
there, they should be available.  (Specifically semi-colon separated 
parameters and hash-separated fragment.)


I mean, what web servers currently provide PARAMETERS as a CGI 
variable?  If it's not a CGI variable, it doesn't go in all caps.


What's more, the spec you reference points out that parameters can be 
placed in *each* path-segment, which means that they would:


1) already be in PATH_INFO, and
2) have multiple values

So, -1 on the notion of PARAMETERS, since AFAICT it is redundant, not 
CGI, and would only hold one parameter segment.
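
To show what "parameters in each path-segment" looks like in practice, here is a small sketch that pulls them straight out of PATH_INFO. The segment_params() helper and the sample path are hypothetical, and percent-decoding is left out to keep the example short.

```python
def segment_params(path_info):
    """Split each path segment into (name, [params]) on ';'."""
    result = []
    for segment in path_info.split('/')[1:]:
        name, *params = segment.split(';')
        result.append((name, params))
    return result

parsed = segment_params('/shop;lang=en/item;color=red;size=m')
```

Every segment can carry its own parameter list, so a single PARAMETERS value could not represent this path, and the information is already recoverable from PATH_INFO.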



* The language about empty vs. missing environment variables 
appears to have disappeared; without it, the spec is ambiguous.


I will re-examine the currently published PEP 444.


I don't know if it's in there or not; I've read your spec more 
thoroughly than that one.  I'm referring to the language from PEP 333 
and its successor, with which I'm much more intimately familiar.



Indeed.  I do try to understand the issues covered in a broader 
scope before writing; for example, I do consider the ability for new 
developers to get up and running without worrying about whether the example 
applications they are trying to use work in their version of Python; 
thus /allowing/ native strings to be used as response values on Python 3.


I don't understand.  If all the examples in your PEP use b'' strings 
(per the 2.6+ requirement), where is the problem?


They can't use WSGI 1(.0.1) code examples at all (as your draft isn't 
backward-compatible), so I don't see any connection there, either.




Byte strings are still preferred, and may be more performant,


Performance was not the primary consideration; the primary considerations were:

* One Obvious Way
* Refuse The Temptation To Guess
* Errors Should Not Pass Silently

The first two would've been fine with unicode; the third was the 
effective tie-breaker.  (Since if you use Unicode, at some point you 
will send garbled data and end up with an error message far away from 
the point where the error occurred.)



I certainly will; I just need to see concrete points against the 
technical merits of the rewritten PEP


Well, I've certainly given you some, but it's hard to comment other 
than abstractly on an async spec you haven't proposed yet.  ;-)


Nonetheless, it's really important to understand that the PEP process 
(especially for Informational-track standards) is not so much about 
technical merits in an absolute sense, as it is about *community consensus*.


And that means it's actually a political and marketing process at 
least as much as it is a technical one.  If you miss

Re: [Web-SIG] PEP 444 != WSGI 2.0

2011-01-02 Thread P.J. Eby

At 12:38 PM 1/2/2011 -0800, Alice Bevan-McGregor wrote:

On 2011-01-02 11:14:00 -0800, Chris McDonough said:
I'd suggest we just embrace it, adding minor tweaks as necessary, 
until we reach some sort of technical impasse it doesn't address.


Async is one area that PEP 3333 does not cover, and not having a 
standard which incorporates async means competing, incompatible 
solutions have been created.


Actually, it's the other way 'round.  I wrote off async for PEP 333 
because the competing, incompatible solutions that already existed 
lacked sufficient ground to build a spec on.


In effect, any direction I took would effectively have either blessed 
one async paradigm as the correct one, or else been a mess that 
nobody would've used anyway.


And this condition largely still exists today: about the only common 
ground between at least *some* async Python frameworks today is the 
use of greenlets...  but if you have greenlets, then you don't need a 
fancy async API in the first place, because you can just "block" 
during I/O from the POV of the app.


The existence of a futures API in the stdlib doesn't alter this much, 
either, because the async frameworks by and large already had their 
own API paradigms for doing such things (e.g. Twisted deferreds and 
thread/process pools, or generator/greenlet-based APIs in other frameworks).


The real bottleneck isn't even that, so much as that if you're going 
to write an async WSGI application, WSGI itself can't define enough 
of an async API to let you do anything useful.  For example, you may 
still need database access that's compatible with the async 
environment you're using...  so you'd only be able to write portable 
async applications if they didn't do ANY I/O outside of HTTP itself!


That being the case, I don't see how a meaningfully cross-platform 
async WSGI can ever really exist, and be attractive both to 
application developers (who want to run on many platforms) and 
framework developers (who want many developers to choose their platform).




On 2011-01-02 12:00:39 -0800, Guido van Rossum said:
Actually that does sound like an opinion on the technical merits. I 
can't tell though, because I'm not familiar enough with PEP 444 to 
know what the critical differences are compared to PEP 3333. Could 
someone summarize?


Async, distinction between byte strings (type returned by 
socket.read), native strings, and unicode strings,


What distinction do you mean?  I see a difference in *how* you're 
distinguishing byte, native, and unicode strings, but not *that* 
they're distinguished from one another.  (i.e., PEP 3333 
distinguishes them too.)



thorough unicode decoding (moving some of the work from middleware 
to the server),


What do you mean by "thorough decoding" and "moving from middleware 
to server"?  Are these references to the redundant environ variables, 
to the use of decoded headers (rather than bytes-in-unicode ones), or 
something else?



The async part is an idea in my head that I really do need to write 
down, clarified with help from agronholm on IRC.  The futures PEP is 
available as a pypi installable module, is core in 3.2, and seems to 
provide a simple enough abstract (duck-typed) interface that it 
should be usable as a basis for async in PEP 444.


I suggest reviewing the Web-SIG history of previous async 
discussions; there's a lot more to having a meaningful API spec than 
having a plausible approach.  It's not that there haven't been past 
proposals, they just couldn't get as far as making it possible to 
write a non-trivial async application that would actually be portable 
among Python-supporting asynchronous web servers.


(Now, if you're proposing that web servers run otherwise-synchronous 
applications using futures, that's a different story, and I'd be 
curious to see what you've come up with.  But that's not the same 
thing as an actually-asynchronous WSGI.)  




Re: [Web-SIG] PEP 444 != WSGI 2.0

2011-01-02 Thread P.J. Eby

At 01:47 AM 1/2/2011 -0800, Alice Bevan-McGregor wrote:
The only things that depress me in the slightest are the lack of 
current discussion on the Web-SIG list (I did post a thread 
announcing my rewrite and asking for input, but there were no takers)


FWIW, my lack of interest has been due to two factors.  First, the 
original version of PEP 444 before you worked on it was questionable 
in my book, due to the addition of new optional features (e.g. 
async), and second, when I saw your "filters" proposal, it struck me 
as more of the same, and put me off from investigating further.  (The 
whole idea of having a WSGI 2 (IMO at least), was to get rid of cruft 
and fix bugs, not to add new features.)


Reading the draft now (I had not done so previously), I would suggest 
making your filters proposal available as a standalone module (or an 
addition to a future version of wsgiref.util), for anybody who wants 
that feature, e.g. via:


  def filter_app(app, ingress=(), egress=()):
      def wrapped(environ):
          for f in ingress: f(environ)
          s, h, b = app(environ)
          for f in egress:
              s, h, b = f(environ, s, h, b)
          return s, h, b
      return wrapped

(Writing this implementation highlights one problem with the notion 
of filters, though, and that's that egress filters can't trap an 
error in the application.)
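
A quick demonstration of that limitation, as a sketch: filter_app() is repeated so the example runs on its own, while failing_app() and log_status() are hypothetical stand-ins for a broken application and an egress filter.

```python
def filter_app(app, ingress=(), egress=()):
    def wrapped(environ):
        for f in ingress:
            f(environ)
        s, h, b = app(environ)
        for f in egress:
            s, h, b = f(environ, s, h, b)
        return s, h, b
    return wrapped

def failing_app(environ):
    raise RuntimeError('boom')

calls = []

def log_status(environ, s, h, b):
    calls.append(s)       # never reached if the app raises
    return s, h, b

filtered = filter_app(failing_app, egress=[log_status])
try:
    filtered({})
    outcome = 'returned'
except RuntimeError:
    outcome = 'raised'
```

The exception propagates past the egress loop before any filter runs, so an error-trapping component still has to be written as ordinary wrapping middleware.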


As far as the rest of the draft is concerned, in order to give 
*thorough* technical feedback, I'd need to first make sure that all 
of the important rules from PEP 3333 are still present, which is hard to 
do without basically printing out both PEPs and doing a line-by-line analysis.


I notice that file_wrapper is missing, for example, but am not clear 
on the rationale for its removal.  Is it your intention that servers 
wishing to support file iteration should check for a fileno() directly?


There are a number of minor errors in the draft, such as saying that 
__iter__ must return a bytes instance (it should return an iterator 
yielding bytes instances) and __iter__ itself has broken markup.
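
To spell out the distinction, here is a minimal sketch of a response body object whose __iter__ returns an iterator that yields bytes instances. The Body class is a hypothetical example, not something taken from the draft.

```python
class Body:
    """Response body whose __iter__ returns an iterator of bytes chunks."""

    def __init__(self, chunks):
        self.chunks = chunks

    def __iter__(self):
        # Return an iterator; each item it yields is a bytes instance.
        return iter(self.chunks)

body = Body([b'Hello, ', b'world'])
joined = b''.join(body)
```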


On other matters:

* I remain a strong -1 on the .script_name and .path_info variables 
(one of which is named incorrectly in the draft), for reasons 
outlined here previously.  (Specifically, that environ redundancy is 
a train wreck for middleware, which must be written to support both 
ways of providing the same variable, or to delete the extended 
version when modifying the environment.)


* -1 on the key-specific encoding schemes for the various CGI 
variables' values -- for continuity of coding (not to mention 
simplicity) PEP 's approach to environ encodings should be 
used.  (That is, the environ consists of bytes-in-unicode-form, 
rather than true unicode strings.)


* +1 to requiring Python 2.6 -- standardizing on b'' notation makes 
good sense and improves forward-portability to Python 3.


* Where is the PARAMETERS variable defined in the CGI spec?  What 
servers actually support it?


* The language about empty vs. missing environment variables appears 
to have disappeared; without it, the spec is ambiguous.
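
The "bytes-in-unicode-form" convention mentioned in the encoding item above can be sketched as follows. The sample path and the choice of UTF-8 as the application's encoding are assumptions made for the illustration:

```python
raw = "/café".encode("utf-8")          # bytes as they might arrive from the client
environ_value = raw.decode("latin-1")  # bytes-in-unicode-form: one code point per byte

# Every code point fits in one byte, so the original bytes are always recoverable.
fits_in_bytes = all(ord(ch) < 256 for ch in environ_value)
recovered = environ_value.encode("latin-1")

# The application, not the server, decides what the bytes really mean.
path = recovered.decode("utf-8")
```

The point of the convention is that the server never has to guess an encoding; it only carries bytes through a str-shaped container.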


Those are the issues I have identified on a first reading, without 
doing a full analysis.  However, in lieu of such an analysis, I would 
take Graham's word on whether or not your spec addresses the HTTP 
compliance/implementation issues found in previous WSGI specs.


If I may offer a recommendation from previous experience steering 
this PEP, I would say that just because other people know (say) HTTP 
better than you, doesn't mean that you can't still make a better spec 
than they can.  You don't have to make *your* proposal into 
*Graham's* spec, or even the spec that Graham would have wanted you 
to make.  But you *do* need to take Graham's *goals* into consideration.


During the original drafting of PEP 333, Ian proposed quite a lot of 
features that I shot down...  but I modified what I was doing so that 
Ian could still achieve his *goals* within the spec, without 
compromising my core vision for the spec (which was that it should be 
as close to trivial for server implementers as possible, and not 
expose any application API that might be perceived by framework 
developers as competition).


So, I urge you to pay attention when Graham says something about what 
the spec is lacking or how it's broken.  You don't have to fix it 
*his* way, but you do need to fix it such that he can still get 
there.  WSGI really does need some fixes for chunking and streaming, 
and Graham absolutely has the expertise in that area, and I would 
100% defer to his judgment on whether some proposal of mine got it 
right in that area.


That doesn't mean I would just adopt whatever he proposed, though, if 
it didn't meet *my* goals.


That's the hard part of making a spec, but it's also the part that 
makes one great.


___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscrib

Re: [Web-SIG] PEP 444 != WSGI 2.0

2011-01-02 Thread P.J. Eby

At 05:04 PM 1/2/2011 +1100, Graham Dumpleton wrote:

That PEP was rejected in the end and was
replaced with PEP 342 which worked quite differently, yet I can't see
that the WSGI specification was revisited in light of how it ended up
being implemented and the implications of that.


Part of my contribution to PEP 342 was ensuring that it was 
sufficiently PEP 325-compatible to ensure that PEP 333 wouldn't 
*need* revisiting.  At least, not with respect to generator close() 
methods anyway.  ;-)




[Web-SIG] Should PEP 3333 be Python 3-only? What about transcoding?

2010-11-03 Thread P.J. Eby
As I've been tidying up wsgiref in the stdlib for PEP , I've been 
noticing that there's a bit of an issue with the PEP as far as CGI variables.


Currently, the CGI example is the same as it is in PEP , which 
means that it's correct code for Python 2.x, but wrong for 3.x due to 
the environment transcoding issue.  (See 
http://bugs.python.org/issue10155 for details.)


There are other code sample differences, too.  In effect, PEP  is 
still using Python 2 code samples, because it's trying to cover every 
version of Python from 2.1 through 3.2.


Should we ditch that, and say, "hey, if you want Python 2.x code 
samples, go see PEP 333?"


That will simplify a couple of things, but still won't address the 
transcoding issue.


Specifically, the problem is that on Python 3, os.environ contains 
*unicode*, not bytes masquerading as unicode.  Unfortunately, this 
means that it very possibly contains garbage for CGI variables, as 
the web server puts bytes in the environment, then Python converts 
those bytes to unicode using the system encoding + surrogateescape.


To get back to bytes, then, we have to encode using the same 
combination, then decode those bytes as latin-1 to get back to a 
WSGI-compatible string.
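
The round trip can be sketched as below: encoding the str with the encoding and error handler Python used to build os.environ recovers the original bytes, and decoding those bytes as latin-1 gives the WSGI form. The helper name to_wsgi_string is made up for this sketch, and falling back to sys.getfilesystemencoding() is an assumption about what "the system encoding" means on a given platform:

```python
import sys


def to_wsgi_string(value, encoding=None):
    """Turn an os.environ-style str into a bytes-in-unicode WSGI string."""
    enc = encoding or sys.getfilesystemencoding()
    raw = value.encode(enc, "surrogateescape")  # undo Python's decoding of the bytes
    return raw.decode("latin-1")                # one code point per original byte


# Simulate what Python 3 does with a non-UTF-8 byte from the web server.
from_server = b"/caf\xe9"
as_seen_in_environ = from_server.decode("utf-8", "surrogateescape")
wsgi_value = to_wsgi_string(as_seen_in_environ, "utf-8")
```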


The hitch is this: not everything in os.environ comes from an HTTP 
request, and therefore may not be decodable in such a fashion.  For 
example, if you decode TMP or HOME or even DOCUMENT_ROOT that way, 
you're going to get rubbish.


In wsgiref for the stdlib, I've used a variation of And Clover's 
patch in issue #10155 to implement something that *only* transcodes 
CGI variables that come from the web client request, but it's 
dreadfully complex.


This isn't really a problem in wsgiref, because as far as I know, 
nobody else has bothered to make another CGI WSGI runner besides the 
one in wsgiref, and the sample in the PEP.


But it is a problem for the PEP, because the complexity involved is 
high -- so high it would completely obscure the essential simplicity 
of the CGI example, if it was written in-line.


There are many possible ways to address this, but my current leaning is to:

1. Change the PEP  code samples to Python 3 only, and 
backreference PEP 333 for Python 2 code samples


2. Make the CGI sample in  do an indiscriminate transcode (which 
only takes a few lines) and add a note to indicate that a robust CGI 
implementation should only do it to CGI variables, suggesting the 
wsgiref.handlers.read_environ() code as an example.
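
A rough sketch of the difference between the two approaches in item 2. The REQUEST_KEYS set and the from_client() test are simplified assumptions for illustration, not the rules wsgiref actually applies, and UTF-8 stands in for the system encoding:

```python
REQUEST_KEYS = {"SCRIPT_NAME", "PATH_INFO", "QUERY_STRING"}  # illustrative, not exhaustive


def from_client(key):
    # Assumed heuristic: request-derived variables and HTTP_* headers only.
    return key in REQUEST_KEYS or key.startswith("HTTP_")


def transcode(value, encoding):
    return value.encode(encoding, "surrogateescape").decode("latin-1")


def indiscriminate(environ, encoding="utf-8"):
    # A few lines, but it also rewrites variables that never came from the request.
    return {k: transcode(v, encoding) for k, v in environ.items()}


def selective(environ, encoding="utf-8"):
    # Leaves non-request variables such as HOME exactly as the OS provided them.
    return {k: transcode(v, encoding) if from_client(k) else v
            for k, v in environ.items()}


env = {"PATH_INFO": "/caf\udce9", "HOME": "/home/josé"}
blunt = indiscriminate(env)
careful = selective(env)
```

Here blunt["HOME"] comes out as mojibake while careful["HOME"] is untouched; both give the same PATH_INFO.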


Any thoughts?




Re: [Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?

2010-10-23 Thread P.J. Eby

At 02:26 PM 10/23/2010 +0300, Armin Ronacher wrote:

Hi,

On 10/22/10 2:35 AM, Graham Dumpleton wrote:

has said:

   """Hopefully not. WSGI could do better and there is a proposal for
that (444)."""
Just to give this some more context: I think WSGI (even in Python 2) 
is faulty and we have the possibility now to fix it.  Nobody in the 
web community is really eager to use Python 3 currently as far as I 
can see, so we have some extra time where we can actually introduce 
some value in to web development on Python 3.  An improved WSGI 
specification could be a key to that.


If PEP  is what we end up with, that is fine with me as well.


I don't think it's an either-or case.  PEP  just means that 
there's a clear path to port WSGI 1 apps.  If somebody wants to 
champion a WSGI 1.1, a 2.0, and whatever else, that's great!


I'm really trying to step *down* from involvement in this; the only 
reason I stepped up to do this now is because of the pending 3.2 
release and the open question(s) over stdlib APIs that have to 
stabilize in this release.




Re: [Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?

2010-10-21 Thread P.J. Eby

At 10:35 AM 10/22/2010 +1100, Graham Dumpleton wrote:

Anyone care to comment on my blog post?


http://blog.dscpl.com.au/2010/10/is-pep--final-solution-for-wsgi-on.html

As far as web framework developers commenting, Armin at:


http://www.reddit.com/r/Python/comments/du7bf/is_pep__the_final_solution_for_wsgi_on_python/

has said:

  """Hopefully not. WSGI could do better and there is a proposal for
that (444)."""

So, it looks like he is very cool on the idea.

No other developers of actual web frameworks have commented at all on
PEP  from what I can see.

Graham


Just out of curiosity, Graham, isn't PEP  basically only a slight 
modification to what you yourself proposed and implemented in 
mod_wsgi for Python 3?


My guess is that there's been no comment because there's really not 
much to say about it.  The most controversial thing about it was 
Python-Dev's objection to modifying PEP 333 in place -- and that's 
the *only* reason why it's a new PEP at all.


(Indeed, I originally just made the discussed amendments to PEP 333, 
and specifically wanted to avoid having a new PEP number, so as not to 
create unnecessary additional discussion or questions.)




Re: [Web-SIG] wsgiref 0.2 dev in svn w/PEP 3333 support

2010-10-09 Thread P.J. Eby

At 09:37 PM 10/9/2010 +0200, And Clover wrote:

On 10/06/2010 07:21 PM, P.J. Eby wrote:


How would these relate to the Python 3.2 release? Can you make 3.x and
2.x versions?


Yes, I have separate fixup code paths for 2.x and 3.x. 3.x faces the 
reverse situation to that previously described, in that os.environ 
is accurate on Windows but needs reverse-decoding on POSIX. 
Currently I use utf-8 and surrogateescape, but for Python 3.2 
presumably os.environb will be the safer bet.


Ok; if you can submit patches against 
svn://svn.eby-sarna.com/svnroot/wsgiref (for 2.x) and 
http://svn.python.org/projects/python/branches/py3k/Lib/wsgiref (for 
3.x), adding an IISCGIHandler and whatever else, I'll review them and apply.


Note, by the way, that just because the environment is unicode on 
3.x, doesn't mean it's WSGI-correct: WSGI requires that unicode 
environment strings be just bytestrings in disguise.  It's actually 
an error if those environment strings contain any character greater than 255!
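
One way to check that rule is to test whether a value survives a latin-1 encode. This is a sketch only; check_wsgi_string is a hypothetical helper, not an existing wsgiref function:

```python
def check_wsgi_string(name, value):
    """Raise if value is not a str whose code points all fit in one byte."""
    if not isinstance(value, str):
        raise TypeError("%s must be a str, not %s" % (name, type(value).__name__))
    try:
        value.encode("latin-1")  # fails for any code point above 255
    except UnicodeEncodeError:
        raise ValueError("%s contains a character greater than 255" % name) from None
    return value


ok = check_wsgi_string("PATH_INFO", "/caf\xe9")
try:
    check_wsgi_string("PATH_INFO", "/\u20ac")
    rejected = False
except ValueError:
    rejected = True
```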




Re: [Web-SIG] wsgiref 0.2 dev in svn w/PEP 3333 support

2010-10-06 Thread P.J. Eby

At 01:28 PM 10/6/2010 +0200, And Clover wrote:

On 10/05/2010 04:23 AM, P.J. Eby wrote:


A preliminary update of the standalone (Python <3.x) version of wsgiref
is now available


Is there any interest in putting fixup code into wsgiref's 
CGIHandler? I appreciate this is really ugly, but the CGI-to-WSGI 
gateway is the most logical place for this, as otherwise the WSGI 
environment created by CGIHandler often doesn't meet the 
requirements of the spec.


Trying to fix these problems at an application, framework or 
middleware level is impractical because they don't know that the 
WSGI environ originally came from CGI. (And they can't re-read the 
environ at that point without breaking environ-altering middleware.)


In particular: for Python 2.x running on Win32, read the environment 
using ctypes where available, allowing non-ASCII characters to be 
read directly instead of irretrievably mangled by the 
ANSI-code-page-encoded os.environ interface. Then encode the 
extracted Unicode environ to byte strings using ISO-8859-1, except 
if the server software is Microsoft/IIS, where the encoding will 
probably be UTF-8.


IIS also needs a fix to remove the duplicated SCRIPT_NAME from the 
front of PATH_INFO. This is a bit more risky as existing 
apps/libraries may already be doing this and might get confused if 
someone's already done the fix. Maybe a subclass like IISCGIHandler?
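
A sketch of the PATH_INFO fixup being described, assuming the duplication always takes the form of SCRIPT_NAME prepended verbatim. The function name is hypothetical:

```python
def strip_duplicated_script_name(environ):
    """Remove a leading copy of SCRIPT_NAME from PATH_INFO, in place."""
    script = environ.get("SCRIPT_NAME", "")
    path = environ.get("PATH_INFO", "")
    if script and path.startswith(script):
        rest = path[len(script):]
        # Only strip on a path-segment boundary, so "/app2/x" is not
        # mistaken for "/app" followed by "2/x".
        if rest == "" or rest.startswith("/"):
            environ["PATH_INFO"] = rest
    return environ


fixed = strip_duplicated_script_name(
    {"SCRIPT_NAME": "/app.py", "PATH_INFO": "/app.py/foo"})
untouched = strip_duplicated_script_name(
    {"SCRIPT_NAME": "/app", "PATH_INFO": "/app2/x"})
```

The risk noted above still applies: a path that legitimately begins with the script name would be shortened too, which is one argument for keeping the fix in an opt-in subclass.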


How would these relate to the Python 3.2 release?  Can you make 3.x 
and 2.x versions?


(I currently consider getting 3.2 out a higher priority, and want 
equity between the standalone 0.2 and the bundled version in 3.2.) 




[Web-SIG] wsgiref 0.2 dev in svn w/PEP 3333 support

2010-10-04 Thread P.J. Eby
A preliminary update of the standalone (Python <3.x) version of 
wsgiref is now available using "easy_install wsgiref==dev".  Relevant 
diffs are here:


  http://svn.eby-sarna.com/?rev=2689&view=rev

This is preliminary work that'll need to be ported to the Python 3 
version of wsgiref (note that the standalone version is *not* 2to3 
friendly as yet), but I wanted to get the basic implementation done 
before porting.




Re: [Web-SIG] PEP 3333 (WSGI 1.0.1) - Now updated with wsgi.org amendments

2010-10-04 Thread P.J. Eby

At 12:04 PM 10/4/2010 -0400, P.J. Eby wrote:
PEP  has now been updated with the amendments and clarifications 
that I announced two weeks ago; see this link for the (nicely 
formatted) differences between PEP 333 and PEP :


  http://svn.python.org/view/peps/trunk/pep-.txt?r1=85014&r2=HEAD


Clarification: the above will only show you the amendments *other* 
than the Python 3 changes.  For the FULL diff between 333 and 333, see:


  http://svn.python.org/view/peps/trunk/pep-.txt?r1=84854&r2=HEAD

Sorry for any confusion.



[Web-SIG] PEP 3333 (WSGI 1.0.1) - Now updated with wsgi.org amendments

2010-10-04 Thread P.J. Eby
PEP  has now been updated with the amendments and clarifications 
that I announced two weeks ago; see this link for the (nicely 
formatted) differences between PEP 333 and PEP :


  http://svn.python.org/view/peps/trunk/pep-.txt?r1=85014&r2=HEAD

Or this link for the full PEP:

  http://www.python.org/dev/peps/pep-/

Now is the time for any error corrections, objections, nitpicking, 
volunteering to help update wsgiref, etc.  ;-)




Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly

2010-09-27 Thread P.J. Eby

At 01:22 PM 9/27/2010 -0400, Terry Reedy wrote:

On 9/26/2010 9:38 PM, P.J. Eby wrote:

At 11:15 AM 9/27/2010 +1000, Ben Finney wrote:



You misunderstand me; I wasn't asking how to *add* a link, but how to
turn OFF the automatic conversion of the phrase "PEP 333" that happens
without any special markup.



Currently, the PEP  preface is littered with unnecessary links,
because the PEP pre-processor turns *every* mere textual mention of a
PEP into a link to it.


Ouch. This is about as annoying as Thunderbird's message editor 
popping up a window asking me what file I want to at.tach 
every time I write the word "at-tach" or a derivative without the 
extra punctuation. It would definitely not be the vehicle for 
writing about at=tachment syndromes.


Suggestion pending something better from rst/PEP experts:
"This PEP extends PEP 333 (abbreviated P333 hereafter)."
perhaps with "to avoid auto-link creation" added before ')' to 
pre-answer pesky questions and to avoid some editor re-expanding the 
abbreviations.


It turns out that using a backslash before the number (e.g. PEP \333) 
turns off the automatic conversion.


The PEP still hasn't shown up on Python.org, though, so I'm 
wondering if maybe I broke something else somewhere.




Re: [Web-SIG] WSGI is now Python 3-friendly

2010-09-26 Thread P.J. Eby

At 11:15 AM 9/27/2010 +1000, Ben Finney wrote:

"P.J. Eby" <pje at telecommunity.com> writes:


> (For that matter, if anybody knows how to make it not turn *every* PEP
> reference into a link, that'd be good too! It doesn't really need to
> turn 5 or 6 occurrences of "PEP 333" in the same paragraph into
> separate links. ;-) )

reST, being designed explicitly for Python documentation, has support
for PEP references built in:



You misunderstand me; I wasn't asking how to *add* a link, but how to 
turn OFF the automatic conversion of the phrase "PEP 333" that 
happens without any special markup.


Currently, the PEP  preface is littered with unnecessary links, 
because the PEP pre-processor turns *every* mere textual mention of a 
PEP into a link to it.




Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly

2010-09-26 Thread P.J. Eby

At 02:59 PM 9/26/2010 -0400, Terry Reedy wrote:
You could mark added material in a way that does not conflict with 
rst or html. Or use .rst to make new text stand out in the .html web 
version (bold, underlined, red, or whatever). People familiar with 
333 can focus on the marked sections. New readers can ignore the marking.


If you (or anybody else) have any idea how to do that (highlight 
stuff in PEP-dialect .rst), let me know.


(For that matter, if anybody knows how to make it not turn *every* 
PEP reference into a link, that'd be good too!  It doesn't really 
need to turn 5 or 6 occurrences of "PEP 333" in the same paragraph 
into separate links.  ;-) )




Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly

2010-09-26 Thread P.J. Eby
Done.  The other amendments were never actually made, so I just 
reverted the Python 3 bit after moving it to the new PEP.  I'll make 
the changes to  instead as soon as I have another time slot free.


At 01:56 PM 9/26/2010 -0700, Guido van Rossum wrote:

Since you have commit privileges, just do it. The PEP editor position
mostly exists to assure non-committers are not prevented from
authoring PEPs.

Please do add a prominent note at the top of PEP 333 pointing to PEP
 for further information on Python 3 compliance or some such
words. Add a similar note at the top of PEP  -- maybe mark up the
differences in PEP  so people can easily tell what was added. And
move PEP 333 to Final status.

--Guido

On Sun, Sep 26, 2010 at 1:50 PM, P.J. Eby  wrote:
> At 01:44 PM 9/26/2010 -0700, Guido van Rossum wrote:
>>
>> On Sun, Sep 26, 2010 at 12:47 PM, Barry Warsaw  wrote:
>> > On Sep 26, 2010, at 1:33 PM, P.J. Eby wrote:
>> >
>> >> At 08:20 AM 9/26/2010 -0700, Guido van Rossum wrote:
>> >>> I'm happy approving Final status for the
>> >>> *original* PEP 333 and I'm happy to approve a new PEP which includes
>> >>> PJE's corrections.
>> >>
>> >> Can we make it PEP , then?  ;-)
>> >
>> > That works for me.
>>
>> Go for it.
>
> Shall I just "svn cp" it, then (to preserve edit history), or wait for the
> PEP editor to do it?
>
>



--
--Guido van Rossum (python.org/~guido)


Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly

2010-09-26 Thread P.J. Eby

At 01:44 PM 9/26/2010 -0700, Guido van Rossum wrote:

On Sun, Sep 26, 2010 at 12:47 PM, Barry Warsaw  wrote:
> On Sep 26, 2010, at 1:33 PM, P.J. Eby wrote:
>
>> At 08:20 AM 9/26/2010 -0700, Guido van Rossum wrote:
>>> I'm happy approving Final status for the
>>> *original* PEP 333 and I'm happy to approve a new PEP which includes
>>> PJE's corrections.
>>
>> Can we make it PEP , then?  ;-)
>
> That works for me.

Go for it.


Shall I just "svn cp" it, then (to preserve edit history), or wait 
for the PEP editor to do it?




Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly

2010-09-26 Thread P.J. Eby

At 08:20 AM 9/26/2010 -0700, Guido van Rossum wrote:

I'm happy approving Final status for the
*original* PEP 333 and I'm happy to approve a new PEP which includes
PJE's corrections.


Can we make it PEP , then?  ;-)

That number would at least communicate that it's the same thing, but 
for Python 3.


Really, my reason for trying to do the (non Py3-specific) amendments 
in a way that didn't require a new PEP number was because of the many 
ancillary questions that it raises for the community, such as:


* Is this some sort of competition/replacement for PEP 444?
* What happened to the old one, why can't we just use that?
* Why isn't there a different protocol version?
* How is this different from the old one?

To be fair, I *also* wanted to avoid all the work associated with 
*answering* them.  ;-)  (Heck, I really wanted to avoid the work of 
having to even *think* about which questions *might* arise and how 
they'd need to be addressed.)


OTOH, I can certainly see that my attempt to avoid this has *already* 
failed: it simply brought up a different set of questions, just on 
Python-Dev instead of Web-SIG or Python-list.


Oh well.  Perhaps making the numbering appear to be a continuation 
will help a bit.


Another option would be to make a PEP that consists solely of the 
amendments and errata themselves, as this would answer most of the 
above questions directly.


Still another would be to abandon the effort to amend the PEP, and 
simply leave things as they are now: AFAICT, the fact that these 
amendments aren't in the PEP hasn't stopped anybody from *treating* 
most of them as if they were.  (Because everyone understands that 
failure to follow them constitutes a bug in your program, even if it 
technically complies with the spec.)





Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly

2010-09-26 Thread P.J. Eby

At 07:15 PM 9/25/2010 -0700, Guido van Rossum wrote:

Don't see this as a new spec. See it as a procedural issue.


As a procedural issue, PEP 333 is an Informational PEP, in "Draft" 
status, which I'd like to make Final after these amendments.  See 
http://www.wsgi.org/wsgi/Amendments_1.0, which Graham created in 2007, stating:


"""This page is intended to collect any ideas related to amendments 
to the original WSGI 1.0 so that it can be marked as 'Final'."""


IOW, there is no intention to treat the PEP as "mutable" going 
forward; this is just cleanup so we can mark it Final.  After that, 
it's an ex-parrot.




Clarifications of ambiguous/unspecified behavior can possibly rule as
non-conforming implementations that used to get the benefit of the
doubt. Best-practice recommendations also have the effect of changing
(perceived) compliance.


I understand the general principle, but with respect to these 
*specific* changes, any perceived-compliance arguments that were 
going to happen, already happened years ago.  The changes are merely 
to officially document the way those arguments already turned out, so 
the PEP can become Final.


Specifically, the changes all fall into one of three categories:

1. Textual clarification (SERVER_PORT is not an int, iteration can 
stop before all output is consumed)


2. Practical issues with wsgi.input arising from the fact that 
real-world programs needed its behavior to be more "file-like" than 
the specification required...  and which essentially forced servers 
that were not using socket.makefile() to make their emulations work 
like that, anyway (or else be rejected by users).


3. Clarification of behavior that would break HTTP compliance (apps 
or servers sending more than Content-Length bytes) and is therefore 
*already a bug* in any implementation that does it.
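
Categories 1 and 3 can be illustrated with a short sketch. limit_body() is a hypothetical helper showing one way a server might avoid sending more than Content-Length bytes; the sample environ values are invented:

```python
def limit_body(chunks, content_length):
    """Yield at most content_length bytes from an iterable of bytes chunks."""
    remaining = content_length
    for chunk in chunks:
        if remaining <= 0:
            break  # stop iterating early; the rest of the output is never consumed
        piece = chunk[:remaining]
        remaining -= len(piece)
        yield piece


environ = {"SERVER_NAME": "example.invalid", "SERVER_PORT": "8080"}  # a str, not an int
port = int(environ["SERVER_PORT"])  # applications convert when they need a number

sent = b"".join(limit_body([b"abc", b"def", b"ghi"], 4))
```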


Since in all three categories any implementation that did not end up 
following the recommendations on its own is going to have been 
considered buggy by its users (regardless of its formal 
"compliance"), and because the changes do not actually declare the 
buggy behaviors in categories 2 and 3 to be non-compliant, I do not 
see how any of these changes can produce the type of problems you're 
worried about here.


Certainly, if I thought such problems were possible, I wouldn't have 
accepted these amendments.  Likewise, if I thought that changes would 
continue to be made to the PEP past this point, the goal wouldn't be 
getting it to Final status.




Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly

2010-09-25 Thread P.J. Eby

At 02:07 PM 9/25/2010 -0700, Guido van Rossum wrote:

This is a very laudable initiative and I approve of the changes -- but
I really think it ought to be a separate PEP rather than pretending it
is just a set of textual corrections on the existing PEP 333.


With the exception of the bytes change, I ruled out accepting any 
proposed amendments that would actually alter the protocol.  The 
amendments are all either textual clarifications, clarifications of 
ambiguous/unspecified areas, or best-practice recommendations by 
implementors.  (i.e., which are generally already implemented in major servers)


The full list of things Graham and others have asked for or 
recommended would indeed require a 1.1 version at minimum, and thus a 
new PEP.  But I really don't want to start down that road right now, 
and therefore hope that I can talk Graham or some other poor soul 
into shepherding a 1.1 PEP instead.  ;-)


(Seriously: through an ironic twist of fate, I have done nearly 
*zero* Python web programming since around the time I drafted the 
first spec in 2004, so even if it makes sense for me to finish PEP 
333, it makes little sense for me to be starting a *new* one on the topic now!)




Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly

2010-09-25 Thread P.J. Eby

At 09:22 PM 9/25/2010 -0400, Jesse Noller wrote:

It seems like it will end up
different enough to be a different specification, closely related to
the original, but different enough to trip up all the people
maintaining current WSGI servers and apps.


The only actual *change* to the spec is mandating the use of the 
'bytes' type or equivalent for HTTP bodies when using Python 3.


Seriously, that's *it*.

Everything else that's (planned to be) added is either 100% truly 
just clarifications (e.g. nothing in the spec *ever* said SERVER_PORT 
could be an int, but apparently some people somehow interpreted it 
so), or else best-practice recommendations from people who actually 
implemented WSGI servers.


For example, the readline() size hint is "not supported" in the 
original spec (meaning clients can't call it and be compliant).  The 
planned modification is "servers should implement it" (best 
practice), but you can't call an implementation that *doesn't* 
implement it noncompliant.  (This just addresses the fact that most 
practical implementations *did* in fact support it, and code out 
there relies on this.)
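
For illustration, this is the kind of call in question. io.BytesIO is only a stand-in for wsgi.input here; a real server supplies its own stream object, and whether it honors the size argument is exactly the "should" being discussed:

```python
import io

# Stand-in for the request body stream a server would provide.
environ = {"wsgi.input": io.BytesIO(b"first line\nsecond line\n")}

stream = environ["wsgi.input"]
head = stream.readline(5)  # size hint: read at most 5 bytes of the current line
rest = stream.readline()   # no hint: read up to and including the newline
```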


So, no (previously-)compliant implementations were harmed in the 
making of the updated spec.  If they were compliant before, they're 
compliant now.


I'm actually a bit surprised people are bringing this up now, since 
when I announced the plan to make these changes, I said that nothing 
would be changed that would break anything...  even for what I 
believe are the only Python 3 WSGI implementations right now (by 
Graham Dumpleton and Robert Brewer).


Indeed, all of the changes (except the bytes thing) are stuff 
previously discussed endlessly on the Web-SIG (years ago in most 
cases) and widely agreed on as, "this should have been made clear in 
the original PEP".


And, I also explicitly deferred and/or rejected items that *can't* be 
done in a 100% backward-compatible way, and would have to be WSGI 1.1 
or higher -- indeed, I have a long list of changes from Graham that 
I've pronounced "can't be done without a 1.1".


Indeed, the entire point of my scope choices was to allow all 
this to happen *without* a whole new spec.  ;-)




[Web-SIG] WSGI is now Python 3-friendly

2010-09-25 Thread P.J. Eby
I have only done the Python 3-specific changes at this point; the 
diff is here if anybody wants to review, nitpick or otherwise comment:


  
  http://svn.python.org/view/peps/trunk/pep-0333.txt?r1=85014&r2=85013&pathrev=85014

For that matter, if anybody wants to take a crack at updating Python 
3's wsgiref based on the above, feel free.  ;-)  I'll be happy to 
answer any questions I can that come up in the process.


(Please note: I went with Ian Bicking's "headers are strings, bodies 
are bytes" proposal, rather than my original "bodies and outputs are 
bytes" one, as there were not only some good arguments in its favor, 
but because it also resulted in fewer changes to the PEP, especially 
in the code samples.)
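
As a sketch of what "headers are strings, bodies are bytes" means for a Python 3 application: the app and the recording start_response below are illustrative stand-ins, not code taken from the PEP:

```python
def app(environ, start_response):
    body = "Hello, world\n".encode("utf-8")  # body: bytes
    start_response(
        "200 OK",                            # status: native str
        [("Content-Type", "text/plain; charset=utf-8"),
         ("Content-Length", str(len(body)))],  # header names and values: native str
    )
    return [body]


captured = {}


def start_response(status, headers, exc_info=None):
    # Minimal stand-in for a server's start_response; just records its arguments.
    captured["status"] = status
    captured["headers"] = headers


result = app({}, start_response)
```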


I will continue to work on adding the other addenda/errata mentioned here:

  http://mail.python.org/pipermail/web-sig/2010-September/004655.html

But because these are "shoulds" rather than musts, and apply to both 
Python 2 and 3, they are not as high priority for immediate 
implementation in wsgiref and do not necessarily need to hold up the 
3.2 release.


(Nonetheless, if anybody is willing to implement them in the Python 3 
version, I will happily review the changes for backport into the 
Python 2 standalone version of wsgiref, and issue an updated release 
to include them.)


Thanks!



Re: [Web-SIG] Last call for WSGI 1.0 errata/clarifications

2010-09-24 Thread P.J. Eby

At 01:22 PM 9/24/2010 +0200, René Dudfield wrote:

Hi,

Have all the changes been tested with real world implementations?


mod_wsgi under Python 3 is compliant with the changes, and I believe 
it has all the general addenda/clarifications implemented under 
Python 2 as well (and for some years now, in fact).




Re: [Web-SIG] Last call for WSGI 1.0 errata/clarifications

2010-09-24 Thread P.J. Eby

At 09:52 AM 9/24/2010 -0600, Jeff Hardy wrote:

On Thu, Sep 23, 2010 at 10:32 AM, P.J. Eby  wrote:
> Just a reminder: I'm planning to actually update PEP 333 over the weekend
> and start working on wsgiref updates, so if you have any last-minute
> comments on the proposal, now's the time to post them, however unpolished
> they may be!

Will you bump the version number to 1.1, or will it stay at 1.0? Does
anyone actually check the version number?


Since these are just clarifications to the existing spec, and no 
previously-compliant implementations are invalidated by the changes, 
there will be no changes to the version number.





- Jeff




Re: [Web-SIG] Last call for WSGI 1.0 errata/clarifications

2010-09-24 Thread P.J. Eby

At 01:45 PM 9/24/2010 +0200, Manlio Perillo wrote:

Il 23/09/2010 18:32, P.J. Eby ha scritto:
> Just a reminder: I'm planning to actually update PEP 333 over the
> weekend and start working on wsgiref updates, so if you have any
> last-minute comments on the proposal, now's the time to post them,
> however unpolished they may be!
>

Where can I find a draft of the update?


See 
http://mail.python.org/pipermail/web-sig/2010-September/004655.html 
for the notes; I have not updated the PEP yet, but am about to.


One change since that post: Ian has convinced me to make headers text 
and bodies bytes, where before I proposed to only have input headers 
be text, and output headers be bytes.




Re: [Web-SIG] Output header encodings? (was Re: Backup plan: WSGI 1 Addenda and wsgiref update for Py3)

2010-09-24 Thread P.J. Eby

At 03:48 PM 9/23/2010 -0500, Ian Bicking wrote:
It only fixes the one case of non-Latin1 characters, there are still 
many other values you can put into a header (a newline or control 
character for instance), and innumerable header-specific 
issues.  It seems to be adding complexity for one of the least 
problematic cases.


Ok, you found one that convinces me.  ;-)  "Headers are text, bodies 
are bytes" shall be the rule.  I'll rewrite the "note about string 
types" and change the way I'm updating the spec accordingly.
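As a rough illustration of the "headers are text, bodies are bytes" rule on Python 3, the sketch below runs a tiny app through the standard wsgiref.validate.validator wrapper, which rejects non-str headers and non-bytes body chunks. The names `app` and `call_app` are made up for this example and are not part of wsgiref.

```python
from wsgiref.util import setup_testing_defaults
from wsgiref.validate import validator


def app(environ, start_response):
    # Body is bytes: the app picks the codec itself.
    body = "h\u00e9llo".encode("utf-8")
    # Status and headers are native str objects.
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(wsgi_app):
    """Call a WSGI app once with a synthetic environ (hypothetical helper)."""
    environ = {"QUERY_STRING": ""}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return lambda data: None  # legacy write() callable, unused here

    result = validator(wsgi_app)(environ, start_response)
    try:
        body = b"".join(result)
    finally:
        result.close()
    return captured["status"], captured["headers"], body


status, headers, body = call_app(app)
```

If the app returned str chunks or passed bytes headers instead, the validator would raise an AssertionError rather than pass the values along.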




Re: [Web-SIG] Output header encodings? (was Re: Backup plan: WSGI 1 Addenda and wsgiref update for Py3)

2010-09-23 Thread P.J. Eby

At 11:17 AM 9/23/2010 -0500, Ian Bicking wrote:
I don't see any reason why Location shouldn't be ASCII.  Any header 
could have any character put in it, of course, there's just no valid 
case where Location shouldn't be a URL, and URLs are ASCII.  Cookie 
can contain weirdness, yes.  I would expect any library that 
abstracts cookies to handle this (it's certainly doable)... 
otherwise, this seems like one among many ways a person can do the wrong thing.


This can also be detected with the validator, which doesn't avoid 
runtime errors, but bytes allow runtime errors too -- they will just 
happen somewhere else (e.g., when a value is converted to bytes in 
an application or library).


Right: somewhere much closer to the *actual* error, where the 
developer can know the problem is, "I have garbage data or have not 
selected an appropriate codec", rather than "this WSGI stuff is 
giving me errors some place".



If servers print the invalid value on error (instead of just some 
generic error) I don't think it would be that hard to track down 
problems.  This requires some explicit effort on the part of the 
server (most servers handle app_iter==None ungracefully, which is a 
similar problem).


The difference is that if a server rejects non-bytes, you'll know 
*right away* that your app isn't compliant, instead of having to wait 
until some non-latin1 data shows up.


AFAICT, there are only two advantages to using text for output headers:

1. Text is easier to work with, and
2. It's symmetric with using text for input headers.

Both of which can still be had, by using the @encode_headers decorator.

I'm a little bit on the fence on this one, because 1) it does seem a 
little pointless (if harmless) to shuffle headers around in bytes 
form, and 2) Location and Set-Cookie are very likely the only headers 
where any kind of damage could ever happen.


But, since it *can* happen, and because it is also really easy to fix 
the API issue with a decorator, I'm still leaning in favor of "output 
is bytes" over "headers are text, bodies are bytes", unless somebody 
can come up with either some actually-bad consequence of using bytes, 
or some extra-good consequence of using text (that isn't addressed by 
just using the decorator).


(Note, by the way, that WSGI design has always leaned in the 
direction of "any convenience that can be handled by a library should 
be", if it keeps the spec simpler and more verifiable.  So, this 
seems like a good use of that principle.)




Re: [Web-SIG] Last call for WSGI 1.0 errata/clarifications

2010-09-23 Thread P.J. Eby

At 02:51 PM 9/23/2010 -0400, Tres Seaver wrote:

P.J. Eby wrote:

> Just a reminder: I'm planning to actually update PEP 333 over the
> weekend and start working on wsgiref updates, so if you have any
> last-minute comments on the proposal, now's the time to post them,
> however unpolished they may be!

I'm fine with the substance of the changes you proposed, but puzzled
about the process:  in what case does it work to update an
already-approved-and-implemented PEP, instead of
replacing it with a newer PEP (e.g., PEPs 241 -> 314 -> 345)?


In the case where one is clarifying ambiguities/questions in the 
original spec.  ;-)


(None of the changes invalidate existing implementations, but simply 
provide additional guidance/best practice suggestions.  Even the 
Python 3 changes won't invalidate at least mod_wsgi's Python 3 implementation.)




Re: [Web-SIG] Output header encodings? (was Re: Backup plan: WSGI 1 Addenda and wsgiref update for Py3)

2010-09-23 Thread P.J. Eby

At 11:11 AM 9/23/2010 -0600, Jeff Hardy wrote:

On Thu, Sep 23, 2010 at 10:06 AM, P.J. Eby  wrote:
> So, unless somebody has some additional arguments on this one, I think I'm
> going to stick with bytes output.

I don't have a strong opinion on whether it should be bytes or strings
-- I'll leave that discussion for people who know more about the
details than I do.

I do think input and output should be symmetric, though. If response
headers are going to be bytes, then the request headers should be as
well, or vice versa. The same arguments apply to both, after all.


Actually, they don't.  There are more apps than servers, so more code 
to get right, by more people.  Servers also don't generally *create* 
any of the bytes or text involved, they're just ferrying it from one 
place to the next.  So the API conditions are not symmetrical.




[Web-SIG] Last call for WSGI 1.0 errata/clarifications

2010-09-23 Thread P.J. Eby
Just a reminder: I'm planning to actually update PEP 333 over the 
weekend and start working on wsgiref updates, so if you have any 
last-minute comments on the proposal, now's the time to post them, 
however unpolished they may be!




[Web-SIG] Output header encodings? (was Re: Backup plan: WSGI 1 Addenda and wsgiref update for Py3)

2010-09-23 Thread P.J. Eby

At 12:57 PM 9/21/2010 -0400, Ian Bicking wrote:
On Tue, Sep 21, 2010 at 12:09 PM, P.J. Eby 
<p...@telecommunity.com> wrote:

The Python 3 specific changes are to use:

* ``bytes`` for I/O streams in both directions
* ``str`` for environ keys and values
* ``bytes`` for arguments to start_response() and write()


This is the only thing that seems odd to me -- it seems like the 
response should be symmetric with the request, and the request in 
this case uses str for headers (status being header-like), and bytes 
for the body.


So, I've given some thought to your suggestion, and, while it's true 
that most of the output headers are far less prone to ending up with 
unintended unicode content, there are at least two output headers 
that can include some sort of application content (and can therefore 
have random failures): Location and Set-Cookie.


If these headers accidentally contain non-Latin1 characters, the 
error isn't detectable until the header reaches the origin server 
doing the transmission encoding, and it'll likely be a dynamic (and 
therefore hard-to-debug) error.


However, if the output is always bytes (and this can be 
relatively-statically verified), then any error can't occur except 
*inside* the application, where the app's developer can find it more easily.
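A small sketch of the failure mode being discussed: a header value that happens to fit in Latin-1 encodes fine, while one that does not only fails at the moment of encoding. The helper name `to_wire` and the sample values are invented for illustration; the point is just where the UnicodeEncodeError surfaces.

```python
def to_wire(value, codec="latin-1"):
    """Encode a header value the way a server would before sending it."""
    return value.encode(codec)


location = "/caf\u00e9"          # e-acute exists in Latin-1
cookie = "price=\u20ac5"         # the euro sign does not

encoded_location = to_wire(location)

try:
    to_wire(cookie)
    cookie_failed = False
except UnicodeEncodeError:
    # With text headers this happens inside the server; with bytes
    # headers the app (or a decorator) hits it before calling the server.
    cookie_failed = True
```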


So I guess the question boils down to: would we rather make sure that 
coding errors happen *inside* applications, or would we rather make 
porting WSGI apps trivial (or nearly so)?


But I think that it's possible here to have one's cake and eat it 
too: if we require bytes for all outputs, but provide a pair of 
decorators in wsgiref.util like the following:


    def encode_body(codec='utf8'):
        """Allow a WSGI app to output its response body as strings
        w/specified encoding"""

        def decorate(app):
            def encode(response):
                try:
                    for data in response:
                        yield data.encode(codec)
                finally:
                    if hasattr(response, 'close'):
                        response.close()

            def decorated_app(environ, start_response):
                def start(status, response_headers, exc_info=None):
                    _write = start_response(status,
                                            response_headers, exc_info)

                    def write(data):
                        return _write(data.encode(codec))
                    return write
                return encode(app(environ, start))
            return decorated_app
        return decorate

    def encode_headers(codec='latin1'):
        """Allow a WSGI app to output its headers as strings,
        w/specified encoding"""

        def decorate(app):
            def decorated_app(environ, start_response):
                def start(status, response_headers, exc_info=None):
                    status = status.encode(codec)
                    response_headers = [
                        (k.encode(codec), v.encode(codec))
                        for k, v in response_headers
                    ]
                    return start_response(status, response_headers, exc_info)
                return app(environ, start)
            return decorated_app
        return decorate

So, this seems like a win-win to me: relatively-static verification, 
errors stay in the app (or at least in the decorator), and the API is 
clean-and-easy.  Indeed, it seems likely that at least some apps that 
don't read wsgi.input themselves could be ported *just* by adding the 
appropriate decorator(s).  And, if your app is using unicode on 2.x, 
you can even use the same decorators there, for the benefit of 
2to3.  (Assuming I release an updated standalone wsgiref version with 
the decorators, of course.)
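For illustration only, here is how such a decorator might be applied, assuming the bytes-only output rule proposed in this message. The decorator is re-declared in compact form so the sketch runs on its own (it is not imported from wsgiref), and `hello_app` and `run_once` are hypothetical names.

```python
def encode_headers(codec="latin1"):
    """Let an app pass str status/headers; encode them for the server."""
    def decorate(app):
        def decorated_app(environ, start_response):
            def start(status, response_headers, exc_info=None):
                encoded = [(k.encode(codec), v.encode(codec))
                           for k, v in response_headers]
                return start_response(status.encode(codec), encoded, exc_info)
            return app(environ, start)
        return decorated_app
    return decorate


@encode_headers()
def hello_app(environ, start_response):
    # The app itself keeps working with text headers.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


def run_once(app):
    """Call the app with a fake start_response and record what it received."""
    seen = {}

    def fake_start_response(status, headers, exc_info=None):
        seen["status"], seen["headers"] = status, headers
        return lambda data: None

    body = b"".join(app({}, fake_start_response))
    return seen["status"], seen["headers"], body


status, headers, body = run_once(hello_app)
```

Any encoding error would be raised inside `start`, i.e. in the decorator layer rather than somewhere in the server.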


So, unless somebody has some additional arguments on this one, I 
think I'm going to stick with bytes output.




Re: [Web-SIG] Most WSGI servers close connections to early.

2010-09-22 Thread P.J. Eby

At 08:34 AM 9/22/2010 -0700, Robert Brewer wrote:

Marcel Hellkamp wrote:
> I would like to add a warning to the WSGI/web3 specification to address
> this issue:
>
> "An application should read all available data from
> `environ['wsgi.input']` on POST or PUT requests, even if it does not
> process that data. Otherwise, the client might fail to complete the
> request and not display the response."

Indeed. CherryPy has protected against this for some time. But it 
shouldn't be the burden of *applications* to do this; the WSGI 
"origin" server can do so quite easily.


However, the caveat requires a caveat: servers must still be able to 
protect themselves from malicious clients. In practice, that means 
allowing servers to close the connection without reading the entire 
request body if a certain number of bytes is exceeded.
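One way a server might combine both concerns is sketched below: drain whatever the application left unread, but give up once a size cap is exceeded. The function name `drain_request_body` and the `max_drain` and `chunk_size` parameters are assumptions made for this example, not taken from any particular server.

```python
import io


def drain_request_body(stream, content_length, max_drain=1 << 20,
                       chunk_size=8192):
    """Discard up to content_length unread bytes from stream.

    Returns True if the body was drained (or the client sent less than
    promised), False if it is larger than max_drain, in which case the
    caller should close the connection instead of reading further.
    """
    if content_length > max_drain:
        return False
    remaining = content_length
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:          # client stopped sending early
            break
        remaining -= len(chunk)
    return True


small = io.BytesIO(b"x" * 100)
drained = drain_request_body(small, 100)

huge = io.BytesIO(b"x" * 100)
refused = drain_request_body(huge, 100, max_drain=10)
```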


We can certainly add warnings, although these are both more of a 
"best practices" advisory rather than a part of the spec per se.




Re: [Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3

2010-09-21 Thread P.J. Eby

[trimming reply headers to just web-sig]

At 12:57 PM 9/21/2010 -0400, Ian Bicking wrote:
On Tue, Sep 21, 2010 at 12:09 PM, P.J. Eby 
<p...@telecommunity.com> wrote:

The Python 3 specific changes are to use:

* ``bytes`` for I/O streams in both directions
* ``str`` for environ keys and values
* ``bytes`` for arguments to start_response() and write()


This is the only thing that seems odd to me -- it seems like the 
response should be symmetric with the request, and the request in 
this case uses str for headers (status being header-like), and bytes 
for the body.


Are you suggesting a "``str`` for headers, ``bytes`` for bodies" 
approach instead?


I suppose that could work; I was going for "str in, bytes out".  My 
assumption, though, was that headers are relatively easy to address 
at a choke point from a framework's output.  But I guess that 
iterator output is equally chokable.


I'm open to discussion on this point, so long as every value produced 
or consumed by a WSGI application is of a specified single type().



Otherwise this seems good to me, the only other major errata I can 
think of are all listed in the links you included.


Um, if by "links" you mean, "included textually in the proposal", 
then sure.  If it's not in the proposal, it's not going in the PEP, 
even if it's on the WSGI Amendments page or Graham's blog.




Re: [Web-SIG] [Python-Dev] Backup plan: WSGI 1 Addenda and wsgiref update for Py3

2010-09-21 Thread P.J. Eby

At 06:52 PM 9/21/2010 +0200, Antoine Pitrou wrote:

On Tue, 21 Sep 2010 12:09:44 -0400
"P.J. Eby"  wrote:
> While the Web-SIG is trying to hash out PEP 444, I thought it would
> be a good idea to have a backup plan that would allow the Python 3
> stdlib to move forward, without needing a major new spec to settle
> out implementation questions.

If this allows the Web situation in Python 3 to be improved faster
and with less hassle then all the better.
There's something strange in your proposal: it mentions WSGI 2 at
several places while there's no guarantee about what WSGI 2 will be (is
there?).


Sorry - "WSGI 2" should be read as shorthand for, "whatever new spec 
succeeds PEP 333", whether that's PEP 444 or something else.


It just means that any new spec that doesn't have to be 
backward-compatible can (and should) more thoroughly address the 
issue in question. 




Re: [Web-SIG] [Python-Dev] Backup plan: WSGI 1 Addenda and wsgiref update for Py3

2010-09-21 Thread P.J. Eby

At 12:55 PM 9/21/2010 -0400, Ian Bicking wrote:
On Tue, Sep 21, 2010 at 12:47 PM, Chris McDonough 
<chr...@plope.com> wrote:

On Tue, 2010-09-21 at 12:09 -0400, P.J. Eby wrote:
> While the Web-SIG is trying to hash out PEP 444, I thought it would
> be a good idea to have a backup plan that would allow the Python 3
> stdlib to move forward, without needing a major new spec to settle
> out implementation questions.

If a WSGI-1-compatible protocol seems more sensible to folks, I'm
personally happy to defer discussion on PEP 444 or any other
backwards-incompatible proposal.


I think both make sense, making WSGI 1 sensible for Python 3 (as 
well as other small errata like the size hint) doesn't detract from 
PEP 444 at all, IMHO.


Yep.  I agree.  I do, however, want to get these amendments settled 
and make sure they get carried over to whatever spec is the successor 
to PEP 333.  I've had a lot of trouble following exactly what was 
changed in 444, and I'm a tad worried that several new ambiguities 
may be being introduced.  So, solidifying 333 a bit might be helpful 
if it gives a good baseline against which to diff 444 (or whatever).




[Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3

2010-09-21 Thread P.J. Eby
While the Web-SIG is trying to hash out PEP 444, I thought it would 
be a good idea to have a backup plan that would allow the Python 3 
stdlib to move forward, without needing a major new spec to settle 
out implementation questions.


After all, even if PEP 333 is ultimately replaced by PEP 444, it's 
probably a good idea to have *some* sort of WSGI 1-ish thing 
available on Python 3, with bytes/unicode and other matters settled.


In the past, I was waiting for some consensuses (consensi?) on 
Web-SIG about different approaches to Python 3, looking for some sort 
of definite, "yes, we all like this" response.  However, I can see 
now that this just means it's my fault we don't have a spec yet.  :-(


So, unless any last-minute showstopper rebuttals show up this week, 
I've decided to go ahead and officially bless nearly all of what Graham 
Dumpleton (who's not only the mod_wsgi author, but has put huge 
amounts of work into shepherding WSGI-on-Python3 proposals, WSGI 
amendments, etc.) has proposed, with a few minor exceptions.


In other words: almost none of the following is my own original work; 
it's like 90% Graham's.  Any praise for this belongs to him; the only 
thing that belongs to me is the blame for not doing this 
sooner!  (Sorry Graham.  You asked me to do this ages ago, and you were right.)


Anyway, I'm posting this for comment to both Python-Dev and the 
Web-SIG.  If you are commenting on the technical details of the 
amendments, please reply to the Web-SIG only.  If you are commenting 
on the development agenda for wsgiref or other Python 3 library 
issues, please reply to Python-Dev only.  That way, neither list will 
see off-topic discussions.  Thanks!



The Plan


I plan to update the proposal below per comments and feedback during 
this week, then update PEP 333 itself over the weekend or early next 
week, followed by a code review of Python 3's wsgiref, and 
implementation of needed changes (such as recoding os.environ to 
latin1-captured bytes in the CGI handler).
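To make the "recoding os.environ" step a little more concrete, here is a rough sketch of the idea: on Python 3, environment values arrive as str decoded with the filesystem encoding and the surrogateescape handler, so the original bytes can be recovered and then re-read as latin1. The function name `recode_environ_value` and the default `fs_encoding` are assumptions for illustration; this is not claimed to be the code that ended up in wsgiref.

```python
def recode_environ_value(value, fs_encoding="utf-8"):
    """Turn an os.environ-style str into a latin1-captured-bytes str."""
    # Undo the interpreter's decoding to get the raw bytes back.
    raw = value.encode(fs_encoding, "surrogateescape")
    # Re-read each byte as one code point, so nothing can fail to decode.
    return raw.decode("latin-1")


original = b"/caf\xc3\xa9"                                  # bytes from the web server
as_seen_by_python = original.decode("utf-8", "surrogateescape")
recoded = recode_environ_value(as_seen_by_python)
```

An application can then get the exact original bytes with `recoded.encode("latin-1")` and decode them with whatever codec it actually wants.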


To complete the changes, it is possible that I may need assistance 
from one or more developers who have more Python 3 experience.  If 
after reading the proposed changes to the spec, you would like to 
volunteer to help with updating wsgiref to match, please let me know!



The Proposal



Overview


1. The primary purpose of this update is to provide a uniform porting 
pattern for moving Python 2 WSGI code to Python 3, meaning a pattern 
of changes that can be mechanically applied to as little code as 
practical, while still keeping the WSGI spec easy to programmatically 
validate (e.g. via ``wsgiref.validate``).


The Python 3 specific changes are to use:

* ``bytes`` for I/O streams in both directions
* ``str`` for environ keys and values
* ``bytes`` for arguments to start_response() and write()
* text stream for wsgi.errors

In other words, "strings in, bytes out" for headers, bytes for bodies.

In general, only changes that don't break Python 2 WSGI 
implementations are allowed.  The changes should also not break 
mod_wsgi on Python 3, but may make some Python 3 wsgi applications 
non-compliant, despite continuing to function on mod_wsgi.


This is because mod_wsgi allows applications to output string headers 
and bodies, but I am ruling that option out because it forces every 
piece of middleware to have to be tested with arbitrary combinations 
of strings and bytes in order to test compliance.  If you want your 
application to output strings rather than bytes, you can always use a 
decorator to do that.  (And a sample one could be provided in wsgiref.)



2. The secondary purpose of the update is to address some 
long-standing open issues documented here:


   http://www.wsgi.org/wsgi/Amendments_1.0

As with the Python 3 changes, only changes that don't retroactively 
invalidate existing implementations are allowed.



3. There is no tertiary purpose.  ;-)  (By which I mean, all other 
kinds of changes are out-of-scope for this update.)



4. The section below labeled "A Note On String Types" is proposed for 
verbatim addition to the "Specification Overview" section in the PEP; 
the other sections below describe changes to be made inline at the 
appropriate part of the spec, and changes that were proposed but are 
rejected for inclusion in this amendment.



A Note On String Types
----------------------

In general, HTTP deals with bytes, which means that this 
specification is mostly about handling bytes.


However, the content of those bytes often has some kind of textual 
interpretation, and in Python, strings are the most convenient way to 
handle text.


But in many Python versions and implementations, strings are Unicode, 
rather than bytes.  This requires a careful balance between a usable 
API and correct translations between bytes and text in the context of 
HTTP...  especially to support porting code between Python 
implementations with different ``str`` types.


WSGI theref

Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-18 Thread P.J. Eby

At 09:01 AM 9/18/2010 -0700, Robert Brewer wrote:

Marcel Hellkamp wrote:
>
> Removing any support for this type of asynchronism would render web3
> useless for all but completely synchronous and trivial applications.
> Even frameworks would have no way to work around this anymore.

I've run a few businesses now on WSGI without doing what you 
describe, so I don't see why blocking makes an application 'trivial'.


I believe he means:  all_but(synchronous_apps + trivial_apps), not 
all_but(apps(synchronous & trivial)).  ;-)


(That being said, for WSGI 2 I still want to get rid of 
start_response.  IMO, async WSGI needs to be a different protocol.) 




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-17 Thread P.J. Eby

At 03:43 PM 9/17/2010 +0200, And Clover wrote:

On 09/17/2010 02:03 PM, Armin Ronacher wrote:


In case we change the spec as Ian mentioned above, I am all for
a "wsgi.guessed_encoding" = True flag or something like that.


Yes, I'd like to see that. I believe going with *only* a 
raw-or-reconstructed path_info, rather than having both path_info 
and PATH_INFO, is probably best, for the middleware-duplication 
reasons PJE mentioned.


A more in-depth possibility might be:

wsgi.path_accuracy =

0: script_name/path_info have been crudely reconstructed from
SCRIPT_NAME/PATH_INFO from an unknown source. Beware!
If there is to be backwards compatibility with WSGI1, this
would be seen as the 'default value' given a missing path_accuracy.

1: script_name/path_info have been reconstructed, but it is known
that path_info is accurate, other than %2F and non-ASCII issues.
That is, it's known that the path doesn't come from IIS's broken
PATH_INFO, or the IIS error has been detected and compensated for.

2: script_name/path_info have been reconstructed using known-good
encodings for the env. The only way in which they may differ from
the original request path is that a slash might originally have
been a %2F. (This is good enough for the vast majority of
applications.)

3: script_name/path_info come directly from the request path
without any intervening mangling.


So, do you have an example of what some real-world code is going to 
*do* with this information?  i.e., what's the use case for knowing 
the precise degree of messed-uppedness of the path?  ;-)




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-16 Thread P.J. Eby

At 02:54 PM 9/16/2010 -0400, Chris McDonough wrote:

On Thu, 2010-09-16 at 14:04 -0400, P.J. Eby wrote:
> At 10:35 AM 9/16/2010 -0700, Guido van Rossum wrote:
> >No comments on the rest except to note that at this point it looks
> >unlikely that we can make everyone happy (or even get an agreement to
> >adopt what would be the long-term technically optimal solution --
> >AFAICT there is no agreement on what that solution would be, if one
> >weren't to take porting Python 2 code into account). IOW
> >something/somebody has gotta give.
>
> Indeed.  This entire discussion has pushed me strongly in favor of
> doing a super-minimalist update to PEP 333 with the following points:

Right on, write it all down! ;-)


I thought I just did.  ;-)

Okay, I will carve out some cycles.

(Btw, it appears that somebody has recently hacked on the code in PEP 
333 and inadvertently broken the specification, so I'll be fixing that first.)




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-16 Thread P.J. Eby

At 02:17 PM 9/16/2010 -0500, Ian Bicking wrote:
On Thu, Sep 16, 2010 at 1:04 PM, P.J. Eby 
<p...@telecommunity.com> wrote:
* Clarifying the encoding of environ values (locale+surrogateescape 
vs. latin1, TBD)



locale+surrogateescape would be insanity!  CGI will just require some 
configuration with respect to the environment.  Anyway, I suspect 
CGI only really works because: (a) people using CGI are sticking to 
ASCII, (b) they've fixed stuff up in their apps, (c) they just 
produce garbage and no one cares.


Ok.

There are some simple errata, most of which I believe web3 covers 
(in addition to other things it covers).


I think everyone is on board with:

  status, headers, app_iter = app(environ)

Web3 proposed a different order, but it seems clear from the thread 
that people prefer the more natural order, and web3 authors don't 
particularly object.
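For readers unfamiliar with the start_response-free calling convention, a minimal sketch follows. It shows an app that returns a (status, headers, app_iter) tuple, plus a thin adapter that exposes it through the WSGI 1 interface. The names `tuple_style_app` and `as_wsgi1` are invented here, and the str/bytes types used are an assumption, since the thread had not settled them.

```python
def tuple_style_app(environ):
    """App in the 'status, headers, app_iter = app(environ)' style."""
    body = b"Hello\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    return "200 OK", headers, [body]


def as_wsgi1(app):
    """Adapt a tuple-returning app to the WSGI 1 calling convention."""
    def wsgi_app(environ, start_response):
        status, headers, app_iter = app(environ)
        start_response(status, headers)
        return app_iter
    return wsgi_app


# Drive the adapted app with a fake start_response.
seen = {}

def fake_start_response(status, headers, exc_info=None):
    seen["status"], seen["headers"] = status, headers
    return lambda data: None

body = b"".join(as_wsgi1(tuple_style_app)({}, fake_start_response))
```

Middleware written against the tuple style only has to unpack and repack one return value, which is the simplification being argued for.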


My comments were about releasing a WSGI 1.0 update for Python 3, not 
making changes to web3.  The current free-for-all (and the 3.2 stdlib 
need) have convinced me to stop arguing for throwing out WSGI 1 on Python 3.


Or, to put it another way: splitting the spec into two 100% 
incompatible versions is a bad idea for Python 3 adoption.  With a 
WSGI 1 addendum, we should be able to make it possible to put the 
same apps and middleware on 2 and 3 with just a decorator wrapping 
them.  (i.e., people should be able to write libraries that run on 
both 2 and 3, which is probably critical to adoption).


I just wish I'd come to these conclusions much sooner...  like a year 
or two ago.  :-(




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-16 Thread P.J. Eby

At 10:35 AM 9/16/2010 -0700, Guido van Rossum wrote:

No comments on the rest except to note that at this point it looks
unlikely that we can make everyone happy (or even get an agreement to
adopt what would be the long-term technically optimal solution --
AFAICT there is no agreement on what that solution would be, if one
weren't to take porting Python 2 code into account). IOW
something/somebody has gotta give.


Indeed.  This entire discussion has pushed me strongly in favor of 
doing a super-minimalist update to PEP 333 with the following points:


* Clarifying the encoding of environ values (locale+surrogateescape 
vs. latin1, TBD)


* Making the streams and all output values byte strings ('str' on 
2.x, 'bytes' on 3.x), leaving everything else "native" strings ('str' 
on both 2.x and 3.x)


* Any other minor errata/clarifications that the folks with the 
requisite experience (e.g. Robert, Ian, Graham -- not an exclusive 
list, but at least they all have both heavy WSGI implementations 
under their belts and 3.x experience) think are absolutely necessary 
to resolve open questions for Python 3.2 WSGI implementations.


Something like that has a halfway decent chance of being able to 
settle and get implemented in the short timeline, and it also doesn't 
put Graham (mod_wsgi) in the position of coming back from vacation to 
a huge new spec to unravel.  ;-)


(To be clear, what I'm suggesting is almost exactly what mod_wsgi 
does; it's just stricter on outputs than what mod_wsgi accepts, and 
there may be some minor issues regarding the environ encoding: 
mod_wsgi is probably using the latin1 approach rather than 
locale+surrogateescape, and I think we need to talk that one out a bit.)


Anyway, web3 is nice, but it doesn't look like it'll really fit the 
bill for porting applications.  i.e., it's like a bike shed full of 
red herrings for what Python-Dev needs right now.  ;-)




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-15 Thread P.J. Eby

At 07:03 PM 9/15/2010 -0400, Chris McDonough wrote:

A PEP was submitted and accepted today for a WSGI successor protocol
named Web3:

http://python.org/dev/peps/pep-0444/

I'd encourage other folks to suggest improvements to that spec or to
submit a competing spec, so we can get WSGI-on-Python3 settled soon.


The first thing I notice is that web3.async appears to force all 
existing middleware to delete it from the environment if it wishes to 
remain compatible, unless it adapts to support receiving callables itself.


On further reading I see you have something about middleware 
disabling itself if it doesn't support async execution, but this 
doesn't make any sense to me: if it can't support async execution, 
why wouldn't it just delete web3.async from the environ, forcing its 
wrapped app to be synchronous instead?


I'm also not a fan of the bytes environ, or the new 
path_info/script_name variables; note that the spec's sample CGI 
implementation does not itself provide the new variables, and that 
middleware must be explicitly written to handle the case where there 
is duplication.


My main fear with this spec is that people will assume they can just 
make a few superficial changes to run WSGI code on it, when in fact 
it is deeply incompatible where middleware is concerned.  In fact, 
AFAICT, it seems like it will be *harder* to write correct web3 
middleware than it is to write correct WSGI middleware now.


This seems like a step backward, since the whole idea behind dropping 
start_response() was to make correct middleware *easier* to write.


Any time a spec makes something optional or allows More Than One Way 
To Do It, it immediately doubles the mimimum code required to 
implement that portion of the spec in compliant middleware.  This 
spec has two optionalities: web3.async, and the optional 
path_info/script_name, so the return handling of every piece of 
middleware is doubled (or else  "environ['web3.async'] = False" must 
be added at the top), and any code that modifies paths must similarly 
ditch the special variables or do double work to update them.
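
A minimal sketch of the one-line opt-out mentioned above, for middleware that cannot handle deferred (callable) results. The names `SyncOnlyMiddleware` and `demo_app` are hypothetical, and whatever the wrapped app returns is passed through untouched rather than assuming a particular return shape:

```python
class SyncOnlyMiddleware:
    """Hypothetical middleware that opts out of async execution."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ):
        # Tell everything downstream that async results are not supported,
        # so the wrapped app has to produce a concrete response.
        environ['web3.async'] = False
        result = self.app(environ)
        # ... inspect or modify `result` here, knowing it is not deferred ...
        return result


def demo_app(environ):
    # Hypothetical app that simply reports the mode it was asked to run in.
    return environ['web3.async']
```

The cost Eby describes is visible here: every piece of middleware either carries this line or grows a second code path for deferred results.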
  




Re: [Web-SIG] WSGI for Python 3

2010-08-30 Thread P.J. Eby

At 02:37 PM 8/30/2010 +1000, Graham Dumpleton wrote:

Anyway, rather than keep arguing the point and move forward, let us
perhaps start now with the following definitions and new names to
identify them. We can even go a bit stupid and give each its own code
name so they are in part more memorable. Any next option based on your
suggestions about changing the WHEAT option can be called MAIZE. And
if you think I am going stark raving mad and should be put in a
white jacket and locked up, you could well be right. I am not a happy
camper right now, but that is because of many things besides this WSGI
stuff. :-)

 And yes I know about the page that has been just recently put up at:

  http://www.wsgi.org/wsgi/Python_3

From memory when I first read it I wasn't sure if that it was
completely accurate, but at least it doesn't now mention mod_python
instead of mod_wsgi which was mighty confusing. We can perhaps merge
the following into that page, ie., expand the table, and talk more
about the abstract definitions rather than linking it to specific
implementations at this point. We can perhaps then start capturing the
pros and cons against each option in the page rather than losing them
in the email chain.


I've added a column to the page called "flat" that captures my 
current proposal (native keys, surrogateescape values, byte stream 
in, strict bytes-only for all outputs).  This seems to me an optimum 
balance between:


* Verifiability (especially *composable* verifiability)
* Low cognitive overhead (i.e., fewest things to remember)
* Low amount of finger-typing and fewer conversions

But I certainly could be convinced otherwise by example or argument.

(One other thing I consider a plus for this approach, btw: os.environ 
is still largely usable as a WSGI environ in the CGI case.  This 
isn't so much a valuable thing in itself, as that it's an indicator 
of low complexity and cognitive overhead.) 
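
To illustrate the "surrogateescape values" part of the proposal, here is a small sketch of how such values round-trip. The choice of UTF-8 and the helper names are assumptions for illustration only; the proposal text above does not fix an encoding:

```python
def decode_value(raw, encoding="utf-8"):
    """Decode an environ value; undecodable bytes become lone surrogates."""
    return raw.decode(encoding, "surrogateescape")


def encode_value(text, encoding="utf-8"):
    """Recover the exact original bytes from a surrogateescape-decoded value."""
    return text.encode(encoding, "surrogateescape")


raw = b"/caf\xe9"           # Latin-1 bytes, not valid UTF-8
text = decode_value(raw)    # the 0xE9 byte is smuggled through as U+DCE9
assert encode_value(text) == raw
```

Because the original bytes are always recoverable, an application can re-decode a value with a different encoding without the server having guessed correctly.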




Re: [Web-SIG] WSGI for Python 3

2010-08-29 Thread P.J. Eby

At 11:16 AM 8/30/2010 +1000, Graham Dumpleton wrote:

Although I almost begged that if we are going to discuss bytes,
compared to text/unicode, that agreement at least first be made about
the definition of the bytes leaning option, that request has pretty
well fallen on deaf ears.


Did you not see my reply?  I (thought I) answered your question, and 
I actually also suggested that a variation of your unicode proposal 
might work, too.  See:


http://mail.python.org/pipermail/web-sig/2010-August/004545.html



Re: [Web-SIG] WSGI for Python 3

2010-08-27 Thread P.J. Eby

At 02:17 PM 8/27/2010 +1000, Graham Dumpleton wrote:

Since the major stumbling block, irrespective of other changes, to any
sort of agreement is still bytes vs unicode, and where we have a
reasonably clear definition of what the unicode suggestion is, can we
please as a first step get a definition of what bytes actually implies
so everyone knows what we are talking about. I specifically ask this,
as it isn't clear because people don't explain in detail what they
mean when they are saying 'bytes'.

Going back to my definition #2 in my blog post from a year ago, I had:

1. The application is passed an instance of a Python dictionary
containing what is referred to as the WSGI environment. All keys in
this dictionary are native strings. For CGI variables, all names are
going to be ISO-8859-1 and so where native strings are unicode
strings, that encoding is used for the names of CGI variables


FYI, one thing that's changed here is the existence of os.environb in 
Python 3.2, at least on non-Windows OSes.




2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI
environment, the value of the variable should be a native string.


Since any meaningful use of this value is going to end up needing to 
be bytes again (e.g. Location headers), and for consistency's sake, I 
lean towards saying this is bytes too.




3. For the CGI variables contained in the WSGI environment, the values
of the variables are byte strings.

4. The WSGI input stream 'wsgi.input' contained in the WSGI
environment and from which request content is read, should yield byte
strings.

5. The status line specified by the WSGI application must be a byte string.

6. The list of response headers specified by the WSGI application must
contain tuples consisting of two values, where each value is a byte
string.

7. The iterable returned by the application and from which response
content is derived, must yield byte strings.

The points of disagreement I have seen about this are as follows.

For (1), the keys should also be bytes, including names of 'wsgi.' 
special keys.


For (2), the value of 'wsgi.url_scheme' should be bytes.

So, do you really want bytes absolutely everywhere, or are keys still
going to be unicode taken as ISO-8859-1.


If we follow the example of os.environb, then the keys have to be bytes also.

However, I can already see that the big problem with all of this is 
that WSGI code is going to be littered with a plague of "b"s hanging 
off the front of every string literal, and that 2to3 is probably not 
going to handle it correctly.  Making the keys bytes as well just 
multiplies the problem.
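
For a sense of what that looks like, here is a hedged sketch of a helper written against a hypothetical all-bytes environ (bytes keys and bytes values). The function name is made up for illustration:

```python
def request_line(environ):
    """Rebuild 'METHOD /path?query' from an all-bytes environ."""
    method = environ.get(b"REQUEST_METHOD", b"GET")
    path = environ.get(b"SCRIPT_NAME", b"") + environ.get(b"PATH_INFO", b"/")
    query = environ.get(b"QUERY_STRING", b"")
    if query:
        path += b"?" + query
    return method + b" " + path
```

Every literal carries a `b` prefix, and forgetting one on a key silently turns a lookup into a miss rather than an error.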





Note that we are not agreeing to the final solution here, just what
bytes means in contrast to the unicode option, so we know that we are
comparing only two options and not many options because people have
different interpretations of what bytes means.

As contrast, what we generally mean by the unicode option is
definition #3 from my blog post. That being:

1. The application is passed an instance of a Python dictionary
containing what is referred to as the WSGI environment. All keys in
this dictionary are native strings. For CGI variables, all names are
going to be ISO-8859-1 and so where native strings are unicode
strings, that encoding is used for the names of CGI variables

2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI
environment, the value of the variable should be a native string.

3. For the CGI variables contained in the WSGI environment, the values
of the variables are native strings. Where native strings are unicode
strings, ISO-8859-1 encoding would be used such that the original
character data is preserved and as necessary the unicode string can be
converted back to bytes and thence decoded to unicode again using a
different encoding.

4. The WSGI input stream 'wsgi.input' contained in the WSGI
environment and from which request content is read, should yield byte
strings.

5. The status line specified by the WSGI application should be a byte
string. Where native strings are unicode strings, the native string
type can also be returned in which case it would be encoded as
ISO-8859-1.

6. The list of response headers specified by the WSGI application
should contain tuples consisting of two values, where each value is a
byte string. Where native strings are unicode strings, the native
string type can also be returned in which case it would be encoded as
ISO-8859-1.

7. The iterable returned by the application and from which response
content is derived, should yield byte strings. Where native strings
are unicode strings, the native string type can also be returned in
which case it would be encoded as ISO-8859-1.

Even though we call it unicode, it actually has bytes in places as well.
The key issue over bytes vs unicode has been the values in the
dictionary, but as pointed out above, it is not clear whether, for the
bytes option, we are talking about bytes for keys as well and for value of
'wsgi.url_scheme'.
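
The ISO-8859-1 round trip described in point 3 of the unicode option can be sketched as follows. The helper name is hypothetical, and UTF-8 is only an example of the "different encoding" an application might pick:

```python
def reinterpret(value, encoding="utf-8"):
    """Convert an ISO-8859-1-decoded native string back to bytes, then
    decode those bytes with the encoding the application actually expects."""
    return value.encode("iso-8859-1").decode(encoding)


raw = "/café".encode("utf-8")               # bytes as received on the wire
environ_value = raw.decode("iso-8859-1")    # what the server would store
```

ISO-8859-1 maps every byte to exactly one code point, which is why no information is lost on the way in or out.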


The

Re: [Web-SIG] WSGI for Python 3

2010-08-27 Thread P.J. Eby

At 06:05 PM 8/27/2010 +0200, Christoph Zwerschke wrote:

 For instance,

user = 'özkan'.encode('latin1')
if user in request.META.get('REMOTE_USER', b'').lower():

will not work if the user has logged in as 'Özkan'.


Isn't that a problem with code that does this now? 
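
For reference, a short sketch of what the quoted snippet does on Python 3, where `bytes.lower()` only maps ASCII letters. The variable names are illustrative:

```python
user = "özkan".encode("latin1")
remote_user = "Özkan".encode("latin1")

# bytes.lower() only touches ASCII A-Z, so the 0xD6 byte ("Ö") is left alone
# and the comparison fails.
bytes_match = user in remote_user.lower()

# Decoding first lets str.lower() apply Unicode case mapping.
text_match = user.decode("latin1") in remote_user.decode("latin1").lower()
```

The fix is the same regardless of what the environ holds: decode at a known point, then do text operations on text.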




Re: [Web-SIG] WSGI for Python 3

2010-08-26 Thread P.J. Eby

At 01:37 AM 8/27/2010 +0200, Armin Ronacher wrote:

Hi,

Is there a status update on that which I missed?  Did someone decide 
on bytes for the environment values or are we still unsure about that?


To the extent we're "unsure", I think the holdup is simply that 
nobody has tried doing an all-bytes WSGI implementation -- unless of 
course you count all our Python 2.x experience as experience with an 
all-bytes implementation.  ;-)


(Of course, that experience won't help us with Python 3 stdlib issues.)


At that point I don't care at all about what is decided on as long 
as something is decided.  Can someone please stand up and just do that? :)


Essentially the problem right now is that unless such a choice is 
made, there's little hope of getting the stdlib issues to be 
resolved, because we can't exactly file bug reports against the 
stdlib if we don't know what we want it to do.  ;-)


My personal inclination is to define WSGI 2 as a bytes-oriented 
protocol, and then encourage people to port to WSGI 2 before moving 
to Python 3.


In theory, if we did it correctly it could actually minimize the 
porting pain for Python 3.


In practice, I'm not sure how to do this, as I lack experience with 
2to3 at the moment, or any production experience with Python 3 whatsoever.




Re: [Web-SIG] WSGI for Python 3

2010-07-18 Thread P.J. Eby

At 01:01 PM 7/18/2010 +1000, Graham Dumpleton wrote:

This is on the basis that if people are going to have to rewrite their code
a fair bit to handle bytes everywhere,


What do you mean by "rewrite their code a fair bit", and who is it that 
you think will have to do this?


Or, more precisely, how is that any different from the text or 
text-and-bytes proposals?  AFAICT, the main difference is that under 
a bytes-only regime, the changes should be more 
consistent/mechanical, i.e., able to be performed by relatively 
superficial code inspection.




My personal opinion is that if you are going to go bytes everywhere,
then you may as well throw out the complete WSGI specification as it
stands now and fix all the other problems with the specification.


That may not be a bad idea; I'm certainly in favor of going ahead and 
ditching start_response/write while we're at it.  The requirement to 
change both the entry and exit points to match the calling convention 
also seems to provide an ideal opportunity to insert any necessary 
encoding or decoding operations.




Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread P.J. Eby

At 07:20 PM 7/16/2010 -0400, Chris McDonough wrote:

I'd much rather be able to say:

"""
The PATH_INFO environment variable is a ``bytes-with-benefits`` type.
To decode it:

- First, split it on slashes::

    segments = PATH_INFO.split('/')

- Then, de-encode each segment's urlencoded portions:

    urldecoded_segments = [ urllib.unquote(x) for x in segments ]

- Then re-encode each urldecoded segment into the encoding expected
  by your application

    app_segments = [ str(x, encoding='utf-8') for x in
                     urldecoded_segments ]
"""


+1.  I do wish we actually *had* a bytes-with-benefits type (as I 
proposed on Python-Dev), but I don't think we can really get one 
until the language moratorium is over.  Plain old bytes are the next 
best thing. 
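
With plain bytes, the three steps quoted above might look like this on Python 3. This is a sketch, not part of either proposal; the function name is hypothetical and UTF-8 stands in for "the encoding expected by your application":

```python
from urllib.parse import unquote_to_bytes


def decode_path_info(path_info, encoding="utf-8"):
    """Split a bytes PATH_INFO, percent-decode each segment, then decode."""
    segments = path_info.split(b"/")
    urldecoded_segments = [unquote_to_bytes(s) for s in segments]
    return [s.decode(encoding) for s in urldecoded_segments]
```

Splitting before unquoting matters: a `%2F` inside a segment stays part of that segment instead of being mistaken for a path separator.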




Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread P.J. Eby

At 05:42 PM 7/16/2010 -0400, Tres Seaver wrote:

P.J. Eby wrote:

> (Hm.  Although actually, I suppose we *could* just borrow the time
> machine and pretend that WSGI called for "byte-strings everywhere"
> all along...)

I like the idea of pushing responsibility for decoding stuff into the
framework / app writer's hands.  OTOH, doesn't that hose authors of
existing middleware, due to the borkedness of working with bytes in Python3?


It only creates a "new" problem if they are currently not using *any* 
unicode in 2.x, and are passing through bytes from the input to the 
output without any encoding or decoding.  AFAICT, if any part of 
their app is currently unicode, they would have the same problems in 2.x.


(Minus, of course, any problems introduced by missing bytes methods 
in 3.x, or the fact that single-subscripted bytes are ints rather 
than bytestrings.)
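
The subscripting difference mentioned in the parenthetical is easy to trip over when porting, so here is a tiny illustration on Python 3:

```python
data = b"GET"

first_as_int = data[0]        # indexing yields an int on Python 3
first_as_bytes = data[0:1]    # slicing keeps the bytes type
```

Code that compares `data[0]` against a one-character byte string needs to switch to slicing (or compare against an integer) to keep working.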


Anyway, the problems introduced will be problems that can be solved 
by waving a fairly standard set of dead chickens at the problem, i.e. 
picking where you're going to encode/decode, and deciding what 
encoding(s) are meaningful to your app.  And frameworks that already 
have a unicode API are ahead of the game here.


So, AFAICT, the only people who'd be punished by a change to bytes 
are the people who have non-ASCII inputs or outputs, but haven't been 
using unicode (because 2to3 will convert them to using strings 
instead of bytes).


From what I can tell, though, this is also the group it's most 
politically correct to hate on in Python-Dev, so we should be 
relatively safe in shifting the burden to them.  ;-)




Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread P.J. Eby

At 02:28 PM 7/16/2010 -0500, Ian Bicking wrote:
On Fri, Jul 16, 2010 at 1:40 PM, P.J. Eby <p...@telecommunity.com> wrote:

At 11:07 AM 7/16/2010 -0500, Ian Bicking wrote:
And this doesn't help with Python 3: either we have byte values of 
SCRIPT_NAME and PATH_INFO in Python 3, or we have text values.  I 
think bytes will be more awkward to port to than text, and 
inconsistent with other WSGI values.



OTOH, it has the tremendous advantage of pushing the encoding 
question onto the app (or framework) developer...  who's really the 
only one who can make the right decision for their particular 
application.  And personally, I'd rather have clear boundaries 
between text and bytes, such that porting (even if tedious or 
awkward) is *consistent*, and clear as to when you're finished, not, 
"oh, did I check to make sure I converted SCRIPT_NAME and 
PATH_INFO...  not just in my app code, but in all the library code 
I call *from* my app?"


IOW, the bytes/string discussion on Python-dev has kind of led me to 
realize that we might just as well make the *entire* stack bytes 
(incoming and outgoing headers *and* streams), and rewrite that bit 
in PEP 333 about using str on "Python 3000" to say we go with bytes 
on Python 3+ for everything that's a str in today's WSGI.



This was my first intuition too, until I started thinking in more 
detail about the particular values involved.  Some obviously are 
textish, like environ['SERVER_NAME'].  Not a very useful value, but 
definitely text.


Basically all the internal strings are textish, so we're left with:

wsgi.url_scheme
SCRIPT_NAME/PATH_INFO
QUERY_STRING
HTTP_*, CONTENT_TYPE, CONTENT_LENGTH (headers)
response status
response headers (name and value)


What I'm getting at, though, is it's precisely this sort of "hm, 
which ones are bytes again?" stuff that makes you have to stop and 
*think*, i.e., it doesn't Fit My Brain any more.  ;-)


There should be one, and preferably *only* one, obvious way to do it.

And given that HTTP is inherently a bunch of bytes, bytes is the one 
obvious way.


I previously was under the impression that bytes wouldn't 
interoperate with strings in 3.x, but they *do*, in much the same way 
as they did in 2.x.  That means you'll be (mostly) bug-compatible in 
3.x, only you'll likely encounter encoding issues *sooner*, rather 
than later.  (i.e., the minute you combine non-ASCII inputs with your 
regular string constants).


Yes, you will also be forced to convert your return values to bytes, 
but if you've used string constants *anywhere*, then you know you'll 
be outputting text, which you should already have been encoding for 
output.  (So you'll just be forced to deal with errors on that side 
sooner as well.)


All in all, I'd say this also fits with what people on Python-Dev 
keep hammering on as the One Obvious Way to deal with bytes and 
strings in a program: i.e., bytes for I/O, text for text processing.


WSGI is HTTP, and HTTP is I/O, ergo, WSGI is I/O, and we should 
therefore "byte" the bullet here.  ;-)




Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread P.J. Eby

At 11:07 AM 7/16/2010 -0500, Ian Bicking wrote:
And this doesn't help with Python 3: either we have byte values of 
SCRIPT_NAME and PATH_INFO in Python 3, or we have text values.  I 
think bytes will be more awkward to port to than text, and 
inconsistent with other WSGI values.


OTOH, it has the tremendous advantage of pushing the encoding 
question onto the app (or framework) developer...  who's really the 
only one who can make the right decision for their particular 
application.  And personally, I'd rather have clear boundaries 
between text and bytes, such that porting (even if tedious or 
awkward) is *consistent*, and clear as to when you're finished, not, 
"oh, did I check to make sure I converted SCRIPT_NAME and 
PATH_INFO...  not just in my app code, but in all the library code I 
call *from* my app?"


IOW, the bytes/string discussion on Python-dev has kind of led me to 
realize that we might just as well make the *entire* stack bytes 
(incoming and outgoing headers *and* streams), and rewrite that bit 
in PEP 333 about using str on "Python 3000" to say we go with bytes 
on Python 3+ for everything that's a str in today's WSGI.


Or, to put it another way, if I knew then what I know *now*, I think 
I'd have written the PEP the other way around, such that the use of 
'str' in WSGI would be a substitute for the future 'bytes' type, 
rather than viewing some byte strings as a forward-compatible 
substitute for Py3K unicode strings.


Of course, this would be a WSGI 2 change, but IMO we're better off 
making a clean break with backward compatibility here anyway, rather 
than having conditionals.  Also, going with bytes everywhere means we 
don't have to rename SCRIPT_NAME and PATH_INFO, which in turn avoids 
deeper rewrites being required in today's apps.


(Hm.  Although actually, I suppose we *could* just borrow the time 
machine and pretend that WSGI called for "byte-strings everywhere" 
all along...)




Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-29 Thread P.J. Eby

At 12:33 PM 6/29/2010 -0600, Aaron Fransen wrote:
I was sending text/html (I probably should have used multipart 
before) ... should I try multipart now, even with having everything 
in a single stream?


Heck if I know.  I just assumed that what you're doing would be 
unlikely to work, whereas multipart has at least been previously 
documented as working with Apache (at least for nph scripts).  Dunno 
if mod_wsgi'll do that or not.


Actually, what I'd do in your place is try a "nph-" CGI in Python 
(using a wsgiref CGIHandler with its 'origin_server' attribute set to 
True), have it send multipart, and see if that works.  If it doesn't 
work, then it's probably a problem with your app.
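
A rough sketch of that experiment, assuming Python 3's `wsgiref`. The handler subclass, the boundary string, the messages and `push_app` are all made up for illustration; only `CGIHandler` and its `origin_server` attribute come from the standard library:

```python
from wsgiref.handlers import CGIHandler

BOUNDARY = "wsgi-push"  # hypothetical boundary string


class NPHCGIHandler(CGIHandler):
    # Emit a full "HTTP/1.x ..." status line instead of a CGI "Status:" header.
    origin_server = True


def push_app(environ, start_response):
    start_response("200 OK", [
        ("Content-Type", "multipart/x-mixed-replace; boundary=" + BOUNDARY),
    ])
    for message in ("Working...", "Still working...", "Done."):
        part = "--%s\r\nContent-Type: text/html\r\n\r\n<p>%s</p>\r\n" % (
            BOUNDARY, message)
        yield part.encode("ascii")
    # The closing delimiter ends the multipart stream.
    yield ("--%s--\r\n" % BOUNDARY).encode("ascii")


def main():
    # Call this from the entry point of the "nph-" CGI script.
    NPHCGIHandler().run(push_app)
```

If this works as an nph script but the same `push_app` does not stream under mod_wsgi, that narrows the problem to the server side, as described above.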


If it *does* work, but the same app doesn't work under mod_wsgi, then 
it's a mod_wsgi issue; possibly related to configuration.  From what 
Graham's said, mod_wsgi shouldn't be buffering anything, which means 
it has to either be Apache or your app that's buffering.  If it's 
Apache, doing a proper nph+multipart ought to fix it, unless there's 
something else going on in the Apache configuration.





Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-29 Thread P.J. Eby

At 10:14 AM 6/29/2010 -0600, Aaron Fransen wrote:

Couple more things I've been able to discern.

The first happened after I "fixed" the html code. Originally under 
mod_python, I guess I was cheating more than a little bit by sending 
 code blocks twice, once for the incremental notices, 
once for the final content. Once I changed the code to send a single 
properly parsed block, the entire document showed up as expected, 
however it still did not send any part of the html incrementally.


Watching the line with Wireshark, all of the data was transmitted at 
the same time, so nothing was sent to the browser incrementally.


So, you're not sending a multipart/x-mixed-replace ("server push") 
transmission? 




Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-28 Thread P.J. Eby

At 03:43 PM 6/28/2010 -0600, Aaron Fransen wrote:

Using mod_wsgi on Apache doesn't seem to exhibit that behavior.


You may need "WSGIOutputBuffering Off" in your config; see:

http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIOutputBuffering

Another possibility is that you've got some middleware or something 
else buffering between your app and mod_wsgi, I suppose.




Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-28 Thread P.J. Eby

At 01:01 PM 6/28/2010 -0600, Aaron Fransen wrote:

One of the nice things about mod_python is the req.write() function.

Although I realize it's somewhat of an abuse to the http protocol, 
it's handy being able to periodically update the client browser with 
a status message for a long-running job.


So handy in fact that I have a number of applications that rely 
fairly heavily on it as a means of keeping the client (person) happy 
instead of just showing them the default "browser busy" notification.


There are a couple of workarounds, neither of which are ideal:
1. Take them immediately to a secondary page, then submit the actual 
job automatically on that second page.

2. Instead of using HTTP POST, use an HTTP Request Object (ie. Ajax).

Both of them involve significantly more development effort than an 
equivalent req.write().


Is there a way to emulate the periodic-write functionality in WSGI?


Each string yielded (or passed to the write() callable returned by 
start_response) is supposed to be sent straight through to the client.


As long as your WSGI stack is actually conformant to the protocol, 
that's all you need to do.
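
In other words, something along these lines should be enough. This is a minimal sketch with a made-up app name, and the `time.sleep()` call merely stands in for a slice of the long-running job:

```python
import time


def long_job_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    yield b"<p>Job started...</p>\n"
    for step in range(1, 4):
        time.sleep(0.01)  # stand-in for a slice of real work
        # Each yielded string should be passed straight on to the client.
        yield ("<p>Step %d finished</p>\n" % step).encode("ascii")
    yield b"<p>All done.</p>\n"
```

Whether the browser shows each chunk as it arrives still depends on nothing between the app and the client buffering the output.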




Re: [Web-SIG] [RFC] x-wsgiorg.suspend extension

2010-04-12 Thread P.J. Eby

At 01:25 PM 4/12/2010 +0200, Manlio Perillo wrote:

The purpose of the extension if to just have a standard interface that
WSGI applications can use to take advantage of the possibility, offered
by asynchronous server, to suspend execution and resume it later.


WSGI has this ability now - it's yielding an empty string.  Yielding 
an empty string is a hint to the server that the application is not 
ready to send any output, and the server is free to schedule other 
applications next.  And WSGI does not require the application to be 
rescheduled any time soon.


In other words, if saying "don't call me for a while" is the purpose 
of the extension, it is not needed.  As Graham says, the thing that 
would actually be needed is a way to tell the server when to poll the 
app again.
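
A small sketch of the existing mechanism, for concreteness. The `CountdownJob` class and the `demo.job` environ key are hypothetical stand-ins for whatever the application is actually waiting on:

```python
class CountdownJob:
    """Hypothetical job that becomes ready after a fixed number of polls."""

    def __init__(self, polls_needed):
        self.polls_left = polls_needed

    def ready(self):
        self.polls_left -= 1
        return self.polls_left < 0

    def result(self):
        return b"finished\n"


def polling_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    job = environ["demo.job"]
    while not job.ready():
        # Nothing to send yet: the empty string hints that the server
        # is free to go and service other applications.
        yield b""
    yield job.result()
```

What this cannot express is *when* to come back, which is the missing piece noted above: the server has no choice but to poll.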




Re: [Web-SIG] wsgi and generators (was Re: WSGI and start_response)

2010-04-10 Thread P.J. Eby

At 02:04 PM 4/10/2010 +0100, Chris Dent wrote:

I realize I'm able to build up a complete string or yield via a
generator, or a whole bunch of various ways to accomplish things
(which is part of why I like WSGI: that content is just an iterator,
that's a good thing) so I'm not looking for a statement of what is or
isn't possible, but rather opinions. Why is yielding lots of moderately
sized strings *very bad*? Why is it _not_ very bad (as presumably
others think)?


How bad it is depends a lot on the specific middleware, server 
architecture, OS, and what else is running on the machine.  The more 
layers of architecture you have, the worse the overhead is going to be.


The main reason, though, is that alternating control between your app 
and the server means increased request lifetime and worsened average 
request completion latency.


Imagine that I have five tasks to work on right now.  Let us say each 
takes five units of time to complete.  If I have five units of time 
right now, I can either finish one task now, or partially finish 
five.  If I work on them in an interleaved way, *none* of the tasks 
will be done until twenty-five units have elapsed, and so all tasks 
will have a completion latency of 25 units.


If I work on them one at a time, however, then one task will be done 
in 5 units, the next in 10, and so on -- for an average latency of 
only 15 units.  And that is *not* counting any task switching overhead.
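
The arithmetic can be checked with a short simulation. This is a sketch that assumes strict round-robin interleaving in one-unit slices and no switching overhead; the function names are made up for illustration. Under those assumptions the interleaved tasks finish between units 21 and 25, which is essentially the 25-unit figure above, against an average of 15 when the tasks run one at a time:

```python
def sequential_latencies(n_tasks, units_each):
    """Completion time of each task when they run back to back."""
    return [units_each * (i + 1) for i in range(n_tasks)]


def interleaved_latencies(n_tasks, units_each, slice_units=1):
    """Completion time of each task under round-robin time slicing."""
    remaining = [units_each] * n_tasks
    finished = [None] * n_tasks
    clock = 0
    while any(r > 0 for r in remaining):
        for i in range(n_tasks):
            if remaining[i] > 0:
                step = min(slice_units, remaining[i])
                clock += step
                remaining[i] -= step
                if remaining[i] == 0:
                    finished[i] = clock
    return finished


def average(values):
    return sum(values) / len(values)
```

The total work is 25 units either way; only the order changes, and with it the time each individual request spends open.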


But it's *worse* than that, because by multitasking, my task queue 
has five things in it the whole time...  so I am using more memory 
and have more management overhead, as well as task switching overhead.


If you translate this to the architecture of a web application, where 
the "work" is the server serving up bytes produced by the 
application, then you will see that if the application serves up 
small chunks, the web server is effectively forced to multitask, and 
keep more application instances simultaneously running, with worsened 
latency, increased memory usage, etc.


However, if the application hands its entire output to the 
server, then the "task" is already *done* -- the server doesn't need 
the thread or child process for that app anymore, and can have it do 
something else while the I/O is happening.  The OS is in a better 
position to interleave its own I/O with the app's computation, and 
the overall request latency is reduced.


Is this a big emergency if your server's mostly idle?  Nope.  Is it a 
problem if you're writing a CGI program or some other direct API that 
doesn't automatically flush I/O?  Not at all.  I/O buffering works 
just fine for making sure that the tasks are handed off in bigger chunks.


But if you're coding up a WSGI framework, you don't really want to 
have it sending tiny chunks of data up a stack of middleware, because 
WSGI doesn't *have* any buffering, and each chunk is supposed to be 
sent *immediately*.


Well-written web frameworks usually do some degree of buffering 
already, for API and performance reasons, so for simplicity's sake, 
WSGI was spec'd assuming that applications would send data in 
already-buffered chunks.


(Specifically, the simplicity of not needing to have an explicit 
flushing API, which would otherwise have been necessary if middleware 
and servers were allowed to buffer the data, too.)
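
One way a framework might do that buffering on the application side is to coalesce small pieces before they enter the WSGI stack. A minimal sketch follows; the helper name and the 8192-byte default are arbitrary choices for illustration:

```python
def coalesce(chunks, min_size=8192):
    """Group small byte strings into chunks of at least min_size bytes."""
    buffer = []
    size = 0
    for chunk in chunks:
        if not chunk:
            continue  # empty strings carry no data; drop them here
        buffer.append(chunk)
        size += len(chunk)
        if size >= min_size:
            yield b"".join(buffer)
            buffer, size = [], 0
    if buffer:
        # Flush whatever is left once the source is exhausted.
        yield b"".join(buffer)
```

Wrapping a template's output in something like `coalesce(render(...))` keeps the number of trips through the middleware stack proportional to the response size rather than to the number of template fragments.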




Re: [Web-SIG] WSGI and start_response

2010-04-08 Thread P.J. Eby

At 10:18 PM 4/8/2010 +0200, Manlio Perillo wrote:

Suppose I have an HTML template file, and I want to use a sub request.

...
${subrequest('/header/')}
...

The problem with this code is that, since Mako will buffer all generated
content, the result response body will contain incorrect data.

It will first contain the response body generated by the sub request,
then the content generated from the Mako template (XXX I have not
checked this, but I think it is how it works).


Okay, I'm confused even more now.  It seems to me like what you've 
just described is something that's fundamentally broken, even if 
you're not using WSGI at all.




So, when executing a sub request, it is necessary to flush (that is,
send to Nginx, in my case) the content generated from the template
before the sub request is done.


This only seems to make sense if you're saying that the subrequest 
*has to* send its output directly to the client, rather than to the 
parent request.  If the subrequest sends its output to the parent 
request (as a sane implementation would), then there is no 
problem.  Likewise, if the subrequest is sent to a buffer that's then 
inserted into the parent invocation.
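
A sketch of the buffered variant, written against today's WSGI 1 calling convention. The helper and the two toy apps are hypothetical; the point is only that the subrequest's body comes back to the caller as a value instead of going to the client:

```python
def call_subrequest(app, environ):
    """Run a WSGI app and return (status, headers, body), body fully buffered."""
    captured = {}
    chunks = []

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return chunks.append  # legacy write() also lands in the buffer

    result = app(environ, start_response)
    try:
        for chunk in result:
            chunks.append(chunk)
    finally:
        close = getattr(result, "close", None)
        if close is not None:
            close()
    return captured["status"], captured["headers"], b"".join(chunks)


def header_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<h1>Header</h1>"]


def page_app(environ, start_response):
    # The subrequest's output is inserted into the parent's own output.
    _, _, header = call_subrequest(header_app, dict(environ))
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html>" + header + b"<p>Body</p></html>"]
```

With this arrangement the ordering problem from the Mako example cannot arise, because only the parent ever writes to the client.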


Anything else seems utterly insane to me, unless you're basically 
taking a bunch of legacy CGI code using 'print' statements and 
hacking it into something else.  (Which is still insane, just 
differently. ;-) )




Ah, you are right sorry.
But this is not required for the Mako example (I was focusing on that
example).


As far as I can tell, that example is horribly wrong.  ;-)



But when using the greenlet middleware, and when using the function for
flushing Mako buffer, some data will be yielded *before* the application
returns and status and headers are passed to Nginx.


And that's probably because sharing a single output channel between 
the parent and child requests is a bad idea.  ;-)


(Specifically, it's an increase in "temporal coupling", I believe.  I 
know it's some kind of coupling between functions that's considered 
bad, I just don't remember if that's the correct name for it.)




> This is also a good time for people to learn that generators are usually
> a *very bad* way to write WSGI apps

It's the only way to be able to suspend execution, when the WSGI
implementation is embedded in an async web server not written in Python.


It's true that dropping start_response() means you can't yield empty 
strings prior to determining your headers, yes.




> - yielding is for server push or
> sending blocks of large files, not tiny strings.

Again, consider the use of sub requests.
yielding a "not large" block is the only choice you have.


No, it isn't.  You can buffer your output and yield empty strings 
until you're ready to flush.





Unless, of course, you implement sub request support in pure Python (or
using SSI - Server Side Include).


I don't see why it has to be "pure", actually.  It's just that the 
subrequest needs to send data to the invoker rather than sending it 
straight to the client.


That's the bit that's crazy in your example -- it's not a scenario 
that WSGI 2 should support, and I'd consider the fact that WSGI 1 
lets you do it to be a bug, not a feature.  ;-)


That being said, I can see that removing start_response() closes a 
loophole that allows async apps to *potentially* exist under WSGI 1 
(as long as you were able to tolerate the resulting crappy API).


However, to fix that crappy API requires greenlets or threads, at 
which point you might as well just use WSGI 2.  In the Nginx case, 
you can either do WSGI 1 in C and then use an adapter to provide WSGI 
2, or you can expose your C API to Python and write a small 
greenlets-using Python wrapper to support suspending.  It would look 
something like:


def gateway(request_info, app):
    # set up environ
    run(greenlet(lambda: Finished(app(environ))))

def run(child):
    while not child.dead:
        data = child.switch()
        if isinstance(data, Finished):
            send_status(data.status)
            send_headers(data.headers)
            send_response(data.response)
        else:
            perform_appropriate_action_on(data)
            if data.suspend:
                # arrange for run(child) to be re-called later, then...
                return

Suspension now works by switching back to the parent greenlet with 
command objects (like Finished()) to tell the run() loop what to 
do.  The run() loop is not stateful, so when the task is unsuspended, 
you simply call run(child) again.


A similar structure would exist for send_response() - i.e., it's a 
loop over the response, can break out of the loop if it needs to 
suspend, and arranges for itself to be re-called at the appropriate time.
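
One way such a resumable send loop might look, as a sketch in plain Python 3 rather than against any real Nginx binding. The class name ResponseSender and the callables write, would_block and reschedule are all assumptions made for illustration; a real gateway would supply its own equivalents.

```python
class ResponseSender:
    """Loop over a response body that can stop and be re-called later."""

    def __init__(self, response, write, would_block, reschedule):
        self._chunks = iter(response)    # all loop state lives in this iterator
        self._write = write              # hypothetical: sends one chunk
        self._would_block = would_block  # hypothetical: True if output is not ready
        self._reschedule = reschedule    # hypothetical: re-calls a callback later
        self.done = False

    def run(self):
        while not self._would_block():
            try:
                chunk = next(self._chunks)
            except StopIteration:
                self.done = True
                return
            if chunk:
                self._write(chunk)
        # The output channel is busy: arrange to be re-called, then return.
        self._reschedule(self.run)
```

Calling run() again simply continues from the next unsent chunk, which mirrors how run(child) is re-called after a suspension.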


Voila - you now have asynchronous WSGI 2 support.

Now, whether you actually *want* to do that is a separate question, 
but as (I hope) you can see, you definitely *can* do it.

Re: [Web-SIG] WSGI and start_response

2010-04-08 Thread P.J. Eby

At 08:06 PM 4/8/2010 +0200, Manlio Perillo wrote:

What I'm trying to do is:

* as in the example I posted, turn the Mako render function into a generator.

  The reason is that I would like to implement support for Nginx
  subrequests.


By subrequest, do you mean that one request is invoking another, like 
one WSGI application calling multiple other WSGI applications to 
render one page containing contents from more than one?




  During a subrequest, the generated response body is sent directly to
  the client, so it is necessary to be able to flush the Mako buffer


I don't quite understand this, since I don't know what Mako is, or, 
if it's a template engine, what flushing its buffer would have to do 
with WSGI buffering.




> Under
> WSGI 1, you can do this by yielding empty strings before calling
> start_response.

No, in this case this is not what I need to do.


Well, if that's not when you're needing to suspend the application, 
then I don't see what you're losing in WSGI 2.




I need to call start_response, since the greenlet middleware will yield
data to the caller before the application returns.


I still don't understand you.  In WSGI 1, the only way to suspend 
execution (without using greenlets) prior to determining the headers 
is to yield empty strings.


I'm beginning to wonder if maybe what you're saying is that you want 
to be able to write an application function in the form of a 
generator?  If so, be aware that any WSGI 1 app written as:


    def app(environ, start_response):
        start_response(status, headers)
        yield "foo"
        yield "bar"

can be written as a WSGI 2 app thus:

    def app(environ):
        def respond():
            yield "foo"
            yield "bar"
        return status, headers, respond()

This is also a good time for people to learn that generators are 
usually a *very bad* way to write WSGI apps - yielding is for server 
push or sending blocks of large files, not tiny strings.  In general, 
if you're yielding more than one block, you're almost certainly doing 
WSGI wrong.  The typical HTML, XML, or JSON output that's 99% of a 
webapp's requests should be transmitted as a single string, rather 
than as a series of snippets.


IOW, the absence of generator support in WSGI 2 is a feature, not a bug.
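
For contrast with the generator style, a minimal sketch of the "single string" approach in WSGI 1 form, assuming Python 3. render_page is a hypothetical helper, not part of any framework mentioned in the thread.

```python
def render_page(title, items):
    # Hypothetical renderer: builds the whole document in memory.
    parts = ["<h1>%s</h1>" % title, "<ul>"]
    parts.extend("<li>%s</li>" % item for item in items)
    parts.append("</ul>")
    return "".join(parts).encode("utf-8")


def app(environ, start_response):
    body = render_page("Fruit", ["apple", "pear"])
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    # One block for the whole response, not a series of snippets.
    return [body]
```

Since the body is complete before start_response is called, Content-Length can be set and the server can transmit everything in one write.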



In my new attempt I plan to:

1) Implement the simple suspend/resume extension
2) Implement a Python extension module that wraps the Nginx events
   system.
3) Implement a pure Python WSGI middleware that, using greenlets, will
   enable normal applications to take advantage of Nginx async features.


I think maybe I'm understanding a little better now -- you want to 
implement the WSGI gateway entirely in C, without using any Python, 
and without using the greenlet API directly.


I think I've been unable to understand because I'm thinking in terms 
of a server implemented in Python, or at least that has the WSGI part 
implemented in Python.




Do you think it will be possible to implement all the requirements of WSGI
2 (including Python 3.x support) in a simple adapter on top of WSGI 1.0 ?


My practical experience with Python 3 is essentially nonexistent, but 
being able to implement WSGI 2 in terms of WSGI 1 is a *design 
requirement* for WSGI 2; it's likely that much early use and 
development of WSGI 2 will be done through such an adapter.




And what about applications that need to use the WSGI 1.0 API but
have to run under Python 3.x?


That's a tougher nut to crack; again, my practical experience with 
Python 3 is essentially nonexistent.




Re: [Web-SIG] WSGI and start_response

2010-04-08 Thread P.J. Eby

At 05:40 PM 4/8/2010 +0200, Manlio Perillo wrote:

With WSGI 2.0 we will end up with:

- WSGI 1.0, a full-featured protocol, but with middleware that is hard
  to implement
- WSGI 2.0, a simple protocol, with middleware that is easier to implement,
  but without support for some "advanced" applications


Let me see if I understand what you're saying.  You want to support 
suspending an application, without using greenlets or threads.  Under 
WSGI 1, you can do this by yielding empty strings before calling 
start_response.  Under WSGI 2, you can only do this by directly 
suspending execution, e.g. via greenlet or eventlets or some similar 
API provided by the server.  Is this your objection?


As far as I know, nobody has actually implemented an async app 
facility for WSGI 1, although it sounds like perhaps you're trying to 
design or implement such a thing now.  If so, then there's nothing 
stopping you from implementing a WSGI 1 server and providing a WSGI 2 
adapter, since as you point out, WSGI 2 is easier to implement on top 
of WSGI 1 than the other way around.


(Note, however, that if you simply use a greenlet or eventlet-based 
API for your async server, then the problem is neatly solved whether 
you are using WSGI 1 or 2, and the effective API is a lot cleaner 
than yielding empty strings.) 
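
A sketch of the adapter direction described above, running a WSGI 2 style application on a WSGI 1 server. It assumes the draft calling convention used elsewhere in this thread, where the application takes only environ and returns a (status, headers, body) tuple; wsgi2_to_wsgi1 and hello are hypothetical names.

```python
def wsgi2_to_wsgi1(app2):
    """Wrap a WSGI 2 style app so that a WSGI 1 server can call it."""
    def app1(environ, start_response):
        # Assumed draft convention: app2(environ) -> (status, headers, body).
        status, headers, body = app2(environ)
        start_response(status, headers)
        return body
    return app1


def hello(environ):
    # Hypothetical WSGI 2 style application.
    return "200 OK", [("Content-Type", "text/plain")], [b"hello"]
```

The adapter needs no state beyond one call frame, which is why this direction is the easy one.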




Re: [Web-SIG] WSGI and start_response

2010-04-08 Thread P.J. Eby

At 04:59 PM 4/8/2010 +0200, Manlio Perillo wrote:

Aaron Watters wrote:
> someone remind me: where is the canonical WSGI 2 spec?

http://wsgi.org/wsgi/WSGI_2.0

> I assume there is a way to "wrap" WSGI 1 applications
> without breaking them?  Or is this the regex-->re fiasco
> all over again?
>

start_response can be implemented by a function that will store the
status code and response headers.

There should be a sample WSGI 2.0 implementation for CGI, and a sample
WSGI 1.0 -> 2.0 adapter.

This adapter should be able to support the coroutine example,
http://paste.pocoo.org/show/199202/
but I would like to test it.

The write callable, as far as I know, cannot be implemented.


Implementing it requires greenlets or threads, but it's implementable.  See:

http://mail.python.org/pipermail/web-sig/2009-September/003986.html

(Btw, I've noticed that this early sketch of mine doesn't support the 
case where an application is a generator, because start_response 
won't have been called when the application returns.  This can be 
fixed, but it requires the addition of a wrapper class and a few 
other annoying details.  It also doesn't support exc_info properly, 
so it's still a ways from being a correct WSGI 1 server 
implementation.  Getting rid of all these little variations, though, 
is the goal of having a WSGI 2 - it's difficult to write *any* 
middleware to be completely WSGI 1 compliant.)
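
The following is not the sketch from the linked post; it is an independent, simplified illustration, assuming Python 3, of how the generator case might be handled by pulling output until start_response has necessarily been called. It deliberately leaves out write() (which needs greenlets or threads) and ignores exc_info. The names wsgi1_to_wsgi2 and _write_unsupported are hypothetical.

```python
def _write_unsupported(data):
    # write() support needs greenlets or threads and is not sketched here.
    raise NotImplementedError("write() is not supported by this sketch")


def wsgi1_to_wsgi2(app1):
    """Wrap a WSGI 1 app as a function returning (status, headers, body)."""
    def app2(environ):
        captured = {}

        def start_response(status, headers, exc_info=None):
            # exc_info is accepted but ignored in this sketch.
            captured["status"] = status
            captured["headers"] = headers
            return _write_unsupported

        result = app1(environ, start_response)
        iterator = iter(result)
        pending = []
        # A generator app only calls start_response once iteration begins,
        # so pull chunks until the first non-empty one (or exhaustion).
        for chunk in iterator:
            if chunk:
                pending.append(chunk)
                break
        if "status" not in captured:
            raise RuntimeError("application never called start_response")

        def body():
            try:
                yield from pending
                yield from iterator
            finally:
                # Runs only if the caller starts iterating the body.
                close = getattr(result, "close", None)
                if close is not None:
                    close()

        return captured["status"], captured["headers"], body()
    return app2
```

Pulling the first chunk eagerly means the wrapper blocks until the application has produced some output, which is one of the "annoying details" a complete adapter would need to weigh.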




Re: [Web-SIG] WSGI and start_response

2010-04-08 Thread P.J. Eby

At 04:08 PM 4/8/2010 +0200, Manlio Perillo wrote:

Hi.

Some time ago I objected to the decision to remove the start_response
function from the next version of WSGI, on the grounds that without
start_response, asynchronous extensions are impossible to support.

Now I have found that removing start_response will also make it impossible
to support coroutines (or, at least, some coroutine usage).

Here is an example (this is the same example I posted a few days ago):
http://paste.pocoo.org/show/199202/

Forgetting about the write callable, the problem is that the application
starts to yield data when tmpl.render_unicode function is called.

Please note that this has *nothing* to do with asynchronous applications.
The code should work with *all* WSGI implementations.


In the pasted example, the Mako render_unicode function is "turned" into
a generator, with a simple function that allows flushing the current buffer.


Can someone else confirm that this code is impossible to support in WSGI
2.0?


I don't understand why it's a problem.  See my previous post here:

http://mail.python.org/pipermail/web-sig/2009-September/003986.html

for a sketch of a WSGI 1-to-2 converter.  It takes a WSGI 1 
application callable as the input, and returns a WSGI 2 function.




Re: [Web-SIG] wsgi.errors and close method

2010-03-27 Thread P.J. Eby

At 08:10 PM 3/27/2010 +0100, Manlio Perillo wrote:

Some time ago, someone reported to me that an application embedded in Nginx
with my WSGI module failed to execute, since in my implementation the
wsgi.errors object does not implement the .close method.


We should probably note in the spec that WSGI applications have no 
business closing the errors object; ISTM it is a completely meaningless action.



