Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-07 Thread Malthe Borch

On 12/4/09 12:50 AM, And Clover wrote:

> So for now there is basically nothing useful WSGI can do other than
> provide direct, byte-oriented (even if wrapped in 8859-1 unicode
> strings) access to headers.


You could argue that this is perhaps a good reason to replace 
``environ`` with something that interprets the headers according to how 
HTTP is actually used in the real world.


It may be that WSGI should use bytes everywhere and the recommended 
usage would be via a decorator (which could cache computations on the 
environ dictionary):


e.g. the raw application handler versus one decorated with an imaginary 
``webob`` function.


  def app(environ, start_response):
      ...

  @webob
  def app(request):
      ...

It is often said that WSGI should be practical, but in actual usage, I 
think most developers use a request/response abstraction layer.


Middlewares are usually shrink-wrapped library code that could handle a 
bytes-based environ dict (they'd have to explicitly decode the headers 
of interest).
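
A minimal sketch of what such a decorator could look like, assuming a bytes-valued environ. The `Request` class, its `header()` method and the `(status, headers, body)` return convention are hypothetical names invented for illustration; they are not the real WebOb API.

```python
from functools import wraps


class Request:
    """Hypothetical request wrapper (not the real WebOb API)."""

    def __init__(self, environ):
        self.environ = environ
        self._cache = {}

    def header(self, name, encoding="latin-1"):
        """Decode one header on demand and cache the decoded value."""
        key = (name, encoding)
        if key not in self._cache:
            raw = self.environ["HTTP_" + name.upper().replace("-", "_")]
            self._cache[key] = raw.decode(encoding)
        return self._cache[key]


def webob(handler):
    """Adapt handler(request) -> (status, headers, body) to a WSGI callable."""

    @wraps(handler)
    def wsgi_app(environ, start_response):
        status, headers, body = handler(Request(environ))
        start_response(status, headers)
        return [body]

    return wsgi_app


@webob
def app(request):
    # Only the header of interest is decoded; everything else stays bytes.
    language = request.header("Accept-Language")
    return "200 OK", [("Content-Type", "text/plain")], language.encode("latin-1")
```

The raw handler keeps the `(environ, start_response)` signature, while the decorated one only ever sees the wrapper, so any decoding policy lives in one place.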


\malthe

___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-04 Thread Manlio Perillo
Henry Precheur ha scritto:
 On Thu, Dec 03, 2009 at 09:15:06PM +0100, Manlio Perillo wrote:
 There is something that I don't understand.

 Some HTTP headers, like Accept-Language, contain data described as
 `token`, where:

 token  = 1*any CHAR except CTLs or separators

 So a token, IMHO, is an opaque string, and it SHOULD not be decoded.
 In Python 3.x it SHOULD be a byte string.
 
 I think this is more an issue that frameworks should deal with. By
 decoding every header's value as latin-1:
 
 * It keeps WSGI simple. Simple is good.
 

It is just as simple as using byte strings, IMHO.
It is not simpler; it is more convenient because of (if I understand
correctly) how code is converted by 2to3.

 * WSGI sticks to what RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1)
   says. WSGI is about HTTP, but that doesn't necessarily include all
   other standards extending HTTP.
 

HTTP never says to consider whole headers as latin-1 text, IMHO.

 * It's possible to convert latin-1 strings to bytes without losing data.
 

Yes, but it is quite stupid to first convert to Unicode and then convert
again to byte string.

It is true, however, that this does not happen often; only for:

- WSGI applications that implement an HTTP proxy
- WSGI applications that need to support HTTP Digest Authentication
- WSGI applications that store encoded data in cookies
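
For reference, the lossless round trip under discussion can be checked over every byte value. This is only an illustrative sketch; `header_bytes` is a made-up helper name, not part of any WSGI API.

```python
def header_bytes(value: str) -> bytes:
    """Recover the bytes received on the wire from a latin-1 decoded header."""
    return value.encode("latin-1")


# Every possible byte value survives decode + encode unchanged.
every_byte = bytes(range(256))
as_text = every_byte.decode("latin-1")
round_tripped = header_bytes(as_text)
```

The cost is the extra decode and encode, which only matters for the cases listed above (proxying, Digest Authentication, encoded cookie data).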


Regards  Manlio


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Manlio Perillo
James Y Knight ha scritto:
 I move to bless mod_wsgi's definition of WSGI 1.1 [1]
 [...]
 
 [1] http://code.google.com/p/modwsgi/wiki/SupportForPython3X

Hi.

Just a few questions.

It is true that HTTP headers can be decoded assuming latin-1, or
decoded using PEP 383 (surrogateescape).

However what about URIs (that is, PATH_INFO and the like)?
For URIs (if I remember correctly) the suggested encoding is UTF-8, so
URLs should be decoded using

  url.decode('utf-8', 'surrogateescape')

Is this correct?


Now another question.
Let's consider the `wsgiref.util.application_uri` function

def application_uri(environ):
    url = environ['wsgi.url_scheme']+'://'
    from urllib.parse import quote

    if environ.get('HTTP_HOST'):
        url += environ['HTTP_HOST']
    else:
        url += environ['SERVER_NAME']

        if environ['wsgi.url_scheme'] == 'https':
            if environ['SERVER_PORT'] != '443':
                url += ':' + environ['SERVER_PORT']
        else:
            if environ['SERVER_PORT'] != '80':
                url += ':' + environ['SERVER_PORT']

    url += quote(environ.get('SCRIPT_NAME') or '/')
    return url


There is a potential problem, here, with the quote function.
This function does the following:

def quote(string, safe='/', encoding=None, errors=None):
    if isinstance(string, str):
        if encoding is None:
            encoding = 'utf-8'
        if errors is None:
            errors = 'strict'
        string = string.encode(encoding, errors)

This means that if we use surrogateescape, the information about the
original bytes is lost here.

This can be easily fixed by changing the application_uri function, but
this also means that a WSGI application will not work with Python 3.1.x.
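
A small sketch of the problem, assuming a path that arrives as latin-1 bytes and is decoded with UTF-8 plus surrogateescape. As far as I can tell, on current CPython the strict default makes `quote()` raise rather than silently drop the bytes, and passing `errors='surrogateescape'` explicitly restores them.

```python
from urllib.parse import quote

raw_path = b"/caf\xe9"  # latin-1 bytes, not valid UTF-8
path = raw_path.decode("utf-8", "surrogateescape")

# With the default errors='strict' the lone surrogate cannot be encoded.
try:
    quote(path)
    strict_outcome = "quoted"
except UnicodeEncodeError:
    strict_outcome = "UnicodeEncodeError"

# Naming the error handler explicitly recovers the original byte.
escaped = quote(path, errors="surrogateescape")
```

So the fix in `application_uri` would amount to passing the same error handler to `quote()` that was used when decoding.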


Finally, a question about cookies.
Cookie data SHOULD be transparent to the server/gateway; however WSGI is
going to assume that data is encoded in latin-1.

I don't know what the HTTP/Cookie spec says about this.
However, from a WSGI application point of view, the cookie data can, as
an example, contain some text encoded in UTF-8; this means that the
application must first encode the data:

  cookie_bytes = cookie.encode('latin-1', 'surrogateescape')

and then decode it using UTF-8:

  my_cookie_data = cookie_bytes.decode('utf-8')


This is a bit unreasonable, but I don't know if this is a common
practice (I do this, just to make an example).
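
The two steps can be sketched as a single helper. `cookie_text` is a hypothetical name, and the sketch assumes the server decoded the header as latin-1; in that case the error handler on the encode step is not strictly needed, because latin-1 decoding never produces surrogates.

```python
def cookie_text(environ_value: str, encoding: str = "utf-8") -> str:
    """Undo the server's latin-1 decoding, then decode as the application intends."""
    return environ_value.encode("latin-1").decode(encoding)


# What the client sent, and what the application would find in environ.
sent_by_browser = "name=Zo\u00eb".encode("utf-8")
environ_value = sent_by_browser.decode("latin-1")
```

Calling `cookie_text(environ_value)` gives back the original text, while `environ_value` itself shows the usual two-character mojibake for the non-ASCII letter.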



Manlio Perillo


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread And Clover

Manlio Perillo wrote:


> However what about URI (that is, for PATH_INFO and the like)?
> For URI (if I remember correctly) the suggested encoding is UTF-8, so
> URLS should be decoded using
>
>   url.decode('utf-8', 'surrogateescape')
>
> Is this correct?


The currently-discussed proposal is ISO-8859-1, allowing the real bytes 
to be trivially extracted. This is consistent with the other headers and 
would be my preferred approach.


Python 3.1's wsgiref.simple_server, on the other hand, blindly uses 
urllib.parse.unquote, which defaults to UTF-8 without surrogateescape,
mangling any non-UTF-8 input.
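
To illustrate the mangling, here is how `urllib.parse.unquote` behaves on current Python versions for a path that is not valid UTF-8 (I have not re-checked the exact 3.1 behaviour, so treat this as a sketch of the general problem):

```python
from urllib.parse import unquote, unquote_to_bytes

quoted = "/caf%E9"  # percent-encoded latin-1, not valid UTF-8

as_utf8 = unquote(quoted)                        # defaults: utf-8, errors='replace'
as_latin1 = unquote(quoted, encoding="latin-1")  # keeps the byte recoverable
as_bytes = unquote_to_bytes(quoted)              # no decoding at all
```

Only the latin-1 and bytes variants let the application get back to the byte that was actually sent; the default replaces it with U+FFFD.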


I don't really care whether UTF-8+surrogateescape or ISO-8859-1 encoding 
is blessed. But *something* needs to be blessed. An encoding, an 
alternative undecoded path_info, both, something else... just *something*.



> Let's consider the `wsgiref.util.application_uri` function
> There is a potential problem, here, with the quote function.


Yes. wsgiref is broken in Python 3.1. Not quite as broken as it was in 
3.0, but still broken. Until we can come to a Pronouncement on what WSGI 
*is* in Python 3, it is meaningless anyway.



> Cookie data SHOULD be transparent to the server/gateway; however WSGI is
> going to assume that data is encoded in latin-1.


Yeah. This is no big deal because non-ASCII characters in cookies are 
already broken everywhere(*). Given this and other limitations on what 
characters can go in cookies, they are habitually encoded using ad-hoc 
mechanisms handled by the application (typically a round of URL-encoding).


*: in particular:

- Opera and Chrome send non-ASCII cookie characters in UTF-8.
- IE encodes using the system codepage (which can never be UTF-8),
  mangling any characters that don't fit in the codepage through the
  traditional Windows 'similar replacement character' scheme.
- Mozilla uses the low byte of each UTF-16 code point (so ISO-8859-1
  gets through but everything else is mangled)
- Safari refuses to send any cookie containing non-ASCII characters.
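
As an illustration of the "round of URL-encoding" approach, an application can keep cookie values pure ASCII so that none of the browser differences above apply. The helper names are made up for this sketch.

```python
from urllib.parse import quote, unquote


def encode_cookie_value(text: str) -> str:
    """Percent-encode as UTF-8 so only ASCII ever reaches the browser."""
    return quote(text, safe="")


def decode_cookie_value(value: str) -> str:
    """Reverse encode_cookie_value; fail loudly on invalid UTF-8."""
    return unquote(value, encoding="utf-8", errors="strict")


encoded = encode_cookie_value("na\u00efve caf\u00e9")
```

Because the encoded value is plain ASCII, it reads the same whether the server decodes the Cookie header as latin-1 or as UTF-8.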


> I don't know what the HTTP/Cookie spec says about this.


The traditional interpretation of RFC2616 is that headers are ISO-8859-1.

You will notice that no browser correctly follows this.

...sigh.

--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com/




Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Manlio Perillo
And Clover ha scritto:
 [...]
 Cookie data SHOULD be transparent to the server/gateway; however WSGI is
 going to assume that data is encoded in latin-1.
 
 Yeah. This is no big deal because non-ASCII characters in cookies are
 already broken everywhere(*). Given this and other limitations on what
 characters can go in cookies, they are habitually encoded using ad-hoc
 mechanisms handled by the application (typically a round of URL-encoding).
 
 *: in particular:
 
 - Opera and Chrome send non-ASCII cookie characters in UTF-8.
 - IE encodes using the system codepage (which can never be UTF-8),
   mangling any characters that don't fit in the codepage through the
   traditional Windows 'similar replacement character' scheme.
 - Mozilla uses the low byte of each UTF-16 code point (so ISO-8859-1
   gets through but everything else is mangled)
 - Safari refuses to send any cookie containing non-ASCII characters.
 

Thanks for this summary.
I think it should go in a wiki or in a separate document (like
rationale) to the WSGI spec.

However this should never happen with cookies, since cookie data is
opaque to the browser, which MUST send it as is.

What you describe happens with other headers containing TEXT.
And now I understand the strange behaviour of Firefox with non-latin-1
strings in the username in HTTP Basic Authentication.

 [...]

Regards   Manlio


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread James Y Knight
On Dec 3, 2009, at 1:35 PM, And Clover wrote:
 Manlio Perillo wrote:
 
 However what about URI (that is, for PATH_INFO and the like)?
 For URI (if I remember correctly) the suggested encoding is UTF-8, so
 URLS should be decoded using
 
  url.decode('utf-8', 'surrogateescape')
 
 Is this correct?
 
 The currently-discussed proposal is ISO-8859-1, allowing the real bytes to be 
 trivially extracted. This is consistent with the other headers and would be 
 my preferred approach.

Right, for WSGI 1.1 on Python 3.x, 8859-1 strings is the plan. Other, more 
ideologically pure options can be discussed for an incompatible revision of 
WSGI (e.g. the hypothetical 2.0).

BTW: I hope to have a first draft of the changes by Monday. (But don't beat up 
on me if it's delayed; I am working on it.)

James


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Henry Precheur
On Thu, Dec 03, 2009 at 07:35:14PM +0100, And Clover wrote:
 I don't know what the HTTP/Cookie spec says about this.
 
 The traditional interpretation of RFC2616 is that headers are ISO-8859-1.
 
 You will notice that no browser correctly follows this.

RFC 2109 and RFC 2965 say that a cookie's value can be anything:

 The VALUE is opaque to the user agent and may be anything the origin
 server chooses to send, possibly in a server-selected printable ASCII
 encoding.

Theoretically you could put something like: 'foo\n\0bar' in a cookie.

Also a cookie can include comments which have to be encoded using ...
UTF-8:

 Comment=value
   OPTIONAL.  Because cookies can be used to derive or store
   private information about a user, the value of the Comment
   attribute allows an origin server to document how it intends to
   use the cookie.  The user can inspect the information to decide
   whether to initiate or continue a session with this cookie.
   Characters in value MUST be in UTF-8 encoding.

-- 
  Henry Prêcheur


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Manlio Perillo
And Clover ha scritto:
 Manlio Perillo wrote:
 
 However what about URI (that is, for PATH_INFO and the like)?
 For URI (if I remember correctly) the suggested encoding is UTF-8, so
 URLS should be decoded using
 
   url.decode('utf-8', 'surrogateescape')
 
 Is this correct?
 
 The currently-discussed proposal is ISO-8859-1, allowing the real bytes
 to be trivially extracted. This is consistent with the other headers and
 would be my preferred approach.
 

There is something that I don't understand.

Some HTTP headers, like Accept-Language, contain data described as
`token`, where:

token  = 1*any CHAR except CTLs or separators

So a token, IMHO, is an opaque string, and it SHOULD not be decoded.
In Python 3.x it SHOULD be a byte string.

Text content is described as `TEXT`, where:

The TEXT rule is only used for descriptive field contents and values
that are not intended to be interpreted by the message parser. Words
of *TEXT MAY contain characters from character sets other than ISO-
8859-1 [22] only when encoded according to the rules of RFC 2047
[14].

TEXT   = any OCTET except CTLs,
 but including LWS


The only type of data where TEXT can be used is `quoted-string`.

A `quoted-string` only appears in well-specified portions of a header.
So, IMHO, it is *not* correct for a WSGI middleware to return all HTTP
headers as Unicode strings.

This is up to the application/framework, which must parse each header,
split it into components and handle each as appropriate (as a byte
string, Unicode string or instance of some other data type).
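
As a rough illustration of that per-header parsing, here is a simplified Accept-Language parser. It is a sketch only: it ignores most RFC 2616 edge cases, and `parse_accept_language` is a hypothetical helper, not something from wsgiref or any framework.

```python
def parse_accept_language(value: str):
    """Split an Accept-Language value into (tag, q) pairs, best first."""
    result = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # default when no q parameter is given
        for param in params.split(";"):
            name, _, number = param.strip().partition("=")
            if name == "q" and number:
                try:
                    quality = float(number)
                except ValueError:
                    quality = 0.0  # treat a malformed q as "not acceptable"
        result.append((tag.strip(), quality))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(result, key=lambda pair: -pair[1])
```

The language tags come out as opaque tokens; whether they are kept as bytes or text is exactly the policy decision being argued about here.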


 [...]


Regards   Manlio


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Henry Precheur
On Thu, Dec 03, 2009 at 09:15:06PM +0100, Manlio Perillo wrote:
 There is something that I don't understand.
 
 Some HTTP headers, like Accept-Language, contain data described as
 `token`, where:
 
 token  = 1*any CHAR except CTLs or separators
 
 So a token, IMHO, is an opaque string, and it SHOULD not be decoded.
 In Python 3.x it SHOULD be a byte string.

I think this is more an issue that frameworks should deal with. By
decoding every header's value as latin-1:

* It keeps WSGI simple. Simple is good.

* WSGI sticks to what RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1)
  says. WSGI is about HTTP, but that doesn't necessarily include all
  other standards extending HTTP.

* It's possible to convert latin-1 strings to bytes without losing data.

-- 
  Henry Prêcheur


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread And Clover

Manlio Perillo wrote:


> Words of *TEXT MAY contain characters from character sets other than
> ISO-8859-1 [22] only when encoded according to the rules of RFC 2047


Yeah, this is, unfortunately, a lie. The rules of RFC 2047 apply only to 
RFC*822-family 'atoms' and not elsewhere; indeed, RFC2047 itself 
specifically denies that an encoded-word can go in a quoted-string.


RFC2047 encoded-words are not on-topic in an HTTP header(*); this has 
been confirmed by newer development work on HTTPbis by Reschke et al. 
(http://tools.ietf.org/wg/httpbis/).


The correct way of escaping header parameters in an RFC*822-family 
protocol would be RFC2231's complex encoding scheme, but HTTP is 
explicitly not an 822-family protocol despite sharing many of the same 
constructs. See 
http://tools.ietf.org/html/draft-reschke-rfc2231-in-http-06 for a 
strategy for how 2231 should interact with HTTP, but note that for now 
RFC2231-in-HTTP simply does not exist in any deployed tools.


So for now there is basically nothing useful WSGI can do other than 
provide direct, byte-oriented (even if wrapped in 8859-1 unicode 
strings) access to headers.


--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com/



Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-11-30 Thread And Clover

Graham Dumpleton wrote:


> Answering my own question, it is actually obvious that it has to be
> called (1, 0). This is because wsgiref in Python 3.X already calls it
> (1, 0) and we don't have much choice but to be in agreement with that.


wsgiref.simple_server in Python 3 to date is not something that anyone 
should worry about being compatible with. It is a 2to3 hack that cannot 
meaningfully claim to represent wsgi version anything.


Careless use of urllib.parse.unquote causes 3.0's simple_server not to 
work at all, and 3.1's to mangle the path by treating it as UTF-8 
instead of ISO-8859-1, as 'WSGI 1.1' proposed and mod_wsgi (and even 
mod_cgi via wsgiref.CGIHandler) delivered.


Yes, I'm always going on about Unicode paths. I'm fed up of shipping
apps with a page-long deployment note about fixing them. It pains me
that in so many years both this and "What do we do about Python 3?"
still haven't been addressed.


mod_wsgi 3.0 already has more traction than wsgiref 3.1 and I would 
prefer not to see more farcical reverse-progress at this point.


For what it's worth, here are my responses on the issues in this
thread. But at this point I really just want a BDFL to just come and do
it, whatever it is. A new WSGI, whatever the version number, is
massively overdue.


> 1. The 'readline()' function of 'wsgi.input' may optionally take a
> size hint.


Yes. Obviously. Bad practice but unavoidable now. Should have been a 1.0 
amendment a long time ago.


> 2. The 'wsgi.input' must provide an empty string as end of input
> stream marker.
> 3. The size argument to 'read()' function of 'wsgi.input' would be
> optional and if not supplied the function would return all available
> request content.
> 4. The 'wsgi.file_wrapper' supplied by the WSGI adapter must honour
> the Content-Length response header and must only return from the file
> that amount of content.


+0. Seems reasonable but don't massively care. Presumably an application 
must refuse to run on 1.0 if it requires these behaviours?


> 5. Any WSGI application or middleware should not return more data
> than specified by the Content-Length response header if defined.
> 6. The WSGI adapter must not pass on to the server any data above
> what the Content-Length response header defines if supplied.


Yes.
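
Points 4 to 6 boil down to truncating the response body at the declared length. A minimal sketch of how an adapter might do that follows; `limit_to_content_length` is a hypothetical helper, not mod_wsgi's actual implementation.

```python
def limit_to_content_length(chunks, content_length):
    """Yield at most content_length bytes from an iterable of byte chunks."""
    remaining = content_length
    for chunk in chunks:
        if remaining <= 0:
            break  # everything past the declared length is dropped
        piece = chunk[:remaining]
        remaining -= len(piece)
        yield piece
```

For example, wrapping `[b"hello ", b"world", b"!"]` with a declared length of 8 passes on `b"hello "` and `b"wo"` and nothing more.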

--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com/


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-11-28 Thread Graham Dumpleton
After reading my prior blog posts where I explained my reasoning
behind the changes, I will acknowledge that I haven't explained some
stuff very well and people are failing to understand, or getting the
wrong idea about, why something is being suggested.

I still believe, though, that there are underlying problems in the WSGI
specification and that right now various stuff is working more by luck
than by design. In some cases, such as readline(), the majority of WSGI
applications/frameworks are in violation of the WSGI 1.0 specification
due to their reliance on cgi.FieldStorage, which makes calls to
readline() with an argument.

Either way, since there seemed to be objections at some level on every
point, and since I really really have no enthusiasm for this stuff any
more or of fighting for any change, I retract my personal interest in
having any of the amendments as part of a WSGI 1.1 specification and
will remove all that detail from mod_wsgi documentation. I will
instead replace it with a separate page describing mod_wsgi compliance
with WSGI 1.0 specification and highlighting those specific features
which are in common, or not so common use, via mod_wsgi and which
actually mean that people are writing applications incompatible with
the WSGI 1.0 specification.

To ensure compliance I could well raise Python exceptions for any use
which isn't WSGI 1.0 compliant, but I have already learnt, from when I
tried to get people to write portable WSGI applications by giving errors
on certain use of stdin and stdout, that it is a pointless battle. All
it got was a long list of users who believe mod_wsgi is broken even
though if they read the actual documentation they would find it was
their own software which was suspect or at least wasn't portable to
all WSGI hosting mechanisms. This would only get worse if exceptions
were raised for use of readline() with an argument and use of read()
with no argument or argument of -1. Short story is that there are a
fair few people who are just lazy; they will always write stuff the
way they want to and not how it should be written. They will always
blame other people's code for being wrong before acknowledging they
themselves are wrong.

The only answer I therefore need out of WEB-SIG is whether the
qualifications about how Python 3.X is to be supported are going to be
an amendment to WSGI 1.0 or a separate WSGI 1.1 update and, if the
latter, whether the WSGI 1.1 tag will also have meaning for
Python 2.X.

I need an answer to this so I know whether to withdraw mod_wsgi 3.0
from download and replace it with a mod_wsgi 4.0 which changes the
wsgi.version tuple being passed, for both Python 2.X and Python 3.X,
from (1, 1) back to original (1, 0), given that some opinion seems to
be that any interface changes can only really be performed as part of
WSGI 2.0 and so I would be wrong in using (1, 1).

If I don't see an answer, then I guess I will just have to revert it
back to (1, 0) to be safe and to avoid any accusations that I am
hijacking the process.

An answer sooner rather than later would be appreciated on the
wsgi.version issue.

Graham

2009/11/28 Graham Dumpleton graham.dumple...@gmail.com:
 Please ensure you have also all read:

 http://blog.dscpl.com.au/2009/10/details-on-wsgi-10-amendmentsclarificat.html

 I will post again later in detail when have some time to explain a few
 more points not mentioned in that post and where people aren't quite
 understanding the reasoning for doing things.

 One very quick comment about read().

 Allowing read() with no argument is no different to a user saying
 read(environ['CONTENT_LENGTH']). Because a WSGI adapter/middleware is
 going to have to track bytes read to ensure can return an empty string
 as end sentinel, it will know length remaining and would internally
 for read() with no argument do read(remaining_bytes). As such no real
 differences in inefficiencies as far as memory use goes for
 implementing read() because of need to implement end sentinel.

 Also, you have concerns about read() with no argument, but frankly
 readline() with no argument, which is already required, is much worse
 because you can't really track bytes read and just read to end of
 input. This is because they only want to read to end of line and so
 reading all input is going to blow out memory use unreasonably as you
 speculate for read(). As such, a readline() implementation is likely
 to read in blocks and internally buffer where read() doesn't
 necessarily have to.
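
A rough sketch of the byte-tracking wrapper described above. `LimitedInput` is a hypothetical name and this is not mod_wsgi's code; it only shows how tracking the remaining bytes gives both `read()` with no argument and the empty-string end sentinel.

```python
import io


class LimitedInput:
    """Wrap a request stream so reads never go past CONTENT_LENGTH."""

    def __init__(self, stream, content_length):
        self._stream = stream
        self._remaining = content_length

    def _clamp(self, size):
        # No size, a negative size or an oversized request all mean "what is left".
        if size is None or size < 0 or size > self._remaining:
            return self._remaining
        return size

    def read(self, size=-1):
        size = self._clamp(size)
        data = self._stream.read(size) if size else b""
        self._remaining -= len(data)
        return data

    def readline(self, size=-1):
        size = self._clamp(size)
        data = self._stream.readline(size) if size else b""
        self._remaining -= len(data)
        return data
```

Once the declared length is used up, both methods return an empty byte string without touching the underlying stream, so a pipelined next request is never consumed.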

 It may also be pertinent to read:

 http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html

 as from memory it talks about issues with not paying attention to
 Content-Length on output filtering middleware as well.

 As I said, will reply later when have some time to focus. Right now I
 have a 2 year old to keep amused.

 Graham

 2009/11/27 James Y Knight f...@fuhm.net:
 I move to bless mod_wsgi's definition of WSGI 1.1 [1] as the official 
 definition of WSGI 1.1, which describes 

Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-11-28 Thread James Y Knight
On Nov 28, 2009, at 10:44 PM, Graham Dumpleton wrote:
 Either way, since there seemed to be objections at some level on every
 point, and since I really really have no enthusiasm for this stuff any
 more or of fighting for any change, I retract my personal interest in
 having any of the amendments as part of a WSGI 1.1 specification and
 will remove all that detail from mod_wsgi documentation


[...]

 If don't see an answer, then guess I will just have to revert it back
 to (1, 0) to be safe and to avoid any accusations that am highjacking
 the process.
 
 An answer sooner rather than later would be appreciated on the
 wsgi.version issue.

I'd rather appreciate it if you held off on making such changes until
this discussion either peters out or is resolved. You sound somewhat negative,
but it seems to me that we're actually quite close to a consensus on
adopting most of your proposal. Changing the proposal out from under us doesn't
really help things.

The next step here is clearly for someone to redraft the changes as a diff 
against PEP 333. If you do not have any interest in being that person, please 
make that clear, so someone else can step up to do so.

James


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-11-28 Thread Graham Dumpleton
2009/11/29 James Y Knight f...@fuhm.net:
 On Nov 28, 2009, at 10:44 PM, Graham Dumpleton wrote:
 Either way, since there seemed to be objections at some level on every
 point, and since I really really have no enthusiasm for this stuff any
 more or of fighting for any change, I retract my personal interest in
 having any of the amendments as part of a WSGI 1.1 specification and
 will remove all that detail from mod_wsgi documentation


 [...]

 If don't see an answer, then guess I will just have to revert it back
 to (1, 0) to be safe and to avoid any accusations that am highjacking
 the process.

 An answer sooner rather than later would be appreciated on the
 wsgi.version issue.

 I'd rather appreciate it if you held off on making such changes until
 this discussion either peters out or is resolved. You sound somewhat
 negative, but it seems to me that we're actually quite close to a
 consensus on adopting most of your proposal. Changing the proposal out from
 under us doesn't really help things.

 The next step here is clearly for someone to redraft the changes as a diff 
 against PEP 333. If you do not have any interest in being that person, please 
 make that clear, so someone else can step up to do so.

No, I do not want a part in drafting any changes; I just want to move
on from all this stuff and start working on other projects. Since,
though, some don't seem to understand the reasons for the changes,
you will find it hard to find someone who is in a position to be able to
do them.

You probably really are just better off worrying about Python 3.X
support and accept that tinkering at edges of WSGI 1.0 on other issues
is not going to solve all the WSGI issues. As PJE suggests, leave that
to an interface incompatible update so that you don't have this whole
problem of what version existing components support.

Graham


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-11-27 Thread Aaron Watters
I second the move, recorded here:

  http://listtree.appspot.com/wsgi2/ICvaujouPxb2gfEhDS_aiw

-- Aaron Watters

--- On Thu, 11/26/09, James Y Knight f...@fuhm.net wrote:

 From: James Y Knight f...@fuhm.net
 Subject: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec
 To: Web SIG web-sig@python.org
 Date: Thursday, November 26, 2009, 8:42 PM
 I move to bless mod_wsgi's definition
 of WSGI 1.1 [1] as the official definition of WSGI 1.1,
 which describes how to implement WSGI adapters for both
 Python 2.x and 3.x. It may not be perfect, but, it's been
 implemented twice, and seems to have no fatal flaws (it
 doesn't do any lossy transforms, so any issues are
 irritations at worst). The basis for this definition is also
 described in the WSGI 1.0 Amendments [2] page.
 
 The definitions as they stand are clear enough to
 understand and implement, but not currently in spec-worthy
 language. (e.g. it says "should" and "may" in a colloquial
 fashion, but actually means MUST in some places and SHOULD
 in others, as defined by RFC 2119)
 
 Thus, I'd like to suggest that Graham (if he's willing?)
 should reformat the Definition/Amendments as an actual
 diff against the current PEP 333. Then, I will recommend
 adopting that document as an actual standard WSGI 1.1, to
 replace PEP 333. 
 
 This discussion has gone on long enough, and it doesn't
 really matter as much to have the perfect API, as it does to
 have a standard.
 
 James
 
 [1] http://code.google.com/p/modwsgi/wiki/SupportForPython3X
 [2] http://www.wsgi.org/wsgi/Amendments_1.0
 


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-11-27 Thread P.J. Eby

At 08:42 PM 11/26/2009 -0500, James Y Knight wrote:
> I move to bless mod_wsgi's definition of WSGI 1.1 [1] as the
> official definition of WSGI 1.1, which describes how to implement
> WSGI adapters for both Python 2.x and 3.x. It may not be perfect,
> but, it's been implemented twice, and seems to have no fatal flaws
> (it doesn't do any lossy transforms, so any issues are irritations
> at worst). The basis for this definition is also described in the
> WSGI 1.0 Amendments [2] page.
>
> The definitions as they stand are clear enough to understand and
> implement, but not currently in spec-worthy language. (e.g. it says
> "should" and "may" in a colloquial fashion, but actually means MUST
> in some places and SHOULD in others, as defined by RFC 2119)
>
> Thus, I'd like to suggest that Graham (if he's willing?) should
> reformat the Definition/Amendments as an actual diff against
> the current PEP 333. Then, I will recommend adopting that document
> as an actual standard WSGI 1.1, to replace PEP 333.


I'm +1, with a few caveats.  First, as you mention, it needs to be 
spec'd properly.  In particular, it should be clarified that the main 
changes are to *allow byte strings* in certain places where WSGI 1.0 
demands a unicode string w/latin-1 encoding.


Second, I do not think that the additional guarantees/requirements 
can be safely added to a 1.x version, as they make it impossible for 
an app to tell whether it's *really* running under 1.1 or under a 
broken piece of middleware that's passing through wsgi.version but 
not actually providing 1.1-level guarantees.  I would therefore 
suggest that these additional guarantees and requirements be deferred 
to WSGI 2.0.




Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-11-27 Thread P.J. Eby

At 12:34 PM 11/27/2009 -0500, James Y Knight wrote:

On Nov 27, 2009, at 10:20 AM, P.J. Eby wrote:
 Second, I do not think that the additional 
guarantees/requirements can be safely added to a 1.x version, as 
they make it impossible for an app to tell whether it's *really* 
running under 1.1 or under a broken piece of middleware that's 
passing through wsgi.version but not actually providing 1.1-level 
guarantees.  I would therefore suggest that these additional 
guarantees and requirements be deferred to WSGI 2.0.


Okay, let's look at these additional requirements in more detail. I 
see 4 that should be kept, 1 that can be dispensed with, and 1 I'm 
not sure about.


I agree with 2 of your keeps, and remain -0.5 to -1 on the 
others.  See below...



 1. The 'readline()' function of 'wsgi.input' may optionally take 
a size hint.


Already de-facto required. Leaving it out helps no-one. KEEP.


Fair enough, since it's a MAY.  On the other hand, because it's a 
MAY, it actually *helps* no-one, from a spec compatibility 
POV.  (That is, you have to test whether it's available, so it's no 
different than it not being in the spec to begin with.)


So, putting it in doesn't *hurt*, but neither does it *help*...  so I 
lean towards leaving it to 2.x, where it can actually help.



 2. The 'wsgi.input' must provide an empty string as end of input 
stream marker.


I don't think this will be a problem. What would WSGI middleware do 
to break this requirement?


It could be reading the original input stream, and replacing it with 
another one.  Not very common I would guess, but it's still possible 
for a piece of perfectly valid 1.0 middleware to fail this 
requirement for 1.1, leading to the condition where you really can't 
tell if you're running valid 1.1 or not.



It was only put in in the first place so that CGI adapters could 
pass through their input stream (which may not ever provide an EOF) 
without having to wrap it. I agree that was a mistake, and should be 
corrected.


I agree...  but only in 2.x.


 3. The size argument to 'read()' function of 'wsgi.input' would 
be optional and if not supplied the function would return all 
available request content. This would make 'wsgi.input' more
file-like, as the WSGI specification suggests it is, but isn't really
per the original definition.


This one could be a problem with middleware, and that feature 
shouldn't ever be used, in any case: reading into memory an 
arbitrary amount of data from a client is not a good thing to encourage. OMIT.


Agreed -- even in 2.x it's questionable if not harmful.


 4. The 'wsgi.file_wrapper' supplied by the WSGI adapter must 
honour the Content-Length response header and must only return from 
the file that amount of content. This would guarantee that using 
wsgi.file_wrapper to return part of a file for byte range requests would work.


Given item #6, I suppose this is actually just a matter of 
efficiency, in case the file wrapper is sent to a middleware rather 
than directly to the wsgi gateway? If it goes directly to the 
gateway, that can of course stop reading by itself. ?undecided?


I don't really see how this one helps anything in 1.x, and so lean 
towards leaving it out.



 5. Any WSGI application or middleware should not return more data 
than specified by the Content-Length response header if defined.


As long as this is meant as SHOULD, that's fine. It's not actually 
a requirement, but rather a suggestion of best practices. KEEP.


 6. The WSGI adapter must not pass on to the server any data above 
what the Content-Length response header defines if supplied.


This is already required by HTTP. If the WSGI gateway doesn't make 
this happen somehow, it's generating invalid HTTP and that's a bug. 
Okay to clarify in the spec to ensure people don't miss the 
requirement when implementing. KEEP.


Good points - I agree with these two, and they can be considered 1.0 
clarifications as well.  After the first four (which I see no reason 
to include) I was probably a little over-inclined to throw these two 
out (especially since I was reading the "should" above as a "must",
per your proposal).
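
Item 6 can be sketched as a small generator that a gateway might wrap around the application's response iterable. The function name and signature are hypothetical; this is one way to enforce the limit, not a description of how any existing gateway does it.

```python
def limit_to_content_length(chunks, content_length):
    """Yield response chunks, but never more than content_length bytes in total.

    Hypothetical gateway-side helper: `chunks` is the iterable returned by
    the application, `content_length` the integer value of the
    Content-Length response header.
    """
    remaining = content_length
    for chunk in chunks:
        if remaining <= 0:
            break
        if len(chunk) > remaining:
            chunk = chunk[:remaining]  # drop the excess bytes
        remaining -= len(chunk)
        yield chunk
```

A real gateway would still have to call close() on the application's iterable if it has one; the sketch leaves that out.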




Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-11-27 Thread Ian Bicking
On Fri, Nov 27, 2009 at 12:20 PM, P.J. Eby p...@telecommunity.com wrote:


  1. The 'readline()' function of 'wsgi.input' may optionally take a size
 hint.

 Already de-facto required. Leaving it out helps no-one. KEEP.


 Fair enough, since it's a MAY.  On the other hand, because it's a MAY, it
 actually *helps* no-one, from a spec compatibility POV.  (That is, you have
 to test whether it's available, so it's no different than it not being in
 the spec to begin with.)

 So, putting it in doesn't *hurt*, but neither does it *help*...  so I lean
 towards leaving it to 2.x, where it can actually help.


I think it was meant to be a must.  The *caller* MAY pass in a size hint,
the implementor MUST implement this optional argument.  This is the de-facto
requirement.
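
As a minimal sketch of that reading (the implementor accepts the optional argument, the caller may or may not pass it), assuming a hypothetical InputWrapper class around any file-like byte stream:

```python
import io


class InputWrapper:
    """Hypothetical wsgi.input wrapper whose readline() accepts a size hint."""

    def __init__(self, stream):
        self._stream = stream

    def readline(self, size=-1):
        # The caller MAY pass a size; the implementation MUST accept it.
        return self._stream.readline(size)


stream = InputWrapper(io.BytesIO(b"first line\nsecond\n"))
whole = stream.readline()      # no hint: reads up to and including the newline
partial = stream.readline(3)   # hint: returns at most 3 bytes of the next line
```

Because the argument is always accepted, the caller does not need to probe for it first.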


   2. The 'wsgi.input' must provide an empty string as end of input stream
 marker.

 I don't think this will be a problem. What would WSGI middleware do to
 break this requirement?


 It could be reading the original input stream, and replacing it with
 another one.  Not very common I would guess, but it's still possible for a
 piece of perfectly valid 1.0 middleware to fail this requirement for 1.1,
 leading to the condition where you really can't tell if you're running valid
 1.1 or not.


Middleware sometimes does this, but any time it does this it always replaces
the input stream with something truly file-like, e.g., StringIO or a temp
file.  Nothing but servers really hands sockets around, and sockets are the
only objects I'm aware of that don't act quite like a file.
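
A sketch of the kind of middleware meant here, with hypothetical names, written for byte streams (io.BytesIO standing in for StringIO). The replacement stream is a real file-like object, so it keeps returning an empty string once the body is exhausted.

```python
import io


def buffering_middleware(app):
    """Hypothetical middleware that reads the body and re-exposes it as BytesIO."""
    def wrapper(environ, start_response):
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length)
        environ["wsgi.input"] = io.BytesIO(body)  # truly file-like replacement
        return app(environ, start_response)
    return wrapper


def chunk_reading_app(environ, start_response):
    """Reads in fixed-size chunks until the empty-string end marker appears."""
    chunks = []
    while True:
        chunk = environ["wsgi.input"].read(4)
        if not chunk:  # b"" signals end of input
            break
        chunks.append(chunk)
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    return chunks
```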


 It was only put in in the first place so that CGI adapters could pass
 through their input stream (which may not ever provide an EOF) without
 having to wrap it. I agree that was a mistake, and should be corrected.


 I agree...  but only in 2.x.



   3. The size argument to 'read()' function of 'wsgi.input' would be
 optional and if not supplied the function would return all available request
 content. This would make 'wsgi.input' more file-like, as the WSGI
 specification suggests it is, but isn't really per the original definition.

 This one could be a problem with middleware, and that feature shouldn't
 ever be used, in any case: reading into memory an arbitrary amount of data
 from a client is not a good thing to encourage. OMIT.


 Agreed -- even in 2.x it's questionable if not harmful.


Well, we need a way to handle content of unknown length, but if the file
terminates with '' then this isn't that important.

  4. The 'wsgi.file_wrapper' supplied by the WSGI adapter must honour the
 Content-Length response header and must only return from the file that
 amount of content. This would guarantee that using wsgi.file_wrapper to
 return part of a file for byte range requests would work.

 Given item #6, I suppose this is actually just a matter of efficiency, in
 case the file wrapper is sent to a middleware rather than directly to the
 wsgi gateway? If it goes directly to the gateway, that can of course stop
 reading by itself. ?undecided?


 I don't really see how this one helps anything in 1.x, and so lean towards
 leaving it out.


I don't really understand this either, unless it was handling range
responses as well.  Content-Length alone isn't very interesting in this
case.

  5. Any WSGI application or middleware should not return more data than
 specified by the Content-Length response header if defined.

 As long as this is meant as SHOULD, that's fine. It's not actually a
 requirement, but rather a suggestion of best practices. KEEP.

  6. The WSGI adapter must not pass on to the server any data above what
 the Content-Length response header defines if supplied.

 This is already required by HTTP. If the WSGI gateway doesn't make this
 happen somehow, it's generating invalid HTTP and that's a bug. Okay to
 clarify in the spec to ensure people don't miss the requirement when
 implementing. KEEP.


 Good points - I agree with these two, and they can be considered 1.0
 clarifications as well.  After the first four (which I see no reason to
 include) I was probably a little over-inclined to throw these two out
 (especially since I was reading the "should" above as a "must", per your
 proposal).


In this context, maybe 4 is just an extension of these?  Put 4 after 6 and
maybe it'll seem more obvious...?
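
Read that way, item 4 might look like the sketch below: a file wrapper in the style of the PEP 333 FileWrapper example, with a hypothetical extra `length` argument standing in for the Content-Length the adapter would take from the response headers. The `length` parameter is not part of the PEP 333 wsgi.file_wrapper signature; it is an assumption made for illustration.

```python
import io


class LimitedFileWrapper:
    """Hypothetical file wrapper that yields at most `length` bytes."""

    def __init__(self, filelike, length, blksize=8192):
        self.filelike = filelike
        self.remaining = length
        self.blksize = blksize
        if hasattr(filelike, "close"):
            self.close = filelike.close

    def __iter__(self):
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        data = self.filelike.read(min(self.blksize, self.remaining))
        if not data:  # file ended before `length` bytes were available
            raise StopIteration
        self.remaining -= len(data)
        return data


# Serving bytes 2-6 of a ten-byte file, as a byte range response would.
f = io.BytesIO(b"0123456789")
f.seek(2)
blocks = list(LimitedFileWrapper(f, 5, blksize=2))
```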

-- 
Ian Bicking  |  http://blog.ianbicking.org  |
http://topplabs.org/civichacker


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-11-27 Thread Graham Dumpleton
Please ensure you have all also read:

http://blog.dscpl.com.au/2009/10/details-on-wsgi-10-amendmentsclarificat.html

I will post again later in detail, when I have some time, to explain a
few more points not mentioned in that post and where people aren't
quite understanding the reasoning for doing things.

One very quick comment about read().

Allowing read() with no argument is no different to a user calling
read(int(environ['CONTENT_LENGTH'])). Because a WSGI adapter/middleware
has to track bytes read to ensure it can return an empty string as the
end sentinel, it will know the length remaining, and for read() with no
argument would internally do read(remaining_bytes). As such, given the
need to implement the end sentinel, implementing read() makes no real
difference to efficiency as far as memory use goes.
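
A minimal sketch of that bookkeeping, assuming a hypothetical LimitedInput wrapper that is handed the raw stream and the integer CONTENT_LENGTH. It is an illustration of the argument, not mod_wsgi's implementation.

```python
import io


class LimitedInput:
    """Hypothetical wsgi.input wrapper that tracks the bytes remaining."""

    def __init__(self, stream, content_length):
        self._stream = stream
        self._remaining = content_length

    def read(self, size=-1):
        # No argument (or a negative one) means "everything that is left",
        # which is simply read(remaining_bytes).
        if size is None or size < 0 or size > self._remaining:
            size = self._remaining
        if size == 0:
            return b""  # end sentinel, without touching the raw stream
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data


# The raw stream holds more than the declared 11 bytes, as a socket might.
body = LimitedInput(io.BytesIO(b"hello worldEXTRA"), 11)
first = body.read(5)
rest = body.read()
end = body.read()
```

The same counter serves both purposes: it bounds the argument-less read and it decides when to return the empty string.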

Also, you have concerns about read() with no argument, but frankly
readline() with no argument, which is already required, is much worse,
because you can't simply track bytes read and read to the end of input.
The caller only wants to read to the end of a line, so reading all the
remaining input would blow out memory use unreasonably, just as you
speculate for read(). As such, a readline() implementation is likely
to read in blocks and buffer internally, whereas read() doesn't
necessarily have to.

It may also be pertinent to read:

http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html

as, from memory, it also talks about issues with output-filtering
middleware not paying attention to Content-Length.

As I said, I will reply later when I have some time to focus. Right now
I have a 2 year old to keep amused.

Graham

2009/11/27 James Y Knight f...@fuhm.net:
 I move to bless mod_wsgi's definition of WSGI 1.1 [1] as the official
 definition of WSGI 1.1, which describes how to implement WSGI adapters for
 both Python 2.x and 3.x. It may not be perfect, but it's been implemented
 twice, and seems to have no fatal flaws (it doesn't do any lossy transforms,
 so any issues are irritations at worst). The basis for this definition is
 also described in the WSGI 1.0 Amendments [2] page.

 The definitions as they stand are clear enough to understand and implement,
 but not currently in spec-worthy language. (e.g. it says "should" and "may"
 in a colloquial fashion, but actually means MUST in some places and SHOULD in
 others, as defined by RFC 2119.)

 Thus, I'd like to suggest that Graham (if he's willing?) should reformat the
 Definition/Amendments as an actual diff against the current PEP 333.
 Then, I will recommend adopting that document as an actual standard WSGI 1.1,
 to replace PEP 333.

 This discussion has gone on long enough, and it doesn't really matter as much 
 to have the perfect API, as it does to have a standard.

 James

 [1] http://code.google.com/p/modwsgi/wiki/SupportForPython3X
 [2] http://www.wsgi.org/wsgi/Amendments_1.0



[Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-11-26 Thread James Y Knight
I move to bless mod_wsgi's definition of WSGI 1.1 [1] as the official
definition of WSGI 1.1, which describes how to implement WSGI adapters for both
Python 2.x and 3.x. It may not be perfect, but it's been implemented twice,
and seems to have no fatal flaws (it doesn't do any lossy transforms, so any
issues are irritations at worst). The basis for this definition is also
described in the WSGI 1.0 Amendments [2] page.

The definitions as they stand are clear enough to understand and implement, but
not currently in spec-worthy language. (e.g. it says "should" and "may" in a
colloquial fashion, but actually means MUST in some places and SHOULD in
others, as defined by RFC 2119.)

Thus, I'd like to suggest that Graham (if he's willing?) should reformat the
Definition/Amendments as an actual diff against the current PEP 333. Then,
I will recommend adopting that document as an actual standard WSGI 1.1, to
replace PEP 333.

This discussion has gone on long enough, and it doesn't really matter as much 
to have the perfect API, as it does to have a standard.

James

[1] http://code.google.com/p/modwsgi/wiki/SupportForPython3X
[2] http://www.wsgi.org/wsgi/Amendments_1.0
