Re: [Web-SIG] WSGI for Python 3

2010-07-17 Thread Bill Janssen
Chris McDonough  wrote:

> On Sat, 2010-07-17 at 01:33 +0200, Armin Ronacher wrote:
> > Hi,
> > 
> > On 7/17/10 1:20 AM, Chris McDonough wrote:
> >  > Let me know if I'm missing something.
> > The only thing you miss is that the bytes type of Python 3 is badly 
> > supported in the stdlib (not an issue if we reimplement everything in 
> > our libraries, not an issue for me) and that the bytes type has no 
> > string formattings which makes us do the encode/decode dance in our own 
> > implementation of some of the missing stdlib functions.
> 
> This is why the docs mention "bytes with benefits" instead (like the
> Python 2 "str" type). The existence of such a type would be the result
> of us lobbying for its inclusion into some future Python 3, or at least
> the result of lobbying for a String ABC that would allow us to define
> our own.

I think the most effective way to lobby here would be to provide the
String ABC and an implementation of "encoded strings", i.e. strings with
an internal representation that's a byte sequence in a particular
encoding.
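
A minimal sketch of what such an "encoded string" might look like, assuming a simple wrapper class. The name EncodedString and its methods are hypothetical; this is neither a proposed String ABC nor an existing stdlib type.

```python
class EncodedString:
    """Hypothetical 'encoded string': bytes plus the encoding that gives them meaning."""

    def __init__(self, data: bytes, encoding: str = "utf-8"):
        data.decode(encoding)  # fail early if the bytes are invalid in this encoding
        self.raw = data
        self.encoding = encoding

    def __str__(self) -> str:
        return self.raw.decode(self.encoding)

    def __bytes__(self) -> bytes:
        return self.raw

    def __len__(self) -> int:
        # Length in characters, not in bytes.
        return len(str(self))

    def __eq__(self, other):
        # Compare by text, so the same characters in different encodings are equal.
        if isinstance(other, (EncodedString, str)):
            return str(self) == str(other)
        return NotImplemented

    def __hash__(self) -> int:
        return hash(str(self))

    def recode(self, encoding: str) -> "EncodedString":
        """Return the same text stored in a different encoding."""
        return EncodedString(str(self).encode(encoding), encoding)


name = EncodedString(b"caf\xe9", "latin-1")
print(str(name), len(name), bytes(name.recode("utf-8")))
```

The wrapper keeps the original bytes untouched, so nothing is lost if the application later needs the exact octets that arrived on the wire.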

Bill
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] WSGI 2

2009-08-04 Thread Bill Janssen
P.J. Eby  wrote:

> At 02:28 PM 8/4/2009 +1000, Graham Dumpleton wrote:
> >2009/8/4 P.J. Eby :
> > > I'm not clear on your logic here.  If I request foo/bar/baz (where baz
> > > actually has an accent over the 'a') in latin-1 encoding, and 
> > foo/bar is the
> > > script, then the (accented) baz is legitimate for pass-through to the
> > > application, no?
> >
> >Technically, but what I am pointing out is that Apache pretty well
> >says that foo/bar needs to be UTF-8.
> 
> Which doesn't change the fact that you haven't yet proposed what a 
> WSGI server should *do* with such non-UTF8 bytes in PATH_INFO and 
> QUERY_STRING.  Apache can and does pass through such bytes, so the 
> spec needs to say what we do with them.

Particularly QUERY_STRING.  The original thinking around urlencoded was
that it was always Latin-1.  You were supposed to use
"multipart/form-data" for non-Latin-1 encodings.  Long thread on
www-talk circa 1994 about this.

I think bytes are the safest way to go here.  It would be nice if we
could automagically detect the correct encoding, but there's no
foolproof way of doing that.
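
A small illustration of why detection cannot be foolproof, assuming only two candidate encodings. The helper name candidate_decodings is made up for this sketch.

```python
def candidate_decodings(raw: bytes, encodings=("utf-8", "latin-1")):
    """Return {encoding: text} for every candidate encoding that accepts the bytes."""
    results = {}
    for encoding in encodings:
        try:
            results[encoding] = raw.decode(encoding)
        except UnicodeDecodeError:
            pass  # this encoding cannot have produced these bytes
    return results


latin1_path = "b\u00e1z".encode("latin-1")  # "baz" with an accented a, as Latin-1
utf8_path = "b\u00e1z".encode("utf-8")      # the same text, as UTF-8

print(candidate_decodings(latin1_path))
print(candidate_decodings(utf8_path))
```

The Latin-1 bytes are rejected by UTF-8, which is a useful hint, but the UTF-8 bytes decode without error under both encodings, so a successful decode proves nothing about which one the sender meant.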

Bill


Re: [Web-SIG] Announcing bobo

2009-06-17 Thread Bill Janssen
Jim Fulton  wrote:

> I'm working on another project, bozo, to facilitate using bobo
> resources in Zope and use Zope components with bobo applications.

Good names, Jim.

Bill


Re: [Web-SIG] Python 3.0 and WSGI 1.0.

2009-04-01 Thread Bill Janssen
Alan Kennedy  wrote:

> Hi Bill,
> 
> [Bill]
> > I think the controlling reference here is RFC 3875.
> 
> I think the controlling references are RFC 2616, RFC 2396 and RFC 3987.

I see what you're saying, but it's darn near impossible, as a practical
matter, to get any guidance on encoding matters from those.

The question is where those names come from, and they come from CGI, and
that is (practically speaking) defined these days by RFC 3875, as much as
anything.

> I think the question is "are people using IRIs in the wild"? If so,
> then we must decide how do we best deal with the problems of
> recognising iso-8859-1+rfc2047 versus utf-8, or whatever
> server-configured encoding the user has chosen.

See http://bugs.python.org/issue3300, where we went around and around
that question.  The answer seems to be, yes.

There are lots of useful fragments in that discussion, for instance:

``For the authority (server name) portion of a URI, RFC 3986 is
pretty clear that UTF-8 must be used for non-ASCII values (assuming, for
a moment, that IDNA addresses are not Punycode encoded already). For
the path portion of URIs, a large-ish proportion of them are, indeed,
UTF-8 encoded because that has been the de facto standard in Web browsers
for a number of years now. For the query and fragment parts, however,
the encoding is determined by context and often depends on the encoding
of some page that contains the form from which the data is taken. Thus,
a large number of URIs contain non-UTF-8 percent-encoded octets.''
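
To make the last point concrete, here is a short sketch using the Python 3 urllib.parse functions. The example paths are invented; the function signatures are those of the standard library.

```python
from urllib.parse import unquote, unquote_to_bytes

latin1_path = "/caf%E9"     # percent-encoded Latin-1 octet
utf8_path = "/caf%C3%A9"    # percent-encoded UTF-8 octets

# The octets themselves are always recoverable.
raw = unquote_to_bytes(latin1_path)

# Decoding to text requires knowing (or guessing) the encoding.
as_latin1 = unquote(latin1_path, encoding="latin-1")
as_default = unquote(latin1_path)   # default is UTF-8 with errors="replace"
as_utf8 = unquote(utf8_path)

print(raw, as_latin1, as_default, as_utf8)
```

With the default settings the Latin-1 octet is silently turned into U+FFFD, which is exactly the kind of information loss the quoted passage warns about.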

Bill


Re: [Web-SIG] Python 3.0 and WSGI 1.0.

2009-04-01 Thread Bill Janssen
Guido van Rossum  wrote:

> On Wed, Apr 1, 2009 at 5:18 AM, Robert Brewer  wrote:
> > Good timing. We had been thinking to make everything strings except for
> > SCRIPT_NAME, PATH_INFO, and QUERY_STRING, since these few are pulled
> > from the Request-URI, which may be in any encoding. It was thought that
> > the app would be best-qualified to decode those three.
> 
> Argh. The *meaning* of these fields is clearly text.

I wouldn't read too much into those names -- they were chosen when the
CGI spec was just gestating, long before the usage patterns solidified,
and don't necessarily reflect the usage of the data bound to them.  I
believe this work was done before the formal IETF definition of a URL,
for instance.

I think the controlling reference here is RFC 3875.

It's not at all clear to me what the SCRIPT_NAME is.  Is it a pathname,
involving the local file system's filenames, which recent discussions
seem to indicate may or may not correspond to human-notional strings, or
a URI path?  I'm OK with calling it text, with a proviso that there may
be cases where it's not.

I've never actually seen a CGI call with PATH_INFO set; I think it's
obsolete usage (but pretty clearly a string).  RFC 3875 says, "Similarly,
treatment of non US-ASCII characters in the path is system-defined."

QUERY_STRING -- should always be an ASCII string.  May indeed encode
non-Unicode strings or purely binary data, but when passed to the CGI
script, it's still encoded as it was in the URI.
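
A brief sketch of that property, using Python 3's urllib.parse. The field name "blob" and the payload are made up for illustration.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

payload = bytes([0x00, 0xFF, 0x10, 0x41])          # arbitrary binary data
query_string = "blob=" + quote_from_bytes(payload)  # what the script would receive

# The query string itself is pure ASCII ...
print(query_string, query_string.isascii())

# ... and the original octets come back unchanged.
recovered = unquote_to_bytes(query_string.split("=", 1)[1])
print(recovered == payload)
```

The string handed to the script is text in the narrow sense (ASCII), while what it encodes need not be text at all.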

Bill


[Web-SIG] invalid http.cookiejar test?

2008-08-08 Thread Bill Janssen
I'm looking at the following test case in test_http_cookiejar, in the
Python 3000 test suite.  It seems to assume that URLs can have
non-ASCII characters in them, and in fact random octets.  My understanding
of RFC 3986, the URI RFC, is that the URI must be all ASCII characters,
though segments of the path, and parameters, and query elements, can all
contain percent-encoded octets for various purposes.

So I'd expect this test, which to my eyes contains one valid URL (the
first one), and two invalid ones, to fail when it encounters the
invalid URLs, but clearly the intent of whoever wrote this was that it
would succeed.  Which is the correct expectation?

Bill

    def test_url_encoding(self):
        # Try some URL encodings of the PATHs.
        # (the behaviour here has changed from libwww-perl)
        c = CookieJar(DefaultCookiePolicy(rfc2965=True))
        interact_2965(c, "http://www.acme.com/foo%2f%25/%3c%3c%0Anew%E5/%E5",
                      "foo  =   bar; version=   1")

[no problem so far...]

        cookie = interact_2965(
            c, "http://www.acme.com/foo%2f%25/<<%0anew\345/\346\370\345",
            'bar=baz; path="/foo/"; version=1');

[..."cookie" should be None here, because URL was invalid...]

        version_re = re.compile(r'^\$version=\"?1\"?', re.I)

[...so the following assertion should fail...]

        self.assert_("foo=bar" in cookie and version_re.search(cookie))

        cookie = interact_2965(
            c, "http://www.acme.com/foo/%25/<<%0anew\345/\346\370\345")

[...and this assertion should succeed, because of the invalid URL...]

        self.assert_(not cookie)

        # unicode URL doesn't raise exception

[Um, why shouldn't this next bit raise an exception?]

        cookie = interact_2965(c, "http://www.acme.com/\xfc")
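
For reference, a rough character-level check against the RFC 3986 repertoire, applied to the three URLs from the test. This is only a sketch: the helper name is hypothetical, it checks which characters appear rather than parsing the full URI grammar, and it is not part of http.cookiejar.

```python
import re

# Characters RFC 3986 allows in a URI, plus percent-encoded octets.
_URI_CHARS = re.compile(
    r"(?:[A-Za-z0-9\-._~"    # unreserved
    r":/?#\[\]@"             # gen-delims
    r"!$&'()*+,;="           # sub-delims
    r"]|%[0-9A-Fa-f]{2})*"   # or a percent-encoded octet
)


def uses_only_uri_characters(url: str) -> bool:
    """True if every character is in the RFC 3986 repertoire or a valid %XX escape."""
    return _URI_CHARS.fullmatch(url) is not None


urls = [
    "http://www.acme.com/foo%2f%25/%3c%3c%0Anew%E5/%E5",
    "http://www.acme.com/foo%2f%25/<<%0anew\345/\346\370\345",
    "http://www.acme.com/\xfc",
]
for url in urls:
    print(uses_only_uri_characters(url), ascii(url))
```

Under this check only the first URL passes; the other two contain raw "<" characters or non-ASCII characters that would have to be percent-encoded.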



Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-29 Thread Bill Janssen
> Common practice is by now long established, and cannot simply be  
> changed 10 years after the fact to conform to what the standard says  
> it should've been. Therefore, it *is* now a problem with the standard:  
> the standard is wrong. If you follow it, you're going to create  
> totally broken software.
> 
> For instance, treating form posts as being 7bit unless they have a  
> Content-Transfer-Encoding. The RFC says you should do that. But it's  
> an absolutely nonsensical thing to do. Your code would not work with  
> any existing web browser if you did. Or, if you're writing a web  
> browser: don't even think of using Content-Transfer-Encoding to encode  
> your response. Few servers/frameworks would understand your submission  
> if you tried.

I had lots of various charset errors with UpLib, as people tried
various broken browsers, because I was trying to guess "common
practice" and follow it.  Until I actually read the RFCs and made the
server follow them.  Now that it does, almost all of those errors have
gone away.  So, my experience seems to differ from yours.

Bill


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-29 Thread Bill Janssen
> Also I'd say that if you're dealing with text (text/*) and no
> charset is provided (or the caller hasn't given an override
> default charset); then you must assume US-ASCII.  And
> you should allow any UnicodeDecodeErrors to bubble
> up to the caller.  In other words if a user agent sent text
> in ISO-8859-x and didn't say it was doing so, then an
> error should be raised when non-ASCII data is seen.

Yep.

Bill


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-29 Thread Bill Janssen
> I first try the content-type header,

Right.

> then the special _charset_ field, 

I don't know what that is.  Can you explain a bit more?

> and finally utf-8.

That's wrong.  Should be ASCII.  You could add an "encoding" field to
let the application override this, though.  But the default is ASCII.

> If there is a problem in the decoding, the client is broken (or there is 
> a bug in the application).
> So the correct response is Bad Request, IMHO.

Yes, I think that's right.
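
A sketch of that policy as a tiny WSGI application, assuming an ASCII default when no charset is declared. The names declared_charset and app are hypothetical, and the charset is extracted with the stdlib email package purely for convenience.

```python
from email.message import Message
from io import BytesIO


def declared_charset(content_type, default="us-ascii"):
    """Return the charset parameter of a Content-Type value, or the default."""
    msg = Message()
    msg["Content-Type"] = content_type or "text/plain"
    return msg.get_content_charset(default)


def app(environ, start_response):
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length)
    charset = declared_charset(environ.get("CONTENT_TYPE"))
    try:
        text = body.decode(charset)
    except (UnicodeDecodeError, LookupError):
        # The bytes do not match the declared (or default) charset.
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"body does not match its declared charset\n"]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [text.encode("utf-8")]


def call(body, content_type):
    """Invoke the app in-process and return (status, response body)."""
    seen = []
    environ = {
        "CONTENT_LENGTH": str(len(body)),
        "CONTENT_TYPE": content_type,
        "wsgi.input": BytesIO(body),
    }
    chunks = app(environ, lambda status, headers: seen.append(status))
    return seen[0], b"".join(chunks)


print(call(b"caf\xe9", "text/plain")[0])
print(call(b"caf\xe9", "text/plain; charset=iso-8859-1")[0])
```

Undeclared non-ASCII input is rejected instead of being guessed at, while the same bytes are accepted once the client says what they are.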

Bill


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-29 Thread Bill Janssen
> I would think it most useful if the decoding framework would strictly
> follow the RFC and assume "text/plain; charset=US-ASCII"; but
> also allow the caller some means of indicating a different default.
> Obviously, if a user agent does provide a complete Content-Type,
> it should be used.

Yes, I agree.
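
One way such a decoding helper could look, as a sketch only. The function name decode_text and its parameters are assumptions; the charset lookup relies on the stdlib email package.

```python
from email.message import Message


def decode_text(body: bytes, content_type=None, default_charset="us-ascii") -> str:
    """Decode body using the charset from content_type, else default_charset."""
    msg = Message()
    if content_type:
        msg["Content-Type"] = content_type
    charset = msg.get_content_charset(default_charset)
    # UnicodeDecodeError is deliberately not caught: the caller decides what to do.
    return body.decode(charset)


print(decode_text(b"plain ascii"))
print(decode_text(b"caf\xe9", "text/plain; charset=ISO-8859-1"))
print(decode_text(b"caf\xe9", default_charset="latin-1"))
```

A complete Content-Type wins, the caller's default comes next, and strict US-ASCII is the fallback, so non-ASCII bytes with no declared charset raise an error instead of being silently misread.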

Bill


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-29 Thread Bill Janssen
> > Ok with theory.
> > But in practice:
> 
> Seems like you're looking at a broken browser there.

Ah, I see that the Firefox people, at least, are aware that this is a
bug in Firefox:

https://bugzilla.mozilla.org/show_bug.cgi?id=116346

But they haven't found a fix for it yet, because of the large number
of badly implemented server frameworks that are out there.

Bill


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-29 Thread Bill Janssen
> Ok with theory.
> But in practice:

Seems like you're looking at a broken browser there.

Can anyone point to where a W3C standard or IETF RFC describes this
behavior?

> I think that it is safe to decode data from the QUERY_STRING and POST
> data to Unicode, and to return Bad Request in case of errors.

It's clearly not safe to do so generally.  If you do decide to do
this, please tell me what framework you're building so that I can
avoid it :-).

Bill


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-29 Thread Bill Janssen
> > That's probably wrong.  We went through this recently on the
> > python-dev list.  While it's possible to tell the encoding of
> > multipart/form-data, 
> 
> With multipart/form-data the problem should be the same.
> The content type is defined only for file fields.

Actually, it's defined for all fields, isn't it?  From RFC 2388:

``As with all multipart MIME types, each part has an optional
"Content-Type", which defaults to text/plain.''

So the type is "text/plain" unless it says something else.  And,
according to RFC 2046, the default charset for "text/plain" is
"US-ASCII".

> Can you point me to the discussion on python-dev list?

See http://mail.python.org/pipermail/python-dev/2008-July/081013.html
and the subsequent conversation.

And http://mail.python.org/pipermail/python-dev/2008-July/081066.html
and the reply to that.
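
The per-part defaults from RFC 2388 and RFC 2046 quoted above can be observed with the stdlib email parser, which treats a form body as an ordinary MIME multipart. This is a sketch: the boundary "XYZ", the field names, and the helper describe_parts are invented, and the email package is a general MIME parser, not a dedicated form decoder.

```python
import email

RAW = (
    b"Content-Type: multipart/form-data; boundary=XYZ\r\n"
    b"\r\n"
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="title"\r\n'
    b"\r\n"
    b"hello\r\n"
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="note"\r\n'
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"caf\xc3\xa9\r\n"
    b"--XYZ--\r\n"
)


def describe_parts(raw: bytes):
    """Return {field name: (content type, charset, decoded text)} for each part."""
    fields = {}
    for part in email.message_from_bytes(raw).get_payload():
        name = part.get_param("name", header="content-disposition")
        charset = part.get_content_charset("us-ascii")  # RFC 2046 default
        text = part.get_payload(decode=True).decode(charset)
        fields[name] = (part.get_content_type(), charset, text)
    return fields


fields = describe_parts(RAW)
for name, info in fields.items():
    print(name, info)
```

The "title" part carries no Content-Type header and is reported as text/plain with the US-ASCII default, while the "note" part keeps the charset it declared.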

Bill


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-28 Thread Bill Janssen
> In wsgix I use utf-8 for decoding the QUERY_STRING, and the charset 
> specified in the POST'ed data (utf-8 or the charset found in the special 
> _charset_ field).

That's probably wrong.  We went through this recently on the
python-dev list.  While it's possible to tell the encoding of
multipart/form-data, the query_string and x-www-form-urlencoded data
may be in arbitrary character set encodings (see RFC 3986).  It's
probably best to not try to map them to strings; instead, return byte
arrays for the value, and only return strings for data that can be
correctly decoded.  Otherwise, you lose information that the app
cannot recover.
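
A sketch of that approach: every value starts as bytes and is promoted to a string only when it decodes cleanly. The function names and the choice of ASCII as the "safe" encoding are assumptions for illustration.

```python
from urllib.parse import unquote_to_bytes


def _maybe_text(raw: bytes, encoding: str):
    """Return str if raw decodes cleanly in the given encoding, else the bytes."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return raw


def parse_query(query_string: str, encoding: str = "ascii"):
    """Parse a query string into (name, value) pairs without losing any octets."""
    pairs = []
    for field in query_string.split("&"):
        if not field:
            continue
        name, _, value = field.partition("=")
        raw_name = unquote_to_bytes(name.replace("+", " "))
        raw_value = unquote_to_bytes(value.replace("+", " "))
        pairs.append((_maybe_text(raw_name, encoding), _maybe_text(raw_value, encoding)))
    return pairs


print(parse_query("q=hello+world&bin=%FF%00&name=caf%E9"))
```

Values that cannot be decoded stay as bytes, so the application still has the original octets and can apply whatever encoding knowledge it has.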

Bill


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-28 Thread Bill Janssen
> The first parses the query string and returns a dictionary of strings; the
> latter parses the application/x-www-form-urlencoded client body and
> returns a dictionary of strings and the charset used by the client for
> the unicode encoding.

> Now, I'm wondering whether these two functions should return Unicode
> strings instead of plain strings.

I'd say, yes.  I do this in my framework, which also decodes query
strings and post bodies (and handles multipart/form-data as well as
x-www-form-urlencoded).  Note that while x-www-form-urlencoded is
generally restricted to ASCII values by the HTML 4.01 spec,
multipart/form-data can contain arbitrary Unicode strings.

In Python 3.x, strings are all Unicode.

Bill


[Web-SIG] some much-deferred admin of web-sig list...

2008-07-28 Thread Bill Janssen
I've just cleared the queue of admin tasks for the Web-SIG list, so
don't be surprised to see some old messages appear...

Bill


Re: [Web-SIG] Time for a JSON parser in the standard library?

2008-03-10 Thread Bill Janssen
> Is it time there was a JSON codec included in the python standard library?

Great idea.  In fact, I'd support including a whole ECMAscript
interpreter module, much as we have XML parsers.

Bill


Re: [Web-SIG] Dealing with urllib, urllib2, and urlparse

2008-02-20 Thread Bill Janssen
> It has been suggested by Fred on the stdlib-sig that urllib should
> just be tossed in favor of urllib2 since most people probably just use
> urlopen() and that is mostly compatible between the two. What do
> people think of that idea?

A quick grep shows that I use "quote", "quote_plus", "unquote", and
"unquote_plus" from urllib.  Not sure how representative that is, but
they should at least be preserved in urllib2.  By the way, shouldn't
the name "urllib" be used for "urllib2", if "urllib" is tossed?
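
For readers on Python 3, where these four helpers ended up in urllib.parse, a quick sketch of what each one does. The sample strings are made up.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# quote() keeps "/" by default and encodes a space as %20.
print(quote("a b/c"))

# quote_plus() is meant for form values: space becomes "+", "/" is escaped.
print(quote_plus("a b/c"))

# The two unquote functions differ only in how they treat "+".
print(unquote("a+b%2Fc"))
print(unquote_plus("a+b%2Fc"))
```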

Bill


Re: [Web-SIG] Removal of Cookie in Python 3.0 OK?

2008-02-04 Thread Bill Janssen
> I think most web frameworks use setuptools at this point.  I'd rather  
> get this as a distribution, rather than from the standard library.  In  
> fact, I'd prefer to see all web-development libraries distributed  
> separate from the language in Python 3.

Jim, you want to have most things separate, if I've read your recent
posts to the dev and 3k lists correctly.  Lean and mean Python
distribution.  I'd agree with you if we had the infrastructure for it,
something which would function at least as well as apt-get does,
pulling dependencies and doing platform and version checks
automatically.  But I don't think Python is anywhere near that level
of infrastructure, and it's a bit of a stretch just to maintain the
infrastructure that currently exists.  Given that, I think that moving
functionality out of the standard library would damage Python, not
improve it.  Given that, I'd rather see what's in the stdlib be
updated to best-of-breed, instead of
the-first-thing-we-thought-of-in-1995, as all too much of it is.

Bill



Re: [Web-SIG] reorg of web-related modules for Python 3K

2008-02-04 Thread Bill Janssen
> I think WSGI is a better interface than any of these.  BaseHTTPServer is 
> a reasonable basis for building a server (wsgiref.simple_server and 
> others use it), but the subclasses are a little funky IMHO.  Giving 
> them the name http.server makes them seem like the Right Solution, and I 
> don't think they are.  They're more like server-building tools.

Yes, these classes are quite old, and have been updated only patchily
over the years.  I don't use them, either.  But I guess the question
is whether wsgiref.* is a better _implementation_ than any of these.
We don't really have interfaces in Python.
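
For context, a minimal sketch of how wsgiref's server is used from the application side. The app, the helper names, and port 8000 are assumptions; serve() is defined but not called here, since it would block.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A trivial WSGI application."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"hello from WSGI\n"]


def call_once(app):
    """Invoke a WSGI app in-process with a synthetic environ; return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


def serve(app, host="localhost", port=8000):
    """Serve the app with wsgiref's simple server until interrupted."""
    with make_server(host, port, app) as httpd:
        httpd.serve_forever()


print(call_once(hello_app))
```

The application only sees the environ dictionary and start_response callable, so the same hello_app would run unchanged under any other WSGI server.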

Bill





Re: [Web-SIG] more work on httplib?

2008-02-03 Thread Bill Janssen
>   utility routines for client-side form manipulation:
> encode_multipart_formdata, http_post_multipart, https_post_multipart

I should point out that these are elaborations of Wade Leftwich's
Python Cookbook recipes.
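
For readers unfamiliar with the technique, a rough sketch of what a multipart form encoder does. This is not the UpLib code or the Cookbook recipe; the signature, the optional boundary argument, and the UTF-8 choice are assumptions, and field names are not escaped.

```python
import uuid


def encode_multipart_formdata(fields, boundary=None):
    """Encode a {name: str} mapping as multipart/form-data.

    Returns (content_type, body). Hypothetical sketch: text fields only.
    """
    boundary = boundary or uuid.uuid4().hex
    marker = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines.append(marker)
        lines.append(('Content-Disposition: form-data; name="%s"' % name).encode("ascii"))
        lines.append(b"Content-Type: text/plain; charset=utf-8")
        lines.append(b"")                    # blank line ends the part headers
        lines.append(value.encode("utf-8"))
    lines.append(marker + b"--")             # closing boundary
    lines.append(b"")
    return "multipart/form-data; boundary=" + boundary, b"\r\n".join(lines)


content_type, body = encode_multipart_formdata({"title": "caf\u00e9"}, boundary="XYZ")
print(content_type)
print(body)
```

Declaring a charset on every part avoids relying on the US-ASCII default for text/plain.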

>   cookie readers for Firefox and Safari cookie file formats

I'm still restricting myself to Python 2.3, so I haven't really looked
at cookielib.

Bill


[Web-SIG] more work on httplib?

2008-02-03 Thread Bill Janssen
I've been working on a personal digital library server, written in
Python, built on top of Medusa, now in beta test at
http://uplib.parc.com/.  We're releasing it under the GPLv2 (actually,
have already released it to our beta testers -- if you'd like to join
the fun, just create an account on the blog).

As part of the system, I had to write a number of extensions to the
core library's HTTP and HTML support, including

  versions of httplib.HTTP and HTTPSConnection that verify the server's
certificates
  htmlescape(), a version of cgi.escape() that quotes HTML correctly
  utility routines for client-side form manipulation:
encode_multipart_formdata, http_post_multipart, https_post_multipart
  a list of defined HTTP status codes, by name
  a version of urllib.urlretrieve() that handles cookies, proxies,
and redirects (I think this could be written as a urllib2 Opener)
  cookie readers for Firefox and Safari cookie file formats
  a web site caching function that fetches all ancillary material (CSS,
ECMAscript, images, etc) and links it in properly, essentially
creating what Mozilla calls a "Web Page Complete" version
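
For readers on a current Python 3, a few of the items in this list have standard-library counterparts. The sketch below touches three of them (certificate verification defaults, HTML escaping, named status codes); it makes no claim about how the UpLib versions were written.

```python
import html
import ssl
from http import HTTPStatus

# HTML escaping that also quotes attribute delimiters.
escaped = html.escape('<a href="x">Tom & Jerry</a>')
print(escaped)

# HTTP status codes by name.
print(HTTPStatus.NOT_FOUND.value, HTTPStatus.NOT_FOUND.phrase)

# A client-side TLS context that verifies certificates and host names by default.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```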

Not to mention the new SSL module.  I found it irritating that I had
to write all of this myself, instead of just pulling it from the
standard library.  Now that it's released, what's already in the
standard library (that I just didn't know about :-)?  And which items
should I file bug reports on?

Bill


[Web-SIG] reorg of web-related modules for Python 3K

2008-02-03 Thread Bill Janssen
Over on the stdlib-sig, Brett's proposing that we move some of the
HTTP-related classes:

> OK, to keep this ball rolling, here is my suggestion for reorganizing
> HTTP modules:
>
>   httplib -> http.tools
>   BaseHTTPServer -> http.server
>   SimpleHTTPServer -> http.server
>   CGIHTTPServer -> http.server
>   cookielib -> http.cookies
>
> Since the various HTTP server modules have no name clashes we
> can consolidate them into a single module.

Seems reasonable to me, but I thought it should be looked at in this
forum.  All this is going into PEP 3108, so either join the stdlib-sig,
or read the PEP, if you care about all this.

Alexandre Vassalotti further proposes the following:

> xmlrpclib -> xmlrpc.tools
> SimpleXMLRPCServer -> xmlrpc.server
> DocXMLRPCServer -> xmlrpc.server

Personally, I'd put those under "http.", or maybe "http.xmlrpc.".
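
For comparison, the names that Python 3 eventually shipped differ somewhat from both proposals. The mapping below reflects the released Python 3 standard library; the PY2_TO_PY3 dictionary itself is just an illustration.

```python
import importlib

# Python 2 module name -> Python 3 module that replaced it.
PY2_TO_PY3 = {
    "httplib": "http.client",
    "BaseHTTPServer": "http.server",
    "SimpleHTTPServer": "http.server",
    "cookielib": "http.cookiejar",
    "Cookie": "http.cookies",
    "xmlrpclib": "xmlrpc.client",
    "SimpleXMLRPCServer": "xmlrpc.server",
    "DocXMLRPCServer": "xmlrpc.server",
}

# Import each target to confirm it exists in the running interpreter.
modules = {old: importlib.import_module(new) for old, new in PY2_TO_PY3.items()}
for old, module in modules.items():
    print(old, "->", module.__name__)
```

The XML-RPC modules ended up in their own top-level xmlrpc package, not under http.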

Bill


Re: [Web-SIG] daemon tools

2007-03-08 Thread Bill Janssen
> For symmetry's sake in Windows a Python service manager could simply
> use SCManager API under the hood (through win32all) to get the job done,
> still keeping a consistent cross-platform modus operandi.

That's what I do in UpLib.  Works pretty well.

Bill


[Web-SIG] cleaning up the standard library's Web support

2007-01-08 Thread Bill Janssen
There's a thread going on on the Python-3000 list about PEP 3108,
which proposes a clean-up/re-org for the standard library.

I've suggested that, analogous to the "email" package, a "web" package
be created, and most (all?) of the web-related modules be moved under
it.  This is also a chance to remove cruft and combine related modules
(urllib.py and urllib2.py, for instance).

To take another example, should BaseHTTPServer and SimpleHTTPServer
both exist?  Shouldn't SimpleHTTPServer.SimpleHTTPRequestHandler just
be another class defined in BaseHTTPServer?  Or should
SimpleHTTPServer just be deleted altogether?

Bill


Re: [Web-SIG] WSGI Components Mailing List

2006-10-18 Thread Bill Janssen
> I am happy to direct these conversations to
> wherever folks want. Is this the place, after all?

You bet!  Let's keep things here, till folks complain.

Bill


Re: [Web-SIG] WSGI -- usable for other protocols?

2006-10-17 Thread Bill Janssen
Well, I'll definitely check out Twisted's IMAP4 code.  Thanks!

> Quotient uses a SQLite database for storage of structured data about
> messages and a filesystem structure (currently not a great structure,
> but it's fixable) for actual message files.

I was sort of planning on keeping all the message metadata in the Lucene DB.

MH uses a filesystem structure too.  Maybe there's hope.

> It supports per-user filtering rules (although not procmail based - and
> the work done in this area so far is extremely minimal, basically it can
> do substring matching on headers - expanding this would be pretty simple
> though, Quotient is designed for this kind of thing).

This doesn't sound too far from what I intended, actually.

I'd like to keep the Lucene index in memory, and don't particularly
want the overhead of process swaps, so I'd like to be able to use them
together in a single address space.  It sounds like you've worked out
most of the issues with IMAP4, so I'll take a closer look.

Bill


Re: [Web-SIG] WSGI -- usable for other protocols?

2006-10-17 Thread Bill Janssen
> might I suggest you contribute
> to a project which sounds roughly equivalent to the one you're describing?
> 
> http://divmod.org/trac/wiki/DivmodQuotient

Just for fun, I grepped the sources for IMAP.  No hits.

Seems like I'd spend more time understanding the framework system
you're using than it would take me to write it from scratch.  An IMAP
server isn't hard.  And I don't think the project is all that equivalent.

Does Twisted support the use of PyLucene?

I basically want an IMAP server that supports the MH mail storage
format, uses Lucene for indexing and search, and has the ability to do
auto-filtering on a per-user basis with either MH procmail scripts or
a Python script that uses a particular API.  I don't need an SMTP
server, I don't need a Web interface to mail.

If DivmodQuotient is anywhere close to that, I'll take a longer look.

Bill


Re: [Web-SIG] WSGI -- usable for other protocols?

2006-10-17 Thread Bill Janssen
> You'd also need a WSGI server that handled IMAP and persistent 
> connections.  So maybe another server is called for, or an adaptation of 
> an existing multi-protocol server.

That's my tentative conclusion.  The WSGI handling doesn't really
match the IMAP connection requests very well.  I figured I'd adapt
Medusa for this, again; set up an HTTP handler and an IMAP handler.

But I thought I'd check the wisdom of the crowd, first.

Bill


[Web-SIG] WSGI -- usable for other protocols?

2006-10-17 Thread Bill Janssen
I've been working on a Python IMAP server that uses PyLucene for
indexing.  It's mainly IMAP, but also speaks a bit of HTTP for an
administrative interface.  Does it make any sense to wrap it with
WSGI?  That is, does WSGI make sense for other protocols than HTTP
(specifically IMAP)?

And what WSGI-supporting environments will also support PyLucene?  (The
limiting factor is that the GCJ runtime has to be linked in, and all
threads must be GCJ threads.)

Bill


Re: [Web-SIG] [Python-Dev] Adding wsgiref to stdlib

2006-04-29 Thread Bill Janssen
> Perhaps this could go in Demo/wsgiref/?

Perhaps both Ian's and Phillip's examples could go into Demo/wsgiref/?

Bill


Re: [Web-SIG] Adding wsgiref to stdlib

2006-04-29 Thread Bill Janssen
> It still looks like an application of WSGI, not part of a reference
> implementation.

It seems to me that canonical exemplars are part of what a "reference"
implementation should include.  Otherwise it would be a "standard"
implementation, which is considerably different.

Bill



Re: [Web-SIG] [Python-Dev] Adding wsgiref to stdlib

2006-04-28 Thread Bill Janssen
> I'm inviting people to discuss the addition of wsgiref to the standard
> library. I'd like the discussion to be finished before a3 goes out;

+1.

I think it's fairly low-risk.  WSGI has been discussed and implemented
for well over a year; there are many working implementations of the
spec.  Adding wsgiref to the stdlib would help other implementors of
the spec.

I think there should be a better server implementation in the stdlib,
but I think that can be added separately.  (Personally, I'd like to
find the time to (a) make Medusa thread-safe, and (b) add WSGI to it.
If anyone would like to help with that, send me mail.)

Bill


Re: [Web-SIG] [Fwd: Summer of Code preparation]

2006-04-19 Thread Bill Janssen
> But the X-windows people weren't designing for Internet scale: how
> many connections should a server be able to handle?

Well, that's where HTTP-ng came in.  In particular, look at the WebMUX
document, http://www.w3.org/Protocols/MUX/WD-mux-980722.html, which we
implemented for ILU.  I often wish I had it around today for AJAX...

Bill


Re: [Web-SIG] [Fwd: Summer of Code preparation]

2006-04-17 Thread Bill Janssen
> > It's getting very hard to do good Web-page-understanding without a
> > javascript interpreter.  Ideally, this would execute in a Python
> > context so that each Javascript call (or statement, or expression
> > evaluation) could invoke Python code to do introspection over the
> > activity.
> 
> Do you mean like implementing the DOM in Python, and providing DOM 
> objects to Javascript?  Or actually watching the Javascript execute at 
> some level?

I meant actually watching the Javascript execute, though of course
providing the DOM as Python objects would be, as you say, a bare
minimum.  One way to do this would be to translate the Javascript into
Python, and use existing introspection hooks to watch the Python
execute.
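
One candidate for the "existing introspection hooks" is sys.settrace.
A minimal sketch, assuming the Javascript has already been translated
into the hypothetical Python function translated_script; the
translation step itself is not shown and all names here are invented.

```python
import sys


def watch(func, *args):
    """Run func under a trace hook and record every Python-level call."""
    calls = []

    def tracer(frame, event, arg):
        if event == "call":
            calls.append(frame.f_code.co_name)
        return None  # no per-line tracing needed for this sketch

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)   # always restore the earlier hook
    return result, calls


# Hypothetical stand-ins for code translated from Javascript.
def set_title(doc, text):
    doc["title"] = text


def translated_script(doc):
    set_title(doc, "hello")
    return doc
```

watch(translated_script, {}) returns the script's result together with
the list of function names that were entered while it ran.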

Bill


Re: [Web-SIG] [Fwd: Summer of Code preparation]

2006-04-17 Thread Bill Janssen
Ian,

It's getting very hard to do good Web-page-understanding without a
javascript interpreter.  Ideally, this would execute in a Python
context so that each Javascript call (or statement, or expression
evaluation) could invoke Python code to do introspection over the
activity.

As you say, it's the flip side of Python in the browser.

Bill


Re: [Web-SIG] WSGI in standard library

2006-02-19 Thread Bill Janssen
Nice list, Alan.  I think you make a lot of sense here.

Bill

> [Alan Kennedy]
>  >>>Maybe we need a PEP
> 
> [Bill Janssen]
>  >>Great idea!  That's exactly what I thought when I organized this SIG a
>  >>couple of years ago.
> 
> [Guido van Rossum]
>  > At first I was going to respond "+1". But the fact that a couple of
>  > years haven't led to much suggests that it's unlikely to be fruitful;
>  > there just are too many diverging ideas on what is right. (Which makes
>  > sense since it's a huge and fast developing field.)
> 
> Having considered the area for a couple of days, I think you're right: 
> the generic concept "web", as in web-sig, covers far too much ground, 
> and there are too many schools of thought.
> 
>  > So unless someone (Alan Kennedy?) actually puts forward a PEP and gets
>  > it through a review of the major players on web-sig, I'm skeptical.
> 
> But there is a subset which I think is achievable, namely http support, 
> which IMO is the subset that most needs a rework. And now that we have a 
> nice web standard, WSGI, it would be nice to make use of it to refactor 
> the current http support. The following are important omissions in the 
> current stdlib.
> 
>   - Asynchronous http client/server support (use asyncore? twisted?)
>   - SSL support in threaded http servers
>   - Asynchronous SSL support
>   - Simple client file upload support
>   - HTTP header parsing support, e.g. language codes, quality lists, etc
>   - Simple object publishing framework?
> 
> Addressing all of the above would be a significant piece of work. And
> IMHO, it is only achievable by staying focussed on http and NOT 
> addressing requirements such as
> 
>   - Content processing, e.g. html tidy, html parsing, css parsing
>   - Foreign script language parsing or execution
>   - Page templating API
> 
> I think it would be a good idea to address these concerns in separate PEPs.
> 
> [Guido van Rossum]
>  > I certainly don't want this potential effort to keep us from adding
>  > the low-hanging fruit (wsgiref, with perhaps some tweaks as PJE can
>  > manage based on recent feedback here) to the 2.5 stdlib.
> 
> Completely agreed. Any web-related PEPs are going to take a long time, 
> and are unlikely to be ready in time for 2.5.
> 
> Regards,
> 
> Alan.



Re: [Web-SIG] WSGI in standard library

2006-02-17 Thread Bill Janssen
> What would we be PEPing?

Well, when we started, I made up a list of various things that seem to
be missing in the standard library, like server-side support for
SSL-encrypted socket connections.

http://www.parc.com/janssen/web-sig/needed.html

Bill


Re: [Web-SIG] WSGI in standard library

2006-02-17 Thread Bill Janssen
> - CSS Parser
> 
>I think John Lee has done some work on this?  Beats me.  I've never 
> felt any need for CSS parsing personally.
> 

If you are doing any work with the Web (spidering, for instance), and
need to do rendering of the Web pages (say, for pop-out prism, or
building an ebook from it), a CSS parser is pretty much necessary.  I
tend to think that you really can't understand the Web
programmatically without a CSS parser.

An ECMAscript interpreter would be a big help, too.

> - Asynchronous fetch
> 
>Not sure how this would work.  What kind of async networking can we 
> do in a cross-platform manner?

I ran into this building the Plucker web spider.  You need to be able
to issue requests for web pages (or other resources) and come back
from time to time to see if they had loaded yet.  I think the addition
of generators might be useful here.
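
A rough sketch of how generators could express "issue a request, come
back later".  No network is involved: fake_fetch is a hypothetical
stand-in that pretends a page needs a fixed number of polls to load,
and poll_all is an invented name for the loop that checks on each one.

```python
def fake_fetch(url, ticks):
    """Simulated fetch: pretends the page needs `ticks` polls to load."""
    for _ in range(ticks):
        yield None            # not loaded yet; give control back
    return "<contents of %s>" % url


def poll_all(fetches):
    """Round-robin over pending fetches, collecting results as they finish."""
    pending = dict(fetches)   # url -> generator
    done = {}
    while pending:
        for url, gen in list(pending.items()):
            try:
                next(gen)     # "come back and see if it has loaded yet"
            except StopIteration as stop:
                done[url] = stop.value
                del pending[url]
    return done
```

Each fetch gives up control whenever it has nothing new to report, so
one thread can keep many requests in flight.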

> - Connection caching
> 
>Seems somewhat complex.  I'm not sure it is appropriate for the 
> standard library.  Maybe httplib2 or another such project could take 
> this on, but it doesn't seem like a standard library task.

My philosophy of the standard library is that it should keep working
when you lean on it, not break and force you to find something else
when you lean on it.  Apparently (and oddly) that seems to be a
somewhat unusual way of thinking about it.

> - Server-side SSL support
> 
>What is the current state of this?  It's mostly there, isn't it?

Not there at all, as far as I know.  The standard library only
supports client-side SSL.

> - A standard interface to request data
> 
>We have this with WSGI.  It's not a pretty request object, but it's 
> standardized.  I think there's room for a standard object-like interface 
> to the WSGI environment (not comprehensive, but it doesn't need to be).

Yes, I think this is covered by WSGI.
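
As an illustration of the "object-like interface" idea, a thin wrapper
over the WSGI environ might look like this.  The class name Request and
its attributes are assumptions for the sketch, not an existing API.

```python
from urllib.parse import parse_qs


class Request:
    """Hypothetical read-only wrapper around a WSGI environ dict."""

    def __init__(self, environ):
        self.environ = environ

    @property
    def method(self):
        return self.environ.get("REQUEST_METHOD", "GET")

    @property
    def path(self):
        return (self.environ.get("SCRIPT_NAME", "")
                + self.environ.get("PATH_INFO", ""))

    @property
    def query(self):
        # Maps each parameter name to a list of its values.
        return parse_qs(self.environ.get("QUERY_STRING", ""))

    def header(self, name, default=None):
        # WSGI exposes most request headers as HTTP_* keys, upper-cased,
        # with "-" replaced by "_".  Content-Type and Content-Length are
        # the exceptions (CONTENT_TYPE, CONTENT_LENGTH) and are not
        # handled here.
        key = "HTTP_" + name.upper().replace("-", "_")
        return self.environ.get(key, default)
```

It is deliberately not comprehensive; anything it lacks is still
reachable through the environ attribute.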

> - A standard server framework on the order of Medusa
> 
>Well, we have BaseHTTPServer, and now a WSGI based server.

So, does either work as well as Medusa?  Surely BaseHTTPServer doesn't.

> - Explicit cookie handling
> 
>I guess I brought this one up?  I'm not sure what I mean ;)

I had to do some client-side cookie handling lately...  There are some
missing pieces, like cookie databases that automatically sync with the
user's browser cookie repositories.

I don't believe there's any support for server-side cookie handling either.
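
For what it's worth, the stdlib Cookie module (http.cookies in Python 3)
can at least parse and emit cookie headers, though it may not be the
fuller support meant here.  A minimal sketch; the helper names
read_cookies and set_cookie_header are invented for the example.

```python
from http.cookies import SimpleCookie


def read_cookies(environ):
    """Parse the Cookie request header from a WSGI environ into a dict."""
    jar = SimpleCookie()
    jar.load(environ.get("HTTP_COOKIE", ""))
    return {name: morsel.value for name, morsel in jar.items()}


def set_cookie_header(name, value, path="/"):
    """Build a (header-name, header-value) pair for start_response."""
    jar = SimpleCookie()
    jar[name] = value
    jar[name]["path"] = path
    return ("Set-Cookie", jar[name].OutputString())
```

Session storage, expiry policy and signing are left to the application.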

Bill






Re: [Web-SIG] WSGI in standard library

2006-02-16 Thread Bill Janssen
> Thinking about this some more, it's beginning to sound to me like the
> server-side web support in the standard library needs a proper review
> and possible rework: it's slowly decohering/kipplizing.
> 
> Maybe we need a PEP, so that we can all discuss the subject
> (rationally ;-) and sort out all of the issues before we go ahead and
> commit anything?

Great idea!  That's exactly what I thought when I organized this SIG a
couple of years ago.

Bill


Re: [Web-SIG] WSGI in standard library

2006-02-06 Thread Bill Janssen
> Instead, I think the right approach is to continue with the existing 
> approach: put the most basic possible WSGI server in the standard 
> library, for educational purposes only, and a warning that it shouldn't 
> really be used for production purposes.

I strongly disagree with this thinking.  Non-production code shouldn't
go into the stdlib; instead, Alan's proposed module should go onto
some pedagogical website somewhere with appropriate tutorial
documentation.

Bill



[Web-SIG] WSGI for Medusa?

2005-12-30 Thread Bill Janssen
If no one has done a WSGI implementation for Medusa, I think I'll take
a shot at it this weekend...

Bill


Re: [Web-SIG] JavaScript libraries

2005-05-03 Thread Bill Janssen
From the Python point of view, it might be interesting to have a
Javascript (isn't it properly called ECMAscript?) interpreter that
functions like one of the Python HTML parsers.  That is, every time
some javascript action happens, there's the opportunity to interpose
some Python code.
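
For comparison, this is the callback style of the stdlib HTML parser
that such an interpreter would mimic.  ScriptFinder is a hypothetical
subclass; a Javascript interpreter built the same way would call Python
handlers on script events rather than on tags.

```python
from html.parser import HTMLParser


class ScriptFinder(HTMLParser):
    """Collects inline script bodies via parser callbacks."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Called for text between tags; keep only text inside <script>.
        if self.in_script:
            self.scripts[-1] += data
```

Feeding it a page leaves the script sources in the scripts list, which
is where Python code gets its chance to look at them.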

Bill


[Web-SIG] WSGI for Medusa?

2005-04-18 Thread Bill Janssen
Anyone done a WSGI module for Medusa yet?

Bill


Re: [Web-SIG] Re: Just lost another one to Rails

2005-04-15 Thread Bill Janssen
Greg,

> a buddy of mine down in the States was switching
> to Ruby, after using Python for two years, because he and his
> colleagues needed a lightweight, ready-out-of-the-box web app
> framework

We could discuss what lightweight means, I guess, but IMO there is a
light-weight, pure-Python, ready-to-use Python web app framework,
called Medusa.  No other installs necessary -- you don't have to get
some database running unless you really want to.  Works remarkably
well; I've written a few app servers in it already.  Perhaps your
buddy is actually switching for some other reason.

Bill



Re: [Web-SIG] Pure Python HTML?

2005-04-13 Thread Bill Janssen
> This works for small scale projects where only a few developers are 
> expected to know the codebase.

Sure.  It was a small scale example.  For larger projects you'd use
more abstraction layers, accessing (for example) template strings via
method calls which would provide the ability to do things like i18n
manipulation.
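
A sketch of what "template strings via method calls" could look like.
The class Templates and its catalog layout are assumptions made up for
the example, not part of any existing library.

```python
class Templates:
    """Hypothetical template store: strings are reached through a method,
    so a per-language lookup can be slotted in without touching callers."""

    def __init__(self, catalog, default_lang="en"):
        self.catalog = catalog          # {lang: {name: template string}}
        self.default_lang = default_lang

    def get(self, name, lang=None):
        by_name = self.catalog.get(lang or self.default_lang, {})
        if name not in by_name:
            # Fall back to the default language when no translation exists.
            by_name = self.catalog[self.default_lang]
        return by_name[name]

    def render(self, name, values, lang=None):
        return self.get(name, lang) % values
```

Callers only ever ask for a template by name, so translations can be
added to the catalog later without changing them.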

> The idea of ZCML is for programmers to 
> be able to reconfigure or extend the behavior of other people's code 
> without having to change, or hopefully even fully understand, that code 
> itself.

Sound engineering principles, modularity and abstraction.  Now let's
glue those modules together with Python rather than with XML.

> I don't think this discussion will go anywhere though, as your position 
> seems to be too extreme in this respect to easily move out of. :)

Gosh, I barely have a position on this, really.  I'm just interested,
on this mailing list, in improving ways of helping Python-savvy
engineers provide and use Web services.

Bill



Re: [Web-SIG] Re: Just lost another one to Rails

2005-04-13 Thread Bill Janssen
> Buggy? I don't think ZCML is buggy. Where's that coming from?

Sorry, didn't mean to knock ZCML specifically.  I meant to say that
use of XML is inherently buggy when people have to edit it with a text
editor, because of the bad syntax.  I have the same gripe with the XUL
used by Firefox.  Nothing specific to the design of ZCML (which I
haven't even seen).

> And lose 
> interoperability, accessibility by a host of programmers that *don't* 
> know your codebase, and code yourself onto an island.

Sorry, Martijn, none of these arguments have much weight with me.  I'm
interested in improving the ability to do things *with Python*.  And
that doesn't (for me) mean switching to something else, no matter how
many other people understand that "something else".

Bill




Re: [Web-SIG] Pure Python HTML?

2005-04-12 Thread Bill Janssen
> I don't know about you, but generating HTML with pure Python code can be
> messy--ONE reason why we introduce templating languages in the first
> place. Often (not always) the best way to end up with XHTML is to start
> with a valid or almost-valid XML document and then infuse the dynamic
> content.

Indeed.  And in Python I do it with string formatting:

template = """


%(title)s


%(title)s
Author:  %(author)s
something interesting here

"""

dynamic_content = {}
# fill in dynamic content here, or perhaps it's a dict read from a DB
dynamic_content['title'] = 'How to write a Web service'
dynamic_content['author'] = 'Someone Good'

request.reply(template % dynamic_content)
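
The archive appears to have stripped the markup from the template
above.  Here is a runnable sketch of the same approach, with the tags
filled in as a guess and print standing in for request.reply (which
belongs to whatever server framework is in use).  The html.escape call
and the render helper are additions for the sketch, not part of the
original snippet.

```python
from html import escape

# The markup below is assumed; the original tags were lost.
template = """<html>
<head>
<title>%(title)s</title>
</head>
<body>
<h1>%(title)s</h1>
<p>Author:  %(author)s</p>
<p>something interesting here</p>
</body>
</html>
"""


def render(content):
    """Escape each value, then substitute it into the template."""
    safe = {key: escape(value) for key, value in content.items()}
    return template % safe


page = render({"title": "How to write a Web service",
               "author": "Someone Good"})
print(page)
```

Escaping the values first keeps a stray "<" in the data from being
read as markup.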


Bill


Re: [Web-SIG] Re: Just lost another one to Rails

2005-04-12 Thread Bill Janssen
> The minimal Zope 3 code is a page template and a few lines of ZCML in a 
> Python package with an empty __init__.py to hook up a new view to an 
> existing object (say, a folder). There's no Python code *at all*

From my point of view, that's the problem.  I don't want to write in
some cumbersome and buggy XML format (which is what I'm guessing ZCML
is) when I could be writing clean Python code.

Bill


Re: [Web-SIG] A query for hosting providers

2005-03-28 Thread Bill Janssen
> * Long running processes are hard to maintain (assuming we rule out 
> CGI).  Code becomes stale, maybe the server process gets in a bad state.
> Sometimes processes become wedged.  With mod_python this can affect
> the entire site.

I've been extremely impressed at how well Python's VM does at this.  I
run Medusa-based services for months at a time without trouble -- in
fact, they run fine till the machine is rebooted.  These servers are
doing multithreaded text and graphics manipulation with regular
expressions and PIL.  They often run Linux scripts in subprocesses.
Wedges are extremely rare, and I have yet to see one caused by Python
code.

Bill