Re: [Web-SIG] A 'shutdown' function in WSGI
On Wed, 2012-02-22 at 09:06 +1100, Graham Dumpleton wrote:

If you want to be able to control a thread like that from an atexit callback, you need to create the thread as daemonised, i.e. call setDaemon(True) on the thread. By default a thread will actually inherit the daemon flag from its parent. For a command-line Python, where the thread is created from the main thread, it will not be daemonised, which is why the thread will be waited upon at shutdown before atexit is called.

If you ran the same code in mod_wsgi, my memory is that the thread will actually inherit as being daemonised, because the request handlers in mod_wsgi, from which the import is triggered, are notionally daemonised. Thus the code should work in mod_wsgi. Even so, to be portable, if you want to manipulate a thread from atexit, make it daemonised.

An example of background threads in mod_wsgi at http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode#Monitoring_For_Code_Changes shows the use of setDaemon().

Graham

I've read all the messages in this thread and the traffic on the bug entry at http://bugs.python.org/issue14073 but I'm still not sure what to tell people who want to invoke code at shutdown. Do we tell them to use atexit? If so, are we saying that atexit is sufficient for all user-defined shutdown code that needs to run, save for code that needs to stop threads? Is it sufficient to define shutdown as when the process associated with the application exits? It still seems to not necessarily be directly correlated.

- C

___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] A 'shutdown' function in WSGI
On Mon, 2012-02-20 at 17:39 -0500, PJ Eby wrote:

The standard way to do this would be to define an optional server extension API supplied in the environ; for example, a 'x-wsgiorg.register_shutdown' function.

Unlikely, AFAICT, as shutdown may happen when no request is active. Even if this somehow happened to not be the case, asking the application to put it in the environ is not useful, as the environ can't really be relied on to retain values up the call stack.

- C

The wsgi.org wiki used to be the place to propose these sorts of things for standardization, but it appears to no longer be a wiki, so the mailing list is probably a good place to discuss such a proposal.

On Mon, Feb 20, 2012 at 2:30 PM, Tarek Ziadé ziade.ta...@gmail.com wrote:

oops my examples were broken, should be:

    def hello_world_app(environ, start_response):
        status = '200 OK'  # HTTP Status
        headers = [('Content-type', 'text/plain')]
        start_response(status, headers)
        return ['Hello World']

    def shutdown():  # or maybe something else as an argument I don't know
        do_some_cleanup()

and:

    $ gunicorn myapp:hello_world_app myapp:shutdown

Cheers
Tarek
Re: [Web-SIG] A 'shutdown' function in WSGI
On Mon, 2012-02-20 at 20:54 -0500, PJ Eby wrote:

2012/2/20 Chris McDonough chr...@plope.com

On Mon, 2012-02-20 at 17:39 -0500, PJ Eby wrote:

The standard way to do this would be to define an optional server extension API supplied in the environ; for example, a 'x-wsgiorg.register_shutdown' function.

Unlikely, AFAICT, as shutdown may happen when no request is active. Even if this somehow happened to not be the case, asking the application to put it in the environ is not useful, as the environ can't really be relied on to retain values up the call stack.

Optional server extension APIs are things that the server puts in the environ, not things the app puts there. That's why it's 'register_shutdown', e.g. environ['x-wsgiorg.register_shutdown'](shutdown_function).

I get it now, but I still don't think it's the right thing. Servers can shut down without ever having handled a request.

- C
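To make the disagreement concrete, here is a rough sketch of how such a server-supplied hook might look. Only the key name 'x-wsgiorg.register_shutdown' comes from the thread; `make_server_environ`, `run_shutdown` and the rest are hypothetical stand-ins for whatever a real server would do.

```python
def make_server_environ(shutdown_hooks):
    """Hypothetical server side: expose a registration hook in the environ."""
    return {
        "REQUEST_METHOD": "GET",
        "PATH_INFO": "/",
        "x-wsgiorg.register_shutdown": shutdown_hooks.append,
    }


cleaned_up = []
_registered = False


def cleanup():
    # Application-defined shutdown work.
    cleaned_up.append(True)


def app(environ, start_response):
    global _registered
    register = environ.get("x-wsgiorg.register_shutdown")
    # The extension is optional, so the app must cope with its absence,
    # and it should register its hook only once.
    if register is not None and not _registered:
        register(cleanup)
        _registered = True
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello World"]


def run_shutdown(shutdown_hooks):
    """Hypothetical server side: call every registered hook at shutdown."""
    for hook in shutdown_hooks:
        hook()
```

The sketch also shows the objection: the application only sees the hook when a request arrives, so a server that shuts down before handling any request never gives it the chance to register anything.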
[Web-SIG] PEP3333 and PATH_INFO
Perennial topic, it seems, from the archives. As far as I can tell from PEP , every WSGI application that wants to run on both Python 2 and Python 3 and which uses PATH_INFO will need to define a helper function something like this:

    import sys

    def decode_path_info(environ, encoding='utf-8'):
        PY3 = sys.version_info[0] == 3
        path_info = environ['PATH_INFO']
        if PY3:
            # Native str tunnels the raw bytes as latin-1; undo that first.
            return path_info.encode('latin-1').decode(encoding)
        else:
            # On Python 2 the value is already a byte string.
            return path_info.decode(encoding)

Is there a more elegant way to handle this?

- C
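For readers unfamiliar with why the latin-1 step is needed, the following Python 3 sketch shows the round trip. PEP 3333 has servers hand the application a native string whose code points are the raw request bytes (ISO-8859-1 decoding); `tunnel_path` and `recover_path` are illustrative names, not part of any specification.

```python
def tunnel_path(raw_bytes):
    """Server side on Python 3: raw bytes become a latin-1 'native' str."""
    return raw_bytes.decode("latin-1")


def recover_path(path_info, encoding="utf-8"):
    """Application side: get the bytes back, then decode them properly."""
    return path_info.encode("latin-1").decode(encoding)


wire = "/caf\u00e9".encode("utf-8")          # b'/caf\xc3\xa9' on the wire
environ = {"PATH_INFO": tunnel_path(wire)}   # two mojibake characters in the str
decoded = recover_path(environ["PATH_INFO"])
```

The latin-1 step is lossless in both directions because every byte value maps to exactly one code point below 256.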
[Web-SIG] wsgi server...
Does anyone know of a pure-Python WSGI server that:

- Is distributed independently from a web framework or larger whole.
- Runs on UNIX and Windows.
- Runs on both Python 2 and Python 3.
- Has good test coverage.
- Is useful in production.

(I sent this already to the Pylons-discuss maillist and got some good responses, so not ignoring those, just want to ask a wider audience.)

Thanks!

- C
Re: [Web-SIG] PEP 444 != WSGI 2.0
On Sun, 2011-01-02 at 09:21 -0800, Guido van Rossum wrote:

Graham, I hope that you can stop being grumpy about the process that is being followed and start using your passion to write up a critique of the technical merits of Alice's draft. You don't have to attack the whole draft at once -- you can start by picking one or two important issues and try to guide a discussion here on web-sig to tease out the best solutions. Please understand that given the many different ways people use and implement WSGI there may be no perfect solution within reach -- writing a successful standard is the art of the compromise. (If you still think the process going forward should be different, please write me off-list with your concerns.)

Everyone else on this list, please make a new year's resolution to help the WSGI 2.0 standard become a reality in 2011.

I think Graham mostly has an issue with this thing being called WSGI 2. FTR, avoiding naming arguments is why I titled the original PEP Web3. I knew that if I didn't (even though personally I couldn't care less if it was called Buick or McNugget), people would expend effort arguing about the name rather than concentrate on the process of creating a new standard. They did anyway of course; many people argued publicly, wishing to rename Web3 to WSGI2. On balance, though, I think giving the standard a neutral name before it's widely accepted as a WSGI successor was (and still is) a good idea, if only as a conflict avoidance strategy. ;-)

That said, I have no opinion on the technical merits of the new PEP 444 draft; I've resigned myself to using derivatives of PEP forever. It's good enough. Most of the really interesting stuff seems to happen at higher levels anyway, and the benefit of a new standard doesn't outweigh the angst caused by trying to reach another compromise. I'd suggest we just embrace it, adding minor tweaks as necessary, until we reach some sort of technical impasse it doesn't address.
- C
Re: [Web-SIG] PEP 444
PEP 444 has no champion currently. Both Armin and I have basically left it behind. It would be great if you wanted to be its champion.

- C

On Sun, 2010-11-21 at 03:12 -0800, Alice Bevan-McGregor wrote:

(A version of this is available at http://web-core.org/2.0/pep-0444/; links are links, code may be easier to read.)

PEP 444 is quite exciting to me. So much so that I’ve been spending a few days writing a high-performance (C10K, 10Krsec) Py2.6+/3.1+ HTTP/1.1 server which implements much of the proposed standard. The server is functional (less web3.input at the time of this writing), but differs from PEP 444 in several ways. It also adds several features I feel should be part of the spec.

Source for the server is available on GitHub: https://github.com/pulp/marrow.server.http

I made several notes about the PEP 444 specification while implementing the above, and have concerns over some implementation details:

First, async is poorly defined:

If the origin server advertises that it has the web3.async capability, a Web3 application callable used by the server is permitted to return a callable that accepts no arguments. When it does so, this callable is to be called periodically by the origin server until it returns a non-None response, which must be a normal Web3 response tuple.

Polling is not true async. I believe that it should be up to the server to define how async is utilized, and that the specification should be clarified on this point. (“Called periodically” is too vague.) “Callable” should likely be redefined as “generator” (a callable that yields) as most applications require holding on to state and wrapping everything in functools.partial() is somewhat ugly.
Utilizing generators would improve support for existing Python async frameworks, and allow four modes of operation: yield None (no response, keep waiting), yield response_tuple (standard response), return / raise StopIteration (close the async connection) and allow for data to be passed back to the async callable by the higher-level async framework.

Second, WSGI middleware, while impressive in capability, are somewhat… heavy-weight. Heavily nesting function calls is wasteful of CPU and RAM, especially if the middleware decides it can’t operate, for example, GZip compression disabling itself for non-text/ mimetypes. The majority of WSGI middleware can, and probably should be, implemented as linear ingress or egress filters. For example, on-disk static file serving could be an ingress filter, and GZip compression an egress filter. m.s.http supports this filtering and demonstrates one API for such. Also, I am in the process of writing an example egress CompressionFilter.

An example API and filter use implementation (paraphrased from marrow.server.http):

    # No filters, near 0 overhead.
    for filter_ in ingress_filters:
        # Can mutate the environment.
        result = filter_(env)

        # Allow the filter to return a response rather than continuing.
        if result:
            # result is a status, headers, body_iter tuple
            return result[0], result[1], result[2]

    status, headers, body = application(env)

    for filter_ in egress_filters:
        # Can mutate the environment, status, headers, body, or
        # return completely new status, headers, and body.
        status, headers, body = filter_(env, status, headers, body)

    return status, headers, body

The environment has some minor issues. I’ll write up my changes in RFC-style:

SERVER_NAME is REQUIRED and MUST contain the DNS name of the server OR virtual server name for the web server if available OR an empty bytestring if DNS resolution is unavailable.

SERVER_ADDR is REQUIRED and MUST contain the web server’s bound IP address.
URL reconstruction SHOULD use HTTP_HOST if available, SERVER_NAME if there is no HTTP_HOST, and fall back on SERVER_ADDR if SERVER_NAME is an empty bytestring.

CONTENT_LENGTH is REQUIRED and MUST be None if not defined by the client. Testing explicitly for None is more efficient than armoring against missing values; also, explicit is better than implicit. (Paste’s WSGI1 server defines CONTENT_LENGTH as 0, but this implies the client explicitly declared it as zero, which is not the case.)

FRAGMENT and PARAMETERS are REQUIRED and are parsed out of the URL in the same way as the QUERY_STRING. FRAGMENT is the text after a hash mark (a.k.a. “anchor” to browsers, e.g. /foo#bar). PARAMETERS come before QUERY_STRING, and after PATH_INFO separated by a semicolon, e.g. /foo;bar?baz. Both values MUST be empty bytestrings if not present in the URL. (Rarely used; I’ve only seen it in Java and ColdFusion applications, but still useful.)

Points of contention:

Changing the namespace seems needless. Using the wsgi.* namespace with a wsgi.version of (2, 0) will allow applications to easily
Re: [Web-SIG] PEP 444
On Sun, 2010-11-21 at 09:32 -0800, Alice Bevan-McGregor wrote:

PEP 444 has no champion currently. Both Armin and I have basically left it behind. It would be great if you wanted to be its champion.

Done. As I already have a functional, performant HTTP server[1] and example filter[2] (compression) utilizing a slightly modified version of PEP 444, and hope to be giving a presentation on its design and related utilities[3] early next year, I’d love to have the opportunity to directly shape its future. My server may be a bit large to be a reference implementation, but until it has its first user I have the benefit of being able to experiment whole-heartedly with features and proposals.

Since Python 3 was released I haven’t heard of much forward progress in getting web frameworks compatible. The largest complaint I’ve heard is that there are too few things already ported, which is a chicken-and-egg problem. This is one scenario where re-inventing the wheel may be the only way to see forward movement. So far, I seem to be buckling down and Getting Things Done™ in this regard.

How would I go about getting access to the PEP in order to fix the issues I’ve been catching up on? (I’ve been reading through quite a bit of old mailing list traffic these last few hours in-between writing docs and unit tests for the compression egress filter.)

Georg Brandl has thus far been updating the canonical PEP on python.org. I don't know how you get access to that. My working copy is at https://github.com/mcdonc/web3 .

- C
Re: [Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?
On Sun, 2010-10-24 at 17:16 +0200, Georg Brandl wrote:

On 24.10.2010 at 16:40, Chris McDonough wrote:

On Sun, 2010-10-24 at 10:17 +0300, Armin Ronacher wrote:

I have to admit that my interest in Python 3 is not very high and I am most likely not the most reliable person when it comes to driving PEP 444 :)

We should probably withdraw the PEP, then (unless someone else wants to step up and champion it), because neither am I.

Don't give it up yet -- Deferring is probably the better option.

TBH, unless someone has immediate interest in championing it, I'd rather just withdraw it and let someone else resubmit it (or something like it) later if they want. It's just going to cause confusion if it's left in a zombie state without a champion.

- C
Re: [Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?
For what it's worth, I'm happy with the changes made to WSGI 1 that produced PEP . I'm unlikely to champion PEP 444 going forward. It has already served its primary duty to me personally (which was to catalyze the formalization of some specification that is Python 3 inclusive).

However, Armin may feel differently about it, so this doesn't constitute a withdrawal of PEP 444. I'm instead just signaling my own personal attitude: I don't really care as much now that there's something out there.

On Fri, 2010-10-22 at 10:35 +1100, Graham Dumpleton wrote:

Anyone care to comment on my blog post? http://blog.dscpl.com.au/2010/10/is-pep--final-solution-for-wsgi-on.html

As far as web framework developers commenting, Armin at: http://www.reddit.com/r/Python/comments/du7bf/is_pep__the_final_solution_for_wsgi_on_python/ has said:

Hopefully not. WSGI could do better and there is a proposal for that (444).

So, it looks like he is very cool on the idea. No other developers of actual web frameworks have commented at all on PEP from what I can see.

Graham
Re: [Web-SIG] PEP 444 (aka Web3)
I have some pending changes to the PEP 444 spec (the working copy is at http://github.com/mcdonc/web3/blob/master/pep-0444.rst but please don't consider that canonical in any sense, it will change before an official republication of the proposal). The modifications fold in most of what we've talked about on the list, or at least acknowledge the issues; a change log is contained near the top.

However, I'm currently trying to work through what to do about offering up quoted PATH_INFO and SCRIPT_NAME values (quoted in the sense that, at least on platforms that support it, these would be the original values before being run through urllib.unquote). The current published proposal on Python.org indicates that these would go into web3.path_info and web3.script_name, but nobody seems to much like that because it would make things like path_info_pop hard (the code would need to keep two data structures in sync, and would need to be pretty magical in the face of %2F markers).

The pending, unpublished proposal turns SCRIPT_NAME and PATH_INFO into *quoted* values, and adds a ``web3.path_requoted`` flag for debugging purposes, which will be True if the SCRIPT_NAME and/or PATH_INFO needed to be recomposed and requoted (e.g. on CGI platforms). But private conversations lead me to believe that not many folks will like this either, because it commandeers CGI names that are well understood to be unquoted.

The only sensible way to break the deadlock seems to be to not use any CGI names in the specification at all, so as not to break people's expectations. I know that when I change it to not use any CGI names, it will be received poorly, but I can't think of a better idea.
- C

On Wed, 2010-09-15 at 19:03 -0400, Chris McDonough wrote:

A PEP was submitted and accepted today for a WSGI successor protocol named Web3: http://python.org/dev/peps/pep-0444/

I'd encourage other folks to suggest improvements to that spec or to submit a competing spec, so we can get WSGI-on-Python3 settled soon.

- C
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, 2010-09-16 at 05:29 +0200, Roberto De Ioris wrote:

About the *.file_wrapper removal, I suggest a PSGI-like approach where 'body' can contain a File Object.

    def file_app(environ):
        fd = open('/tmp/pippo.txt', 'r')
        status = b'200 OK'
        headers = [(b'Content-type', b'text/plain')]
        body = fd
        return body, status, headers

I don't see why this couldn't work as long as middleware didn't convert the body into something not-file-like. But it is really an implementation detail of the origin server (it might specialize when the body is a file), and doesn't really need to be in the spec.

or

    def file_app(environ):
        fd = open('/tmp/pippo.txt', 'r')
        status = b'200 OK'
        headers = [(b'Content-type', b'text/plain')]
        body = [b'Header', fd, b'Footer']
        return body, status, headers

This won't work, as the body is required to be an iterable which yields bytes, and cannot be an iterable which yields either bytes or other iterables (it must be a flat sequence).

- C
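One way an application could still combine a header, a file and a footer while keeping the body flat is to chain the pieces itself. This is only a sketch of that idea: `flat_body` is a made-up helper, and an in-memory `io.BytesIO` stands in for a file opened in binary mode.

```python
import io
import itertools


def flat_body(*parts):
    """Chain bytes and file-like parts into one flat iterable of bytes."""
    # Wrap plain bytes in a one-item list; file objects iterate line by line.
    iterables = [[p] if isinstance(p, bytes) else p for p in parts]
    return itertools.chain.from_iterable(iterables)


def file_app(environ):
    fd = io.BytesIO(b"line one\nline two\n")  # stand-in for open(path, 'rb')
    status = b"200 OK"
    headers = [(b"Content-type", b"text/plain")]
    body = flat_body(b"Header\n", fd, b"Footer\n")
    return body, status, headers
```

The server then sees a single iterable that yields only bytes, so it never has to inspect nested items, at the cost of losing any file-specific optimisation it might have applied to a bare file object.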
Re: [Web-SIG] PEP 444 (aka Web3)
On Sun, 2010-09-19 at 21:52 -0400, Chris McDonough wrote:

I'm -0 on the server trying to guess the Content-Length header. It just doesn't seem like much of a burden to place on an application and it's easier to specify that an application must do this than it is to specify how a server should behave in the face of a missing Content-Length. I also believe Graham has argued against making the server guess, I presume this causes him some pain somehow (probably underspecification in WSGI).

Graham's issues with requiring the server to set Content-Length are detailed here: http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html
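As an illustration of how small the burden on the application is, here is a sketch of a Web3-style application that sets Content-Length itself. It assumes the (body, status, headers) return order and bytes values used elsewhere in this thread; `hello_app` is a made-up name.

```python
def hello_app(environ):
    payload = b"Hello World"
    status = b"200 OK"
    headers = [
        (b"Content-type", b"text/plain"),
        # The application knows the payload size, so it states it explicitly.
        (b"Content-Length", str(len(payload)).encode("ascii")),
    ]
    return [payload], status, headers
```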
Re: [Web-SIG] PEP 444 (aka Web3)
On Fri, 2010-09-17 at 19:47 +0300, Ionel Maries Cristian wrote:

I don't like this proposal at all. Besides having to go through the bytes craziness the design is pretty backwards for middleware and asynchronous applications.

We've acknowledged in other messages to this thread that the web3.async red herring is speculative, and Armin has indicated that if he does not find a champion willing to create a reference implementation for it today, it will be taken out. This doesn't help async people, but it also doesn't harm them (no difference from WSGI really). Personally, I hope nobody steps up and we just rip it out. ;-)

I'm not sure why you characterize using bytes as "bytes craziness". We have been using strings as byte sequences in WSGI for over five years. Python itself draws an equivalence between the Python 3 bytes type and Python 2 str (bytes is aliased to str under Python 2). I'm not really sure why we shouldn't take advantage of that equivalence, and why people are so enamored of treating envvar values, headers, and such as text, other than the brokenness of the Python 3 stdlib urllib stuff.

IMO, WSGI/Web3 isn't really a programming platform (or at least if it is, it is destined to be a pretty crappy one), it's just a connection protocol, so any "it's more typing" or "it's ugly" argument seems pretty thin to me. I'd personally rather have it be more general and less easy to use than potentially broken in some corner case circumstance.

- C
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, 2010-09-16 at 12:01 -0500, Ian Bicking wrote:

Well, reiterating some things I've said before:

* This is clearly just WSGI slightly reworked, why the new name?

The PEP says Web3 is clearly a WSGI derivative; it only uses a different name than WSGI in order to indicate that it is not in any way backwards compatible.

I don't really care what the name is. My experience in various communities suggests that naming the new totally-bw-incompat thing the same as the old thing weakens both the new thing and the old thing, but.. whatever. I just don't care much.

* Why byte values in the environ? No one has offered any real reason they are better than native strings. I keep asking people to offer a reason, *and no one ever does*. It's just hyperbole and distraction. Frankly I'm feeling annoyed. So far my experience makes me believe using native strings will make it easier to port and support libraries across 2 and 3.

I'm sorry you're annoyed. I chose bytes here mainly out of ignorance and fear. This is an extremely low level protocol, and I just literally don't know how we can sanely convert environ values to Unicode without some loss of control or potential for incorrect decoding without having server encoding configuration. You say it's easy and straightforward, and that's fine. I just haven't internalized enough specification to know.

I'd very much encourage folks who want to use native strings to create another PEP: it's just a lot easier to argue about one thing than it is to argue endlessly in snippets on blogs and epic maillist threads. I could care less if this *particular* PEP is selected, to be honest. Let's just get it over within a process where there's at least some chance of resolution.

* It makes sense to me that the error stream should accept both bytes and unicode, and should do a best effort to handle either. Getting encoding errors or type errors when logging an error is very distracting.

Sounds good.
* Instead of focusing on Response(*response_tuple), I'd rather just rely on something like Response.from_wsgi(response_tuple). Body first feels very unnatural.

Others have said same, also good.

* Regarding long response headers, I think we should ignore the HTTP spec. You can put 4k in a Set-Cookie header, such headers aren't easily or safely folded... I think the line length constraint in the HTTP spec isn't a constraint we need to pay attention to.

OK.

- C
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, 2010-09-16 at 14:04 -0400, P.J. Eby wrote:

At 10:35 AM 9/16/2010 -0700, Guido van Rossum wrote:

No comments on the rest except to note that at this point it looks unlikely that we can make everyone happy (or even get an agreement to adopt what would be the long-term technically optimal solution -- AFAICT there is no agreement on what that solution would be, if one weren't to take porting Python 2 code into account). IOW something/somebody has gotta give.

Indeed. This entire discussion has pushed me strongly in favor of doing a super-minimalist update to PEP 333 with the following points:

Right on, write it all down! ;-)

- C
Re: [Web-SIG] PEP 444 (aka Web3)
On Wed, 2010-09-15 at 20:05 -0400, P.J. Eby wrote:

At 07:03 PM 9/15/2010 -0400, Chris McDonough wrote:

A PEP was submitted and accepted today for a WSGI successor protocol named Web3: http://python.org/dev/peps/pep-0444/

I'd encourage other folks to suggest improvements to that spec or to submit a competing spec, so we can get WSGI-on-Python3 settled soon.

The first thing I notice is that web3.async appears to force all existing middleware to delete it from the environment if it wishes to remain compatible, unless it adapts to support receiving callables itself.

We can ditch everything concerning web3.async as far as I'm concerned. Ian has told me that this feature won't be liked by the async people anyway, as it doesn't have a trigger mechanism.

On further reading I see you have something about middleware disabling itself if it doesn't support async execution, but this doesn't make any sense to me: if it can't support async execution, why wouldn't it just delete web3.async from the environ, forcing its wrapped app to be synchronous instead?

I'm also not a fan of the bytes environ, or the new path_info/script_name variables; note that the spec's sample CGI implementation does not itself provide the new variables, and that middleware must be explicitly written to handle the case where there is duplication.

I'm not concerned about which environment variables have it, but I would definitely like to be able to get at the original (non-%2F-decoded) path info somewhere. I'd be fine if PATH_INFO was just that, and get rid of web3.path_info. web3.script_name is probably just a mistake entirely.

My main fear with this spec is that people will assume they can just make a few superficial changes to run WSGI code on it, when in fact it is deeply incompatible where middleware is concerned. In fact, AFAICT, it seems like it will be *harder* to write correct web3 middleware than it is to write correct WSGI middleware now.

I'm very willing to drop web3.async entirely.
It seems reasonable to do so. I should have done so before I mailed the spec, as I knew it would be unpopular.

This seems like a step backward, since the whole idea behind dropping start_response() was to make correct middleware *easier* to write. Any time a spec makes something optional or allows More Than One Way To Do It, it immediately doubles the minimum code required to implement that portion of the spec in compliant middleware.

This spec has two optionalities: web3.async, and the optional path_info/script_name, so the return handling of every piece of middleware is doubled (or else environ['web3.async'] = False must be added at the top), and any code that modifies paths must similarly ditch the special variables or do double work to update them.

No worries, let's get rid of both, with the caveat that it's pretty essential (to me anyway) to be able to get at the non-%2F-decoded path somewhere. The most sensible thing to me would be to put it in PATH_INFO.

As far as bytes vs. strings, whatever, we have to pick one. Bytes makes more sense to me. I'll leave it to the native-string and/or unicode people to create their own spec.

- C
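The opt-out PJE mentions, adding environ['web3.async'] = False at the top of a piece of middleware, might look roughly like this. It is a sketch under the draft's assumptions (a single-argument application returning a (body, status, headers) tuple); `uppercase_middleware` and `inner_app` are invented for the example.

```python
def uppercase_middleware(app):
    """Hypothetical synchronous-only Web3 middleware."""
    def wrapped(environ):
        # Tell the wrapped application that async responses are not
        # supported here, so it must return a normal response tuple.
        environ["web3.async"] = False
        body, status, headers = app(environ)
        return (chunk.upper() for chunk in body), status, headers
    return wrapped


def inner_app(environ):
    return [b"hello"], b"200 OK", [(b"Content-type", b"text/plain")]
```

Without that first line, the middleware would also have to handle the case where `app(environ)` returns a callable instead of a tuple, which is the doubling of return handling described above.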
Re: [Web-SIG] [Python-Dev] Add PEP 444, Python Web3 Interface.
It's, e.g. b'8080' .. instead of the integer value 8080. Apparently the type of this value was not spelled out sufficiently in the WSGI spec and string values and integer values were used interchangeably, making it harder to join them with the other values in the environ (a common thing to want to do). Bytes instances are attractive, as the rest of the values are also bytes, so they can be joined together easily.

(I also redirected this to web-sig at the request of PJE).

- C

On Wed, 2010-09-15 at 17:02 -0700, John Nagle wrote:

On 9/15/2010 4:44 PM, python-dev-requ...@python.org wrote:

``SERVER_PORT`` must be a bytes instance (not an integer).

What's that supposed to mean? What goes in the bytes instance? A character string in some format? A long binary number? If the latter, with which byte ordering? What problem does this solve?

John Nagle

___ Python-Dev mailing list python-...@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40plope.com
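A small sketch of the joining argument, assuming bytes values for SERVER_NAME and SERVER_PORT as in the draft; `host_and_port` is just an illustrative helper.

```python
def host_and_port(environ):
    # Works only because both values are bytes; no str()/int() juggling.
    return b":".join([environ["SERVER_NAME"], environ["SERVER_PORT"]])


environ = {"SERVER_NAME": b"example.com", "SERVER_PORT": b"8080"}
netloc = host_and_port(environ)
```

If SERVER_PORT were the integer 8080 instead, the join would raise a TypeError, which is the kind of inconsistency the bytes requirement is meant to rule out.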
Re: [Web-SIG] WSGI for Python 3
On Fri, 2010-07-16 at 23:38 -0500, Ian Bicking wrote:

On Fri, Jul 16, 2010 at 9:43 PM, Chris McDonough chr...@plope.com wrote:

Nah, not nearly that hard:

    path_info = urllib.parse.unquote_to_bytes(environ['wsgi.raw_path_info']).decode('UTF-8')

I don't see the problem? If you want to distinguish %2f from /, then you'll do it slightly differently, like:

    path_parts = [
        urllib.parse.unquote_to_bytes(p).decode('UTF-8')
        for p in environ['wsgi.raw_path_info'].split('/')]

This second recipe is impossible to do currently with WSGI. So... before jumping to conclusions, what's the hard part with using text?

It's extremely hard to swallow Python 3's current disregard for the primacy of bytes at I/O boundaries. I'm trying, but I can't help but feel that the existence of an API like unquote_to_bytes is more symptom treatment than solution. Of course something that unquotes a URL segment unquotes it into bytes; it's the only sane default because URL segments found in URLs on the internet are bytes.

Yes, URL quoted strings should decode to bytes, though arguably it is reasonable to also use the very reasonable UTF-8 default that urllib.parse.quote/unquote uses. So it's really just a question of names: should that name be quote_to_string or quote_to_bytes? Which honestly... whatever.

After some careful consideration, I realize I'm only able to offer stop energy regarding the WSGI-as-text proposal, so I'll bow out of any maillist conversation about it for now.

- C
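The second recipe above can be tried out on Python 3 with a hand-built environ. Note that 'wsgi.raw_path_info' is the key proposed in this thread, not something PEP 3333 servers provide, and `path_segments` is a name chosen for the example.

```python
from urllib.parse import unquote_to_bytes


def path_segments(raw_path, encoding="utf-8"):
    # Split on literal slashes first, then unquote each segment, so an
    # encoded %2F stays inside its segment instead of becoming a separator.
    return [unquote_to_bytes(p).decode(encoding) for p in raw_path.split("/")]


environ = {"wsgi.raw_path_info": "/files/a%2Fb/caf%C3%A9"}
segments = path_segments(environ["wsgi.raw_path_info"])
```

Here the third segment comes out as the single name "a/b", which cannot be recovered once a server has already unquoted the whole path into PATH_INFO.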
Re: [Web-SIG] WSGI for Python 3
On Fri, 2010-07-16 at 17:11 -0500, Ian Bicking wrote: On Fri, Jul 16, 2010 at 5:08 PM, Chris McDonough chr...@plope.com wrote: On Fri, 2010-07-16 at 17:47 -0400, Tres Seaver wrote: In the past when we've gotten down to specifics, the only holdup has been SCRIPT_NAME/PATH_INFO, hence my suggestion to eliminate those. I think I favor PJE's suggestion: let WSGI deal only in bytes. I'd prefer that WSGI 2 was defined in terms of a bytes with benefits type (Python 2's ``str`` with an optional encoding attribute as a hint for cast to unicode str) instead of Python 3-style bytes. But if I had to make the Hobson's choice between Python 3 style bytes and Python 3 style str, I'd choose bytes. If I then needed to write middleware or applications, I'd use WebOb or an equivalent library to enable a policy which converted those bytes to strings on my behalf. Making it easy to write raw middleware or applications without using such a library doesn't seem as compelling a goal as being able to easily write one which allowed me direct control at the raw level. What are the concrete problems you envision with text request headers, text (URL-quoted) path, and text response status and headers? Documentation is the main reason. For example, the documentation for making sense of path_info segments in a WSGI that used unicodey-strings would, as I understand it, read something like this: The PATH_INFO environment variable is a string. To decode it, - First, split it on slashes:: segments = PATH_INFO.split('/') - Then turn each segment into bytes:: bytes_segments = [ bytes(x, encoding='latin-1') for x in segments ] - Then, de-encode each segment's urlencoded portions: urldecoded_segments = [ urllib.unquote(x) for x in bytes_segments ] - Then re-encode each urldecoded segment into the encoding expected by your application app_segments = [ str(x, encoding='utf-8') for x in urldecoded_segments ] .. 
note:: We decode from latin-1 above because WSGI tunnels the bytes representing the PATH_INFO by way of a string type which contains bytes as characters. That looks pretty apologetic to me, and to be honest, I'm not even sure it will work reliably in the face of existing/legacy applications which have emitted URLs that are not url-encoded properly if those old URLs need to be supported. http://bugs.python.org/issue8136 contains a variation on this theme. I'd much rather say be able to say: The PATH_INFO environment variable is a ``bytes-with-benefits`` type. To decode it: - First, split it on slashes:: segments = PATH_INFO.split('/') - Then, de-encode each segment's urlencoded portions: urldecoded_segments = [ urllib.unquote(x) for x in segments ] - Then re-encode each urldecoded segment into the encoding expected by your application app_segments = [ str(x, encoding='utf-8') for x in urldecoded_segments ] Let me know if I'm missing something. - C ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
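A rough Python 3 sketch of the four documented steps for the text-tunnelled case, assuming the server hands over PATH_INFO as a str whose characters are really bytes (latin-1 tunnelling) and that the path is still URL-quoted, as the message supposes. The function name and sample value are hypothetical, and `urllib.parse.unquote_to_bytes` stands in for the `urllib.unquote` call written in the message.

```python
from urllib.parse import unquote_to_bytes


def decode_tunnelled_path(path_info, app_encoding='utf-8'):
    """Follow the four steps from the message for a latin-1 tunnelled path."""
    # 1. Split the text path on slashes.
    segments = path_info.split('/')
    # 2. Recover the original bytes of each segment via latin-1.
    bytes_segments = [s.encode('latin-1') for s in segments]
    # 3. Undo URL quoting on the raw bytes.
    urldecoded_segments = [unquote_to_bytes(b) for b in bytes_segments]
    # 4. Decode into the encoding the application expects.
    return [u.decode(app_encoding) for u in urldecoded_segments]


# Sample value: the two UTF-8 bytes of an accented 'e' (0xC3 0xA9) arrive
# as two latin-1 characters, plus a quoted slash in the last segment.
tunnelled = '/caf\u00c3\u00a9/a%2Fb'
app_segments = decode_tunnelled_path(tunnelled)
```

The extra encode step in (2) is exactly the part the message finds apologetic; with a bytes-based PATH_INFO it would disappear.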
Re: [Web-SIG] WSGI for Python 3
On Sat, 2010-07-17 at 01:33 +0200, Armin Ronacher wrote: Hi, On 7/17/10 1:20 AM, Chris McDonough wrote: Let me know if I'm missing something. The only thing you miss is that the bytes type of Python 3 is badly supported in the stdlib (not an issue if we reimplement everything in our libraries, not an issue for me) and that the bytes type has no string formatting, which makes us do the encode/decode dance in our own implementations of some of the missing stdlib functions. This is why the docs mention bytes with benefits instead (like the Python 2 str type). The existence of such a type would be the result of us lobbying for its inclusion into some future Python 3, or at least the result of lobbying for a String ABC that would allow us to define our own. But.. yeah. Stdlib support for bytes. Dunno. What I really don't want to do is implement a WSGI spec in terms of Unicodey strings just because the webby stuff in the stdlib cannot deal with bytes. Those stdlib implementations should be changed to deal with bytes-ish things instead. I actually think fixing the stdlib will end up being a driver for the bytes with benefits type. Supporting such a type in the implementation of stdlib functions is clearly the right way to fix it in lots of cases, because they will be able to deal with BwB and Unicodey-strings in exactly the same way. In the meantime, I think using bytes is the only sane thing to do in some interim specification, because moving from a spec which is bytes-oriented to a spec that is text-oriented now will leave us in the embarrassing position of needing to create yet another bytes-oriented spec later (as, well, I/O is bytes), when Python 3 matures and realizes it needs such a hybrid type. - C
[Web-SIG] http://wiki.python.org/moin/WebFrameworks
http://wiki.python.org/moin/WebFrameworks seems to be the place where folks are registering their respective web frameworks. I'd like to move some of the frameworks in the various categories that haven't been active in a few years. In particular, I'd like to move any framework which hasn't had a release since the beginning of 2008 (arbitrary) into the Discontinued / Inactive framework category. I'd be willing to do the work to make sure I wasn't moving one that actually *did* have releases past that but just hadn't updated the page. Any dissent? - C
Re: [Web-SIG] [Paste] WebOb API
Ian Bicking wrote: Also I'm planning on introducing a BaseRequest (and *maybe* BaseResponse) class, that removes some functionality. Specifically for Repoze they'd like to remove __getattr__ and __setattr__ (which has some performance implications), FTR, after thinking about it, I'm not even sure BaseRequest is necessary for this purpose. This seems to work too (at least it gets previously visible setattr/getattr stuff out of the profiling info):

    class Request(WebobRequest):
        __setattr__ = object.__setattr__
        __getattr__ = object.__getattribute__
        __delattr__ = object.__delattr__
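To see what that override changes, here is a self-contained sketch. `WebobRequestStandIn` is a hypothetical stand-in written for this example, not WebOb's actual class; it only mimics the idea of attribute hooks that redirect unknown attributes into the environ (the `adhoc.` key prefix is likewise made up).

```python
class WebobRequestStandIn(object):
    """Hypothetical stand-in for a request class with attribute hooks."""

    def __init__(self, environ):
        # Bypass our own __setattr__ hook for the real instance attribute.
        object.__setattr__(self, 'environ', environ)

    def __getattr__(self, name):
        # Called only when normal lookup fails; fall back to the environ.
        try:
            return self.environ['adhoc.' + name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Every attribute assignment is redirected into the environ.
        self.environ['adhoc.' + name] = value


class Request(WebobRequestStandIn):
    # Restore plain object behaviour, skipping the Python-level hooks.
    __setattr__ = object.__setattr__
    __getattr__ = object.__getattribute__
    __delattr__ = object.__delattr__


hooked = WebobRequestStandIn({})
hooked.user = 'fred'   # lands in hooked.environ['adhoc.user']

plain = Request({})
plain.user = 'fred'    # lands in the instance __dict__; environ is untouched
```

The subclass keeps the same interface but stores attributes on the instance, so the hook functions no longer show up in profiles.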
Re: [Web-SIG] Session events
This is supported at least here: http://docs.repoze.org/session/usage.html#using-begin-and-end-subscribers Alastair Bell Turner wrote: Hi I've been looking through the range of choices for Python web [application] frameworks/libraries (Just to have all the bases covered) for a new build project and standardisation of some small utilities. There's one feature that I'm not finding and was just wanting to check on before considering the joys of rolling my own: I'm not finding any support for user session events, I'm particularly interested in being able to register a handler on session expiry or cleanup. I've mainly been looking at the lighter weight frameworks since my requirement for the new build is mainly aggregate and list operations, so the least suitable load for ORMs. Have I missed the feature session event somewhere? Thanks Alastair Bell Turner
Re: [Web-SIG] Request for Comments on upcoming WSGI Changes
OK, after some consideration, I think I'm sold. Answering my own original question about why unicode seems to make sense as values in the WSGI environment even without consideration for Python 3 compatibility: *something* needs to do this translation. Currently I personally rely on WebOb to do a lot of this translation. I can't think of a good reason that implementations at the level of WebOb would each need to do this translation work; pushing the job into WSGI itself seems to make sense here. This is particularly true for PATH_INFO and QUERY_STRING; these days it's foolish to assume these values will be entirely composed of low order characters, and thus being able to access them as bytes natively isn't very useful. OTOH, I suspect the Python 3 stdlib is still broken if it requires native strings in various places (and prohibits the use of bytes). James Bennett wrote: On Sun, Sep 20, 2009 at 11:25 PM, Chris McDonough chr...@plope.com wrote: WSGI is a fairly low-level protocol aimed at folks who need to interface a server to the outside world. The outside world (by its nature) talks bytes. I fear that any implied conversion of environment values and iterable return values to Unicode will actually eventually make things harder than they are now. I realize that it would make middleware implementors lives harder to need to deal in bytes. However, at this point, I also believe that middleware kinda should be hard. We have way too much middleware that shouldn't be middleware these days (some written by myself). Well, ordinarily I'd be inclined to agree: HTTP deals in bytes, so an interface to HTTP should deal in bytes as well. The problem, really is that despite being a very low-level interface, WSGI has a tendency to leak up into much higher-level code, and (IMO) authors of that high-level code really shouldn't have to waste their time dealing with details of the underlying low-level gateway. 
You've said you don't want to hear Python 3 as the reason, but it provides some useful examples: in high-level code you'll commonly want to be doing things like, say, comparing parts of the requested URL path to known strings or patterns. And that high-level code will almost certainly use strings, while WSGI, in theory, will be using bytes. That's just a recipe for disaster; if WSGI mandates bytes, then bytes will have to start infecting much higher-level code (since Python 3 -- rightly -- doesn't let you be nearly as promiscuous about mixing bytes and strings). Once I'm at a point where I can use Python 3, I know I'll personally be looking for some library which will normalize everything for me before I interact with it, precisely to avoid this sort of leakage; if WSGI itself would at least *allow* that normalization to happen at the low level (mandating it is another discussion entirely) I'd feel much happier about it going forward. ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] Request for Comments on upcoming WSGI Changes
I'll try to digest some of this, currently I'm pretty clueless. Personally, I find it a bit hard to get excited about Python 3 as a web application deployment platform. This is of course a personal judgment (I don't mean to slight Python 3) but at this point, I think I'll probably be writing software that targets 2.X exclusively for at least the next five years. Given this point of view, it would be extremely helpful if someone could explain to people with the same outlook why we should want to deal with Unicode strings in any WSGI specification. WSGI is a fairly low-level protocol aimed at folks who need to interface a server to the outside world. The outside world (by its nature) talks bytes. I fear that any implied conversion of environment values and iterable return values to Unicode will actually eventually make things harder than they are now. I realize that it would make middleware implementors' lives harder to need to deal in bytes. However, at this point, I also believe that middleware kinda should be hard. We have way too much middleware that shouldn't be middleware these days (some written by myself). Anyway, for us slower (and maybe wrongly fearful) folks, could someone summarize the benefits of having a WSGI specification that requires Unicode? Bonus points for an explanation that does not boil down to "it will be compatible with Python 3". - C Armin Ronacher wrote: Hello everybody, Thanks to Graham Dumpleton and Robert Brewer there is some serious progress on WSGI currently. I proposed a roadmap with some PEP changes now that need some input. Summary: WSGI 1.0 stays the same as PEP 0333 currently is; WSGI 1.1 becomes what Ian and I added to PEP 0333; WSGI 2.0 becomes a unicode powered version of WSGI 1.1; WSGI 3.0 becomes WSGI 2.0 just without start_response. WSGI 1.0 and 1.1 are byte based and nearly impossible to use on Python 3 because of changes in the standard library that no longer work with a byte-only approach. 
The PEPs themselves are here: http://bitbucket.org/ianb/wsgi-peps/ Neither the wording nor the changes in there are anywhere near final. Graham wrote down two questions he wants every major framework developer to answer. These should guide the way to new WSGI standards: 1. Do we keep bytes everywhere forever in Python 2.X, or try to introduce unicode there at all to at least mirror what changes might be made to make WSGI workable in Python 3.X? 2. Do we skip WSGI 1.X completely for Python 3.X and go straight to WSGI 2.0 for Python 3.X? I added a new question I think should be asked too: 3. Do we skip WSGI 2.0 as specified in the PEP and go straight to WSGI 3.0 and drop start_response? The following things became pretty clear when playing around with various specifications on Python 3: - Python 3 no longer implicitly converts between unicode and byte strings. This covers comparisons, the regular expression engine, all string functions and many modules in the stdlib. - The Python 3 stdlib radically moved to unicode for non unicode things as well (the http servers, http clients, url handling etc.) - A byte only version of WSGI appears unrealistic on Python 3 because it would require server and middleware implementors to reimplement parts of the standard library to work on bytes again. - unicode support can be added for WSGI on both Python 2.x and Python 3.x without removing functionality. Browsers are already doing a similar encoding trick as proposed by Graham Dumpleton to handle URLs. - Python 2.x already accepts unicode strings for many things such as URL handling thanks to the fact that unicode and byte strings are surprisingly interchangeable. - cgi.FieldStorage and some other parts are now totally broken on Python 3 and should no longer be used in 3.0 and 3.1 because it reads the response body into memory. This currently affects WebOb, Pylons and TurboGears. 
I sent this mail to every major framework / WSGI implementor so that we get input even if you're missing the discussion on web-sig. Regards, Armin ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/chrism%40plope.com ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] repoze.bfg web framework 1.0 released
On 7/5/09 10:37 PM, Graham Dumpleton wrote: The first major release of the BFG web framework (aka repoze.bfg), version 1.0, is available. See http://bfg.repoze.org/ for general information about repoze.bfg. ... - WSGI-based deployment: PasteDeploy and mod_wsgi compatible. ... - A comprehensive set of unit tests. The repoze.bfg package contains 11K lines of Python code. 8000 lines of that total line count is unit test code that tests the remaining 3000 lines. A question about your testing if you have time. Is this done in a fake WSGI hosting environment, ie., test harness, or is it able to be run through WSGI servers such as Paste server, Apache/mod_wsgi, etc, in some way? The tests I mentioned in there are mostly unit tests; they don't test any particular system configuration functionally. In particular, none of the tests actually invokes a request via a WSGI stack. But we do use functional testing in projects that use the framework. For example, we use Twill (created by Titus Brown) to make sure things don't break at the request/response level in this project: http://karlproject.org. Am curious from the point of view that standalone test suites for WSGI itself to run against WSGI hosting mechanisms don't really exist, so the test suite for BFG, with the presumption that it would exercise a lot of WSGI functionality, might be a good regression test for WSGI servers themselves. I think maybe some ACID test WSGI application could be built, and then some set of functional HTTP-level tests could be run against that application to gain confidence in a WSGI app. This is more or less what we do with Twill on that KARL project: the developers use the Paste#http server, but we actually deploy to a mod_wsgi server. We can (and do) run the Twill tests against both to get confidence that the app isn't going to fall over in production. 
- C ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] repoze.bfg web framework 1.0 released
On 7/5/09 11:44 PM, Randy Syring wrote: Chris, Sounds interesting. Question: Does it support some kind of module/plugin architecture that will allow me to develop plug- in functionality across projects? What would be called in Django an app. For example, I would like to have a news, blog, and calendar module that I can plug into different applications. The goal is to have everything for the module contained in one subdirectory or package including any configuration, routing, templates, controllers, model, etc. So, something like this: /modules/news/... /modules/calendar/... /modules/blog/... Or: packages/ MyProj NewsComponent CalendarComponent BlogComponent I'm not sure if I can do this topic justice here (many have fallen on the sword when approaching it before), but I'll try. Plugin apps is maybe less a feature of BFG than the stuff that BFG is built on top of. Like Zope, BFG makes use of the Zope Component Architecture under the hood. Unlike Zope, BFG tends to hide the ZCA (conceptually and API-wise) from developers, because the ZCA introduces concepts like adapters, interfaces, and utilities. Direct exposure to these concepts in user-visible code evokes suspicion in people who just don't have the problems they try to solve. The problems that the ZCA tries to solve usually revolve around code testability and reusability, and most people just don't care that much about these things. So BFG is more like Pylons or Django in this respect: it provides helper APIs and places to hang your code so that you can build a single-purpose application reasonably easily without making you think in terms of building anything reusable. The final application usually happens to be overrideable and extensible, but that's just a byproduct of using BFG, and doesn't really have very much to do with building a system out of plugins. In the meantime, the Zope Component Architecture is a fantastic system on which to build a *framework* (as opposed to an application). 
This is why BFG is built on top of it. If you are willing to use the ZCA conceptually and API-wise *in your application code*, it becomes straightforward to build reusable applications like you mention. So the answer to your original question is probably no. BFG itself isn't a system which allows you to slot arbitrary components into place and have them show up somewhere. It's instead a system (like Zope) in which you can build such a thing. In fact, many of the applications that we (my company, Agendaless) build are these kinds of applications, where we tend to want to reuse a single application component across many customers or projects. The trick is this: when you build pluggable applications, there's presumably something you're going to want to plug these applications into. I *think* this is the piece that most people are after when they talk about pluggable applications; they actually don't care too much about the applications themselves (because they'll build them themselves), it's the higher-level thing that gets plugged into that is of primary interest. For better or worse, systems like Plone, Drupal, and Joomla are examples of such an application framework. These systems allow you to build small pieces of functionality that drop in to some larger system. We've done lots of Zope and Plone work, and we know the downsides of the "plug this bit into the larger framework" pattern pretty well. We've found that it's useful to have the tools at hand to build miniature versions of such large frameworks, so we can quickly come up with a custom solution to some problem without fighting the framework (any particular framework) so much. BFG plus direct use of the ZCA in application code tends to let us avoid using the larger frameworks in favor of rolling our own (more focused, simpler) frameworks. 
Unfortunately, I don't have any simple example application code to show with respect to this pattern, because anything I could show here would be too trivial to be useful. More unfortunately, anything I can point you to that we've built using this pattern will probably be too large to understand in any reasonable amount of time (e.g. http://karlproject.org). This has always been the historical problem with trying to promote use of the ZCA for application code: until you work on a larger project that uses it right, it's just too abstract. So by the time you actually need it, it's too late and you've already invented your own mechanisms to do similar indirections. For those reasons, I think it would be a useful exercise to build some very simple system that took app plugins and just exposed them in some very concrete way to end users, even if it meant losing some presentation flexibility. Such a system could be created in any web framework, but using the ZCA inside the web framework for such a task is a no-brainer to me. Anyway, even this explanation is too
[Web-SIG] repoze.bfg web framework 1.0 released
Summary --- The first major release of the BFG web framework (aka repoze.bfg), version 1.0, is available. See http://bfg.repoze.org/ for general information about repoze.bfg. Details --- BFG is a Python web framework based on WSGI. It is inspired by Zope, Pylons, and Django. It makes use of a number of Zope technologies under the hood. BFG is developed as part of the more general Repoze project (http://repoze.org). It is released under the BSD-like license available from http://repoze.org/license.html . BFG version 1.0 represents one year of development effort. The first release of BFG, version 0.1, was made in July of 2008. Since then, roughly 80 pre-1.0 releases have been made. None of these pre-1.0 releases explicitly promised any backwards compatibility with any earlier release. Version 1.0, however, marks the first point at which the repoze.bfg API has been frozen. Future releases in the 1.X line guarantee API-level backward compatibility with 1.0. A backwards incompatibility with 1.0 at the API level in any future 1.X version will be considered a bug. More Details BFG contains moderate, incremental improvements to patterns found in earlier-generation web frameworks. It tries to make real-world web application development and deployment more fun, more predictable, and more productive. To this end, BFG has the following features: - WSGI-based deployment: PasteDeploy and mod_wsgi compatible. - Runs under Python 2.4, 2.5, and 2.6. - Runs on UNIX, Windows, and Google App Engine. - Full documentation coverage: no feature or API is undocumented. - A comprehensive set of unit tests. The repoze.bfg package contains 11K lines of Python code. 8000 lines of that total line count is unit test code that tests the remaining 3000 lines. - Sparse resource utilization: BFG has a small memory footprint and doesn't waste any CPU cycles. - Doesn't have an unreasonable set of dependencies: easy_install -ing repoze.bfg over broadband takes less than a minute. 
- Quick startup: a typical BFG application starts up in about a second. - Offers extremely fast XML/HTML and text templating via Chameleon (http://chameleon.repoze.org/). - Persistence-agnostic: use SQLAlchemy, raw SQL, ZODB, CouchDB, filesystem files, LDAP, or anything else which suits a particular application's needs. - Provides a variety of starter project templates. Each template makes it possible to quickly start developing a BFG application using a particular application stack. - Offers URL-to-code mapping like Django or Pylons' *URL routing* or like Zope's *graph traversal*, or allows a combination of both routing and traversal. This helps make it feel familiar to both Zope and Pylons developers. - Offers debugging modes for common development error conditions (for example, when a view cannot be found, or when authorization is being inappropriately granted or denied). - Allows developers to organize their code however they see fit; the framework is not opinionated about code structure. - Allows developers to write code that is easily unit-testable. Avoids using thread local data structures which hamper testability. Provides helper APIs which make it easy to mock framework components such as templates and views. - Provides an optional declarative context-sensitive authorization system. This system prevents or allows the execution of code based on a comparison of credentials possessed by the requestor against ACL information stored by a BFG application. - Behavior of an application built using BFG can be extended or overridden arbitrarily by a third-party developer without any modification to the original application's source code. This makes BFG a good choice for building frameworks and other extensible applications. - Zope and Plone developers will be comfortable with the terminology and concepts used by BFG; they are almost all Zope-derived. 
Excruciating Details Quick installation: easy_install -i http://dist.repoze.org/bfg/current repoze.bfg General support and information: http://bfg.repoze.org Tutorials http://docs.repoze.org/bfg/current/#tutorials Sample Applications http://docs.repoze.org/bfg/current/#sample-applications Detailed narrative and API documentation: http://docs.repoze.org/bfg/current Bug tracker: http://bfg.repoze.org/trac Maillist: http://lists.repoze.org/listinfo/repoze-dev IRC support: irc://irc.freenode.net#repoze repoze.bfg is developed primarily by Agendaless Consulting (http://agendaless.com) and a team of contributors. Special thanks to these people, without whom this release would not have been possible: Malthe Borch, Carlos de la Guardia, Chris Rossi, Shane Hathaway, Tom Moroz, Yalan Teng, Jason Lantz, Todd Koym, Jessica Geist, Hanno Schlichting, Reed O'Brien, Sebastien Douche, Ian Bicking, Jim Fulton, Martijn Faassen, Ben Bangert, Fernando Correa
Re: [Web-SIG] Prototype of wsgi.input.readline().
Graham Dumpleton wrote: As I think we all know, no one implements readline() for wsgi.input as defined in the WSGI specification. The reason for this is that stuff like cgi.FieldStorage would refuse to work and would just generate an exception. This is because cgi.FieldStorage expects to pass an argument to readline(). I haven't been keeping up on the issues this has caused wrt WSGI, but note that the reason that cgi.FieldStorage passes a size argument to readline is in order to prevent memory exhaustion when reading files that don't have any linebreaks (denial of service). See http://bugs.python.org/issue1112549 . So, although this is linked in the issues list for possible amendments to WSGI specification, there hasn't that I recall been a discussion on how readline() would be defined in any amendment or future version. In particular, would the specification be changed to either: 1. readline(size) where size argument is mandatory, or: 2. readline(size=-1) where size argument is optional. If the size argument is made mandatory, then it would parallel how read() function is defined, but this in itself would mean cgi.FieldStorage would break. This is because cgi.FieldStorage actually calls readline() with no argument as well as an argument in different places in the code. cgi.FieldStorage doesn't call readline() without an argument. cgi.parse_multipart does, but this function is not used by cgi.FieldStorage. I don't know if this changes anything. - C ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
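The denial-of-service concern above (a body with no line breaks) is the reason for passing a size to readline(). A minimal sketch of the idea, using `io.BytesIO` as a stand-in for `wsgi.input`; the helper name and the limit are made up for illustration, and this is not how cgi.FieldStorage itself is written.

```python
import io


def iter_bounded_lines(stream, max_line=1 << 16):
    """Yield pieces of the stream via readline(size).

    Each call returns at most max_line bytes, so a body without any
    line breaks cannot force one huge read into memory.
    """
    while True:
        chunk = stream.readline(max_line)
        if not chunk:
            break
        yield chunk


# Stand-in for wsgi.input: one short line, then a long run with no newline.
original = b'a' * 10 + b'\n' + b'b' * 25
chunks = list(iter_bounded_lines(io.BytesIO(original), max_line=10))
```

A caller that needs whole lines has to stitch pieces together until it sees a trailing newline, but it gets to decide when to give up.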
Re: [Web-SIG] Prototype of wsgi.input.readline().
Graham Dumpleton wrote: If the size argument is made mandatory, then it would parallel how read() function is defined, but this in itself would mean cgi.FieldStorage would break. This is because cgi.FieldStorage actually calls readline() with no argument as well as an argument in different places in the code. cgi.FieldStorage doesn't call readline() without an argument. cgi.parse_multipart does, but this function is not used by cgi.FieldStorage. I don't know if this changes anything. Not really, I should have said 'cgi' module as a whole rather than specifically cgi.FieldStorage. Given that people might be using cgi.parse_multipart in standard CGI, there would probably still be an expectation that it worked for WSGI. We can't really say that you can use cgi.FieldStorage but not cgi.parse_multipart. People will just expect all the normal tools people would use for this to work. Personally, I think parse_multipart should go away. It's not suitable for anything but toy usage. If people use it, and they expose their site to the world, arbitrary anonymous visitors can cause their Python's process size to grow arbitrarily. I don't think any existing well-known framework uses it, for this very reason. If it can't go away, and there's a problem due to the non-parity between parse_multipart's use and FieldStorage's use, I suspect the right answer is to change cgi.parse_multipart to pass in a size value for readline too. I probably should have done that when I made the patch. :-( - C
Re: [Web-SIG] Prototype of wsgi.input.readline().
Graham Dumpleton wrote: On 31/01/2008, Chris McDonough [EMAIL PROTECTED] wrote: Graham Dumpleton wrote: If the size argument is made mandatory, then it would parallel how read() function is defined, but this in itself would mean cgi.FieldStorage would break. This is because cgi.FieldStorage actually calls readline() with no argument as well as an argument in different places in the code. cgi.FieldStorage doesn't call readline() without an argument. cgi.parse_multipart does, but this function is not used by cgi.FieldStorage. I don't know if this changes anything. Not really, I should have said 'cgi' module as a whole rather than specifically cgi.FieldStorage. Given that people might be using cgi.parse_multipart in standard CGI, there would probably still be an expectation that it worked for WSGI. We can't really say that you can use cgi.FieldStorage but not cgi.parse_multipart. People will just expect all the normal tools people would use for this to work. Personally, I think parse_multipart should go away. It's not suitable for anything but toy usage. Not necessarily. Someone may see it as a trade off. The code itself says: This is easy to use but not much good if you are expecting megabytes to be uploaded -- in that case, use the FieldStorage class instead which is much more flexible. So comment implies it is easier to use and so some may think it is simpler for what they are doing if they are only dealing with small requests. Of course, it would probably be prudent if you know your requests are always going to be small to use LimitRequestBody in Apache, or a specific check on content length if handled in Python code, to block someone sending over sized requests intentionally to try and break things. Provided you did this, may be quite reasonable to use it in specific circumstances. Indeed. But then again, I doubt the casual user would be able to make this judgment and take the necessary precautions. 
This kind of user is likely the same class of user for whom cgi.FieldStorage is too hard (which it really isn't). - C
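The precaution mentioned in this exchange, a check on content length done in Python before anything reads the body, might look roughly like the sketch below. The names `limit_body` and `MAX_BODY` and the choice of a 413 response are assumptions made for illustration; none of them come from the cgi module or from any particular WSGI server. The sketch only inspects the declared CONTENT_LENGTH, so it does nothing for requests that arrive without one.

```python
MAX_BODY = 64 * 1024  # hypothetical cap in bytes, chosen for illustration


def limit_body(app, max_body=MAX_BODY):
    """Wrap a WSGI app; refuse requests whose declared body is too large."""

    def wrapper(environ, start_response):
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            length = -1  # malformed header: treat as unacceptable
        if length < 0 or length > max_body:
            start_response("413 Request Entity Too Large",
                           [("Content-Type", "text/plain")])
            return [b"request body too large\n"]
        # Declared size is acceptable: hand the request to the real app.
        return app(environ, start_response)

    return wrapper
```

With a guard like this in front of it, code that calls readline() without a size argument is at least bounded by the cap, which is the trade-off being argued about above.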
Re: [Web-SIG] HEAD requests, WSGI gateways, and middleware
I have applications that do detect the difference between a GET and a HEAD (they do slightly less work if the request is a HEAD request), so I suspect this is not a totally reasonable thing to add to the spec. Maybe the middleware that does what you're describing should be changed to deal with HEAD requests instead. In general, I don't think there is (or should be) any guarantee that an arbitrary middleware stack will work with an arbitrary application. Although that would be nice in theory, I suspect it would require a very complex protocol (more complex than what WSGI requires now). - C
Brian Smith wrote: My application correctly responds to HEAD requests as-is. However, it doesn't work with middleware that sets headers based on the content of the response body. For example, a gateway or middleware that sets ETag based on a checksum, Content-Encoding, Content-Length and/or Content-MD5 will produce wrong results by default. Right now, my applications assume that any such gateway or the first such middleware will change environ['REQUEST_METHOD'] from 'HEAD' to 'GET' before the application is invoked, and discard the response body that the application generates. However, many gateways and middleware do not do this, and PEP 333 doesn't have anything to say about it. As a result, a 100% WSGI 1.0-compliant application is not portable between gateways. I suggest that a revision of PEP 333 should require the following behavior:
1. WSGI gateways must always set environ['REQUEST_METHOD'] to 'GET' for HEAD requests. Middleware and applications will not be able to detect the difference between GET and HEAD requests.
2. For a HEAD request, a WSGI gateway must not iterate through the response iterable, but it must call the response iterable's close() method, if any. It must not send any output that was written via start_response(...).write() either.
Consequently, WSGI applications must work correctly, and must not leak resources, when their output is not iterated; an application should not signal or log an error if the iterable's close() method is invoked without any iteration taking place. Please add this issue to http://wsgi.org/wsgi/WSGI_2.0. Regards, Brian
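For concreteness, the behaviour Brian proposes for gateways can be sketched as a piece of middleware. The name `head_as_get` is made up for this example, and the sketch assumes the application calls start_response before returning its iterable; PEP 333 does not require that (a generator-based application only calls it on first iteration), which is one practical difficulty with the "must not iterate" rule.

```python
def head_as_get(app):
    """Hypothetical middleware: present HEAD as GET and drop the body."""

    def wrapper(environ, start_response):
        if environ.get("REQUEST_METHOD") != "HEAD":
            return app(environ, start_response)
        # Hide the HEAD from everything further down the stack.
        environ = dict(environ, REQUEST_METHOD="GET")
        result = app(environ, start_response)
        # Do not iterate the body, but still let the app release resources.
        close = getattr(result, "close", None)
        if close is not None:
            close()
        return []

    return wrapper
```

An application that does less work for HEAD, as described in the reply above, would lose that optimisation behind such a wrapper, since it can no longer see the original method.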
Re: [Web-SIG] [extension] x-wsgiorg.flush
On Oct 4, 2007, at 11:55 AM, Phillip J. Eby wrote: At 05:00 PM 10/4/2007 +0200, Manlio Perillo wrote: You are making a critical decision here. You are lowering the level of WSGI to match the level of average WSGI middleware programmers. No, we're just getting rid of legacy cruft that's hard to support correctly. There's a big difference. Getting the start_response dance down and understanding how it plays with middleware is *hard*. Even if we called it something other than WSGI 2.0 (which I don't think we should, because it really is an evolution), returning the three-tuple is the right thing to do. - C
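To make the contrast concrete, here is a small sketch of an application written in the three-tuple style under discussion, plus an adapter that exposes it through the existing start_response protocol. The three-tuple calling convention was only a proposal on this list, so the exact shape used here (status, headers, body iterable) and the names `hello` and `as_wsgi1` are assumptions for illustration.

```python
def hello(environ):
    """An application in the proposed style: no start_response callable."""
    body = b"Hello World\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    return "200 OK", headers, [body]


def as_wsgi1(app2):
    """Hypothetical adapter exposing a three-tuple app as a PEP 333 app."""

    def wsgi1_app(environ, start_response):
        status, headers, body = app2(environ)
        start_response(status, headers)
        return body

    return wsgi1_app
```

Middleware for the three-tuple form only has to unpack and repack a return value, which is the simplification being argued for.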
Re: [Web-SIG] Web Site Process Bus
On Jun 26, 2007, at 1:04 AM, Graham Dumpleton wrote: In Apache changing the certificates would need a complete restart of everything. Because the child processes aren't privileged they would not be able to trigger the main server to do so. This actually gets to one of my reservations about some of the stuff being discussed. That is, that the WSGI applications should even have any ability to control the underlying web server. In a shared web hosting environment using Apache, allowing such control is not practical as you don't want arbitrary users doing things to the server. If you are running Apache as a dedicated server for a single application that is a different matter however. Thus some aspects of what can be done via the bus would have to be controllable dependent on the environment in which one is running. At least with Apache, even initiating this sort of stuff from inside of a WSGI application may not make a great deal of sense even then. It would be far easier and preferable in Apache to use a suexec CGI script to accept the upload of the SSL certificate and then trigger a restart of Apache. So in the end the bus concept may be great for a pure Python system, but not so sure about a complicated mixed code system like Apache, especially where there may be better ways of handling it through other features of Apache.
There are also non-webbish processes like postgres, mysql, etc. that need to be treated as part of the application. I handle this currently by running all of the processes related to a specific project under a process controller (which happens to be implemented in Python, but that's beside the point, see http://www.plope.com/software/supervisor2/). The process controller is responsible for execing the child processes upon its own startup. It is also responsible for restarting children if they die, capturing their output (if any), and allowing sufficiently privileged users to start and stop each one independently. The only promise a subprocess must make to be managed is that it must be possible to start the process in the foreground (not under its own daemon manager). If a process bus is implemented I suspect it should be implemented at this kind of level. Actions could be registered for specific subprocess types to send some input to a pipe file descriptor, send a signal to the process, etc. It would also be possible to create some sort of dependency map between processes in a configuration that relates the actions of one process to another (restart process A if process B is restarted, send a signal S to process C if signal T is sent to process D, etc). - C
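The dependency map idea at the end of this message could be modelled as a simple rule table. This is only a sketch of the data structure: `DependencyMap`, its method names and the event strings are invented for illustration and are not taken from supervisor or from any process bus implementation, and nothing here actually starts or signals a process.

```python
from collections import defaultdict


class DependencyMap:
    """Hypothetical rule table: an event on one process -> actions on others."""

    def __init__(self):
        self._rules = defaultdict(list)

    def add_rule(self, source, event, target, action):
        """Record that `event` on `source` should cause `action` on `target`."""
        self._rules[(source, event)].append((target, action))

    def actions_for(self, source, event):
        """Return the (target, action) pairs triggered by an event."""
        return list(self._rules.get((source, event), []))


# The two examples from the message, expressed as rules.
deps = DependencyMap()
deps.add_rule("B", "restart", "A", "restart")
deps.add_rule("D", "signal:T", "C", "signal:S")
```

A process controller would consult `actions_for()` whenever it observes an event and then carry out each returned action itself.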
Re: [Web-SIG] Web Site Process Bus
On Jun 26, 2007, at 5:07 PM, Robert Brewer wrote: I think I'm mostly confused by the name process bus because it seems like the primary use case for something like this is where all of the applications share the same process space I don't see why it should be limited by that. The primary use case is anywhere site components and application components are interacting, that could benefit from a shared understanding (and control) of the state of the site. To me, that requires a common set of messages, but the transport mechanism for those messages should be flexible so that it's useful in both multithread and multiprocess architectures. Thank you. I see. This is a little too abstract for me to get my brain around, but I'll continue listening and maybe I'll get religion. ;-) - C
Re: [Web-SIG] html dom like javascript ?
You probably want elementtree (http://effbot.org/zone/element-index.htm).
Thanks for the rapid reply. I am familiar with a number of these and have searched the web documentation, but for the most part these appear to be parsers or things like http://www.acooke.org/andrew/writing/python-xml.html#code that are XML-centric and not HTML-related. I'm looking for something more HTML-specific that contains all the options for any HTML widget, like a form element with all of its options like style, css, and so forth. In other words I don't want to have to write my own XML file with all the HTML tags and options.
Jean-Paul Calderone wrote: On Thu, 17 Aug 2006 10:10:47 -0400, seth [EMAIL PROTECTED] wrote: Is there a python library which is analogous to javascript for creating html/xhtml documents? e.g.:
    hidden = document.createElement("input")
    hidden.setAttribute("type", "hidden")
    hidden.setAttribute("name", "active_flag_hidden_" + ctl)
    if( dirtyArray[ctl].checked == true) {
        hidden.setAttribute("value", 'N')
    } else {
        hidden.setAttribute("value", 'Y')
    }
    document.forms['listForm'].appendChild(hidden)
At least fifty. The DOM API is heavily standardized with hundreds of implementations in dozens of languages. http://python.org/doc/lib/module-xml.dom.html Jean-Paul
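As a rough illustration of Jean-Paul's point, the JavaScript in the question translates almost call for call to the standard library's xml.dom.minidom. The values of `ctl` and `checked` below are placeholders standing in for the page state the original script reads. Note that minidom is a generic XML DOM: it will accept any attribute name and knows nothing about which options a given HTML element supports, which is the part of the request it does not address.

```python
from xml.dom.minidom import getDOMImplementation

# Build a document whose root element plays the role of the form.
impl = getDOMImplementation()
doc = impl.createDocument(None, "form", None)
form = doc.documentElement
form.setAttribute("name", "listForm")

ctl = "1"        # placeholder for the control id used in the question
checked = True   # stands in for dirtyArray[ctl].checked

# Same sequence of DOM calls as the JavaScript version.
hidden = doc.createElement("input")
hidden.setAttribute("type", "hidden")
hidden.setAttribute("name", "active_flag_hidden_" + ctl)
hidden.setAttribute("value", "N" if checked else "Y")
form.appendChild(hidden)

markup = form.toxml()  # serialized form element with the hidden input inside
```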
Re: [Web-SIG] WSGI in standard library
On Feb 12, 2006, at 6:39 AM, Alan Kennedy wrote: So, I still think that only basic educational/playpen servers should go in the standard library, with an indication that the user should pick an openly server from outside the distro if they need to do serious server work. I agree 100%. Maybe if there were no production-ready servers in the standard library, there would be no need for a Python Security Response Team. As an example, it's currently possible to perform denial of service on any framework/server that uses cgi.FieldStorage. See http://sourceforge.net/tracker/?func=detail&aid=1112549&group_id=5470&atid=105470 . The cgi module probably doesn't belong in the stdlib in the first place, but it's in there, and now things depend on it. In the meantime, this patch *really* should have been applied by now but hasn't been. If anyone has checkin access, or can help me poke the appropriate person, it would help... this was reported to the SRT at the time. - C
Re: [Web-SIG] My original template API proposal
Although I've been trying to follow this thread, I'm finding it difficult to get a handle on what is meant to *call* the template API (e.g. what typically calls render in Ian's ITemplatePlugin interface at http://svn.pythonpaste.org/home/ianb/templateapi/interface.py)? Is the framework meant to call render? Sorry for the remedial question ;-) - C On Feb 5, 2006, at 5:19 PM, Phillip J. Eby wrote: At 02:46 PM 2/5/2006 -0600, Ian Bicking wrote: Ian Bicking wrote: def render(template_instance, vars, format="html", fragment=False): Here I can magically turn this into a WEB templating spec: def render(template_instance, vars, format="html", fragment=False, wsgi_environ=None, set_header_callback=None) wsgi_environ is the environ dictionary if this is being called in a WSGI context. set_header_callback can be called like set_header_callback(header_name, header_value) to write such a header to the response. Frameworks may or may not allow for setting headers. If they don't allow for it, they shouldn't provide that callback (thus headers will not be mysteriously thrown away -- instead they will be rejected immediately). [Should set_header_callback('Status', '404 Not Found') be used, or a separate callback, or...?] This follows what all server pages templates I know of do. That is, they do not have special syntax related to any metadata (i.e., headers) or even any special syntax related to web requests. Instead the web request is represented through some set of variables available in the template. Yes, but different template systems offer different APIs based on it; the idea of using WSGI here was to make it possible for them to offer their *own*, native APIs under this spec, not to force the use of the host framework's API. The only thing that's missing from your proposal is streaming control or large file support.
I'll agree that it's an edge use case, but it seems to me just as easy to just offer a plain WSGI interface and not have to document a bunch of differences and limitations. OTOH, if this is what it takes to get consensus, so be it. The additional advantage to using plain ol' WSGI as the calling interface, however, is that it also lets you embed *anything* as a template, including whole applications if they provide a template engine whose syntax is actually the application's configuration. Anyway, the only differences I'm aware of between what you're proposing and what I'm proposing are: 1. Syntax sugar (each proposal sweetens different use cases) 2. Feature restrictions (yours takes away streaming) 3. What's optional (you consider WSGI optional, I want strings to be optional) It would be better, I think, to focus further discussion on the actual points of difference. Regarding #2, I'm willing to compromise to get consensus. Regarding #3, I'd be willing to compromise by making *both* optional, with clearly defined variations of the spec so that plugins and frameworks that support each are clearly distinguishable. This would also mean that we'd both be able to get the syntaxes we want under #1.
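To help picture who calls what in Ian's variant, here is a minimal sketch of a plugin-side `render` with the signature quoted in this message, using string.Template as a stand-in template engine. The signature is from the thread; the body is an assumption: the choice of Content-Type values, the wrapping of non-fragment HTML, and the fact that `wsgi_environ` is accepted but unused are all illustrative decisions, not part of any agreed spec.

```python
from string import Template


def render(template_instance, vars, format="html", fragment=False,
           wsgi_environ=None, set_header_callback=None):
    """Render a string.Template; report the content type if allowed."""
    # The framework supplies the callback only if it permits header setting.
    if set_header_callback is not None:
        content_type = "text/html" if format == "html" else "text/plain"
        set_header_callback("Content-Type", content_type)
    body = template_instance.substitute(vars)
    if fragment or format != "html":
        return body
    # A full page rather than a fragment: wrap it in a minimal document.
    return "<html><body>" + body + "</body></html>"
```

In this reading the framework is the caller: it chooses the template, builds `vars`, decides whether to pass a header callback, and sends the returned string as the response body.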
Re: [Web-SIG] Standardized template API
One specific concern about returning the published object for publisher-based frameworks is that often the published object has references to other objects that might not make sense in the context of the thread handling the rendering of the template. For example, if you're using a thread pool behind a Twisted server, and the thing doing the rendering is in the main thread, methods hanging off of the published object might try to make use of thread-local storage, which would fail. Zope 3 uses thread-local storage for request objects, IIRC. This might be a nonissue, because I'm a little fuzzy on which component(s) actually do(es) the rendering of the template in the models being proposed. But the amount of fuzziness I have about what's trying to be specified here makes me wonder if there aren't better things to go specify. As I mentioned in my counter-proposal, there should probably be a key like 'wti.source' to contain either the object to be published (for publisher-oriented frameworks) or a dictionary of variables (for controller-oriented frameworks). I originally called it published object, but that's biased towards publisher frameworks so perhaps a more neutral name like 'source' or 'data' would be more appropriate.
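The thread-local concern can be shown in a few lines. In this sketch `request_store` stands in for whatever thread-local slot a framework uses for the current request (the name is made up); a value stored by one thread is simply absent when code running in another thread looks for it.

```python
import threading

request_store = threading.local()  # stands in for a framework's request slot


def render_in_other_thread():
    """Run a lookup in a worker thread and report what it could see."""
    seen = []

    def worker():
        # Thread-local attributes set elsewhere do not exist in this thread.
        seen.append(getattr(request_store, "request", None))

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return seen[0]


request_store.request = {"PATH_INFO": "/index"}  # set in the calling thread
visible_here = request_store.request             # the dict stored above
visible_there = render_in_other_thread()         # None: other thread sees nothing
```

A published object whose methods rely on such a slot would hit the `None` case if the template were rendered on a different thread from the one that handled the request.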
Re: [Web-SIG] transaction progress with cgi.FieldStorage
An aside on cgi.FieldStorage itself. It reads data using readline instead of reading in blocks of limited size. Doing this, I think, means a file with very long lines (20MB, 100MB, ...) could cause excessive memory consumption. This was reported and solved a long time ago (but not yet fixed in any Python distro): https://sourceforge.net/tracker/?func=detail&aid=1112549&group_id=5470&atid=105470
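The general remedy for the unbounded-readline problem is to pass a size limit on every call, so that no single call can return more than a fixed number of bytes. The sketch below shows the technique on an in-memory stream; `iter_bounded_lines` and `MAX_LINE` are names invented for this example, and this is not the code of the patch referenced above.

```python
import io

MAX_LINE = 1 << 16  # hypothetical per-call cap in bytes


def iter_bounded_lines(stream, max_line=MAX_LINE):
    """Yield pieces of at most max_line bytes, ending at newlines when possible."""
    while True:
        piece = stream.readline(max_line)
        if not piece:
            return  # end of stream
        yield piece


# A "line" longer than the cap comes back in several bounded pieces.
sample = io.BytesIO(b"a" * 10 + b"\nshort\n")
pieces = list(iter_bounded_lines(sample, max_line=4))
```

The caller has to be prepared for a piece that does not end in a newline, which signals that the logical line continues in the next piece.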
Re: [Web-SIG] WSGI deployment use case
On Tue, 2005-07-26 at 01:18 -0500, Ian Bicking wrote: Well, the stack is really just an example, meant to be more realistic than sample1 and sample2. I actually think it's a very reasonable example, but that's not really the point. Presuming this stack, how would you configure it? I typically roll out software to clients using a build mechanism (I happen to use pymake at http://www.plope.com/software/pymake/ but anything dependency-based works). I write generic build scripts for all of the software components. For example, I might write makefiles that check out and build python, openldap, mysql and so on (each into a non-system location). I leave a bit of room for customization in their build definitions that I can override from within a profile. A profile is a set of customized software builds for a specific purpose. I might have, maybe, 3 different profiles for each customer where the profile usually works out to be tied to machine function (load balancer, app server, database server). I maintain these build scripts and the profiles in CVS for each customer. I never install anything by hand, I always change the buildout and rerun it if I need to get something set up. This usually works out pretty well because to roll out a new major version of software, I just rerun the build scripts for a particular profile and move the data over. Usually the only things that need to change frequently are a few bits of software that are checked out of version control, so doing cvs up on those bits typically gets me where I need to be unless it's a major revision. So in this case, I'd likely write a build that either built Apache from source or at least created an httpd-includes file meant to be referenced from within the system Apache config file with the proper stuff in it given the profile's purpose. The build would also download and install Python, it would get the proper eggs and/or Python software and the database, and so forth.
All the configuration would be done via the profile which is in version control. I don't know if this kind of thing works for everybody, but it has worked well for me so far. I do this all the time, and I have a good library of buildout scripts already so it's less painful for me than it might be for someone who is starting from scratch. That said, it is time-consuming and imperfect... upgrades are the most painful. New installs are simple, though. So, anyway, the short answer is I write a script to do the config for me so I can repeat it on demand. - C
Re: [Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config
Thanks... I'm still confused about high level requirements so please try to be patient with me as I try get back on track. These are the requirements as I understand them: 1. We want to be able to distribute WSGI applications and middleware (presumably in a format supported by setuptools). 3. We want to be able to configure a WSGI application in order to create an application instance. 2. We want a way to combine configured instances of those applications into pipelines and start an instance of a pipeline. Are these requirements the ones being discussed? If so, which of the config file formats we've been discussing matches which requirement? Thanks, - C On Sun, 2005-07-24 at 22:24 -0400, Phillip J. Eby wrote: At 08:35 PM 7/24/2005 -0400, Chris McDonough wrote: Sorry, I think I may have lost track of where we were going wrt the deployment spec. Specifically, I don't know how we got to using eggs (which I'd really like to, BTW, they're awesome conceptually!) from where we were in the discussion about configuring a WSGI pipeline. What is a feature? What is an import map? Entry point? Should I just get more familiar with eggs to understand what's being discussed here or did I miss a few posts? I suggest this post as the shortest architectural introduction to the whole egg thang: http://mail.python.org/pipermail/distutils-sig/2005-June/004652.html It explains pretty much all of the terminology I'm currently using, except for the new terms invented today... Entry points are a new concept, invented today by Ian and myself. Ian proposed having a mapping file (which I dubbed an import map) included in an egg's metadata, and then referring to named entries from a pipeline descriptor, so that you don't have to know or care about the exact name to import. The application or middleware factory name would be looked up in the egg's import map in order to find the actual factory object. I took Ian's proposal and did two things: 1) Generalized the idea to a concept of entry points. 
An entry point is a name that corresponds to an import specification, and an optional list of extras (see terminology link above) that the entry point may require. Entry point names exist in a namespace called an entry point group, and I implied that the WSGI deployment spec would define two such groups: wsgi.applications and wsgi.middleware, but a vast number of other possibilities for entry points and groups exist. In fact, I went ahead and implemented them in setuptools today, and realized I could use them to register setup commands with setuptools, making it extensible by any project that registers entry points in a 'distutils.commands' group. 2) I then proposed that we extend our deployment descriptor (.wsgi file) syntax so that you can do things like: [foo from SomeProject] # configuration here What this does is tell the WSGI deployment API to look up the foo entry point in either the wsgi.middleware or wsgi.applications entry point group for the named project, according to whether it's the last item in the .wsgi file. It then invokes the factory as before, with the configuration values as keyword arguments. This proposal is of course an *extension*; it should still be possible to use regular dotted names as section headings, if you haven't yet drunk the setuptools kool-aid. But, it makes for interesting possibilities because we could now have a tool that reads a WSGI deployment descriptor and runs easy_install to find and download the right projects. So, you could potentially just write up a descriptor that lists what you want and the server could install it, although I think I personally would want to run a tool explicitly; maybe I'll eventually add a --wsgi=FILENAME option to EasyInstall that would tell it to find out what to install from a WSGI deployment descriptor. 
That would actually be pretty cool, when you realize it means that all you have to do to get an app deployed across a bunch of web servers is to copy the deployment descriptor and tell 'em to install stuff. You can always create an NFS-mounted cache directory where you put pre-built eggs, and EasyInstall would just fetch and extract them in that case. Whew. Almost makes me wish I was back in my web apps shop, where this kind of thing would've been *really* useful to have.
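The core of the entry point idea, a name that maps to an import specification which is resolved only when needed, can be tried out with the standard library's importlib.metadata, which postdates this thread and is used here purely as a convenient stand-in for the setuptools machinery being described. The entry below is hypothetical: it points at `json:dumps` only so the sketch runs without any egg being installed, and it says nothing about the optional list of extras.

```python
from importlib.metadata import EntryPoint

# Hypothetical entry: the name "dumps" maps to the import spec "json:dumps".
# A stdlib target is used only so the sketch runs without installing anything.
entry = EntryPoint(name="dumps", value="json:dumps", group="wsgi.applications")

# Resolving the entry point imports the module and fetches the attribute.
factory = entry.load()
```

In a real deployment the entries would come from the installed project's metadata rather than being constructed by hand, and the loaded object would be a WSGI application or middleware factory.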
Re: [Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config
Actually, let me give this a shot. We package up an egg called helloworld.egg. It happens to contain something that can be used as a WSGI component. Let's say it's a WSGI application that always returns 'Hello World'. And let's say it also contains middleware that lowercases anything that passes through before it's returned. The implementations of these components could be as follows: class HelloWorld: def __init__(self, app, **kw): pass # nothing to configure def __call__(self, environ, start_response): start_response('200 OK', []) return ['Hello World'] class Lowercaser: def __init__(self, app, **kw): self.app = app # nothing else to configure def __call__(self, environ, start_response): for chunk in self.app(environ, start_response): yield chunk.lower() An import map would ship inside of the egg-info dir: [wsgi.app_factories] helloworld = helloworld:HelloWorld lowercaser = helloworld:Lowercaser So we install the egg and this does nothing except allow it to be used from within Python. But when we create a deployment descriptor like so in a text editor: [helloworld from helloworld] [lowercaser from helloworld] ... and run some starter script that parses that as a pipeline, creates the two instances, wires them together, and we get a running pipeline? Am I on track? OK, back to Battlestar Galactica ;-) On Mon, 2005-07-25 at 02:40 -0400, Chris McDonough wrote: BTW, a simple example that includes proposed solutions for all of these requirements would go a long way towards helping me (and maybe others) understand how all the pieces fit together. Maybe something like: - Define two simple WSGI components: a WSGI middleware and a WSGI application. - Describe how to package each as an indpendent egg. - Describe how to configure an instance of the application. - Describe how to configure an instance of the middleware - Describe how to string them together into a pipeline. - C On Mon, 2005-07-25 at 02:33 -0400, Chris McDonough wrote: Thanks... 
I'm still confused about high level requirements so please try to be patient with me as I try get back on track. These are the requirements as I understand them: 1. We want to be able to distribute WSGI applications and middleware (presumably in a format supported by setuptools). 3. We want to be able to configure a WSGI application in order to create an application instance. 2. We want a way to combine configured instances of those applications into pipelines and start an instance of a pipeline. Are these requirements the ones being discussed? If so, which of the config file formats we've been discussing matches which requirement? Thanks, - C On Sun, 2005-07-24 at 22:24 -0400, Phillip J. Eby wrote: At 08:35 PM 7/24/2005 -0400, Chris McDonough wrote: Sorry, I think I may have lost track of where we were going wrt the deployment spec. Specifically, I don't know how we got to using eggs (which I'd really like to, BTW, they're awesome conceptually!) from where we were in the discussion about configuring a WSGI pipeline. What is a feature? What is an import map? Entry point? Should I just get more familiar with eggs to understand what's being discussed here or did I miss a few posts? I suggest this post as the shortest architectural introduction to the whole egg thang: http://mail.python.org/pipermail/distutils-sig/2005-June/004652.html It explains pretty much all of the terminology I'm currently using, except for the new terms invented today... Entry points are a new concept, invented today by Ian and myself. Ian proposed having a mapping file (which I dubbed an import map) included in an egg's metadata, and then referring to named entries from a pipeline descriptor, so that you don't have to know or care about the exact name to import. The application or middleware factory name would be looked up in the egg's import map in order to find the actual factory object. I took Ian's proposal and did two things: 1) Generalized the idea to a concept of entry points. 
An entry point is a name that corresponds to an import specification, and an optional list of extras (see terminology link above) that the entry point may require. Entry point names exist in a namespace called an entry point group, and I implied that the WSGI deployment spec would define two such groups: wsgi.applications and wsgi.middleware, but a vast number of other possibilities for entry points and groups exist. In fact, I went ahead and implemented them in setuptools today, and realized I could use them to register setup commands with setuptools, making it extensible by any project that registers entry points in a 'distutils.commands' group. 2
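The Hello World example from earlier in this message can be made runnable with small adaptations, shown below as a sketch. Two things differ from the classes as posted and are assumptions of this version: the application takes no `app` argument, since it sits at the end of the pipeline, and the body is bytes so that it runs on Python 3. The "starter script" step is reduced to wiring the two objects together by hand.

```python
class HelloWorld:
    """Application: always answers 'Hello World'."""

    def __call__(self, environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"Hello World"]


class Lowercaser:
    """Middleware: lowercases every chunk produced by the wrapped app."""

    def __init__(self, app, **kw):
        self.app = app  # nothing else to configure

    def __call__(self, environ, start_response):
        for chunk in self.app(environ, start_response):
            yield chunk.lower()


# Wire the pipeline by hand: middleware first, application last.
pipeline = Lowercaser(HelloWorld())
body = b"".join(pipeline({}, lambda status, headers: None))
```

A deployment descriptor would describe exactly this wiring declaratively, leaving the construction calls to whatever tool reads it.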
Re: [Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config
Great. Given that, I've created the beginnings of a more formal specification: WSGI Deployment Specification - I use the term WSGI component in here as shorthand to indicate all types of WSGI implementations (application, middleware). The primary deployment concern is to create a way to specify the configuration of an instance of a WSGI component within a declarative configuration file. A secondary deployment concern is to create a way to wire up components together into a specific deployable pipeline. Pipeline Descriptors Pipeline descriptors are file representations of a particular WSGI pipeline. They include enough information to configure, instantiate, and wire together WSGI apps and middleware components into one pipeline for use by a WSGI server. Installation of the software which composes those components is handled separately. In order to define a pipeline, we use a .ini-format configuration file conventionally named 'something.wsgi'. This file may optionally be marked as executable and associated with a simple UNIX interpreter via a leading hash-bang line to allow servers which employ stdin and stdout streams (ala CGI) to run the pipeline directly without any intermediation. For example, a deployment descriptor named 'myapplication.wsgi' might be composed of the following text:: #!/usr/bin/runwsgi [mypackage.mymodule.factory1] quux = arbitraryvalue eekx = arbitraryvalue [mypackage.mymodule.factory2] foo = arbitraryvalue bar = arbitraryvalue Section names are Python-dotted-path names (or setuptools entry point names described in a later section) which represent factories. Key-value pairs within a given section are used as keyword arguments to the factory that can be used as configuration for the component being instantiated. All sections in the deployment descriptor describe 'middleware' except for the last section, which must describe an application. 
Factories which construct middleware must return something which is a WSGI callable by implementing the following API::
    def factory(next_app, [**kw]):
        next_app is the next application in the WSGI pipeline, **kw is optional, and accepts the key-value pairs that are used in the section as a dictionary, used for configuration
Factories which construct applications must return something which is a WSGI callable by implementing the following API::
    def factory([**kw]):
        **kw is optional, and accepts the key-value pairs that are used in the section as a dictionary, used for configuration
A deployment descriptor can also be parsed from within Python. An importable configurator which resides in 'wsgiref' exposes a function that accepts a single argument, configure::
    from wsgiref.runwsgi import parse_deployment
    appchain = parse_deployment('myapplication.wsgi')
'appchain' will be an object representing the fully configured pipeline. 'parse_deployment' is guaranteed to return something that implements the WSGI callable API described in PEP 333.
Entry Points
description of setuptools entry points goes here
On Mon, 2005-07-25 at 10:39 -0400, Phillip J. Eby wrote: At 03:02 AM 7/25/2005 -0400, Chris McDonough wrote: Actually, let me give this a shot. We package up an egg called helloworld.egg. It happens to contain something that can be used as a WSGI component. Let's say it's a WSGI application that always returns 'Hello World'. And let's say it also contains middleware that lowercases anything that passes through before it's returned. The implementations of these components could be as follows:
    class HelloWorld:
        def __init__(self, app, **kw):
            pass # nothing to configure
        def __call__(self, environ, start_response):
            start_response('200 OK', [])
            return ['Hello World']
I'm thinking that an application like this wouldn't take an 'app' constructor parameter, and if it takes no configuration parameters it doesn't need **kw, but good so far.
class Lowercaser: def __init__(self, app, **kw): self.app = app # nothing else to configure def __call__(self, environ, start_response): for chunk in self.app(environ, start_response): yield chunk.lower() Again, no need for **kw if it doesn't take any configuration, but okay. An import map would ship inside of the egg-info dir: [wsgi.app_factories] helloworld = helloworld:HelloWorld lowercaser = helloworld:Lowercaser I'm thinking it would be more like: [wsgi.middleware] lowercaser = helloworld:Lowercaser [wsgi.apps] helloworld = helloworld:HelloWorld and you'd specify it in the setup script as something like this: setup( #... entry_points = { 'wsgi.apps
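As a way of checking that the draft descriptor format hangs together, here is a sketch of a parser for it built on configparser. It is not the proposed 'parse_deployment' (no such function exists in wsgiref); `build_pipeline` and `resolve_dotted` are names invented for this example, only dotted-path section names are handled, and configuration values arrive as strings with lowercased keys because that is how configparser delivers them.

```python
import configparser
import importlib


def resolve_dotted(name):
    """Import 'package.module.attr' and return the attribute."""
    module_name, _, attr = name.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def build_pipeline(text, resolve=resolve_dotted):
    """Build a WSGI callable from descriptor text; last section is the app."""
    parser = configparser.ConfigParser()
    parser.read_string(text)  # a leading '#!' line is skipped as a comment
    sections = parser.sections()
    if not sections:
        raise ValueError("descriptor defines no components")
    # The final section is the application: factory(**kw).
    app = resolve(sections[-1])(**dict(parser[sections[-1]]))
    # Earlier sections are middleware, wrapped innermost-first:
    # factory(next_app, **kw).
    for name in reversed(sections[:-1]):
        app = resolve(name)(app, **dict(parser[name]))
    return app
```

The `resolve` parameter is there so that a different lookup, for instance one based on entry point names, could be swapped in without touching the wiring logic.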
Re: [Web-SIG] WSGI deployment use case
On Mon, 2005-07-25 at 20:29 -0500, Ian Bicking wrote:

We probably need something like a site map configuration, that can handle tree structure, and can specify pipelines on a per location basis, including the ability to specify pipeline components to be applied above everything under a certain URL pattern. This is more or less the same as my container API concept, but we are a little closer to being able to think about such a thing. It could also be something based on general matching rules, with some notion of precedence and how the rule affects SCRIPT_NAME/PATH_INFO. Or something like that.

How much of this could be solved by using a web server's directory/alias-mapping facility? For instance, if you needed a single Apache webserver to support multiple pipelines based on URL mapping, wouldn't it be possible in many cases to compose that out of things like rewrite rules and script aliases (the below assumes running them just as CGI scripts, obviously it would be different with something using mod_python or what-have-you):

    <VirtualHost *:80>
    ServerAdmin [EMAIL PROTECTED]
    ServerName plope.com
    ServerAlias plope.com
    ScriptAlias /viewcvs /home/chrism/viewcvs.wsgi
    ScriptAlias /blog /home/chrism/blog.wsgi
    RewriteEngine On
    RewriteRule ^/[^/]viewcvs*$ /home/chrism/viewcvs.wsgi [PT]
    RewriteRule ^/[^/]blog*$ /home/chrism/blog.wsgi [PT]
    </VirtualHost>

Obviously it would mean some repetition in wsgi files if you needed to repeat parts of a pipeline for each URL mapping. But it does mean we wouldn't need to invent more software.

Of course, I still think it's something that can be added *after* having a basic deployment spec.

I feel a very strong need that this be resolved before settling on anything deployment related. Not necessarily as a standard, but possibly as a set of practices. Even a realistic and concrete use case might be enough.
I *think* more complicated use cases may revolve around attempting to use middleware as services that dynamize the pipeline instead of as oblivious things. I don't think there's anything really wrong with that but I also don't think it can ever be specified with as much clarity as what we've already got because IMHO it's a programming task. I'm repeating myself, I'm sure, but I'm more apt to put a service manager piece of middleware in the pipeline (or maybe just implement it as a library) which would allow my endpoint app to use it to do sessioning and auth and whatnot. I realize that is essentially building a framework (which is reviled lately) but since the endpoint app needs to collaborate anyway, I don't see a better way to do it except to rely completely on convention for service lookup (which is what you seem to be struggling with in the later bits of your post). - C
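For comparison with the web-server approach, the site map idea from this message can also be sketched in Python. The class below is a hypothetical illustration, not an existing Paste or wsgiref API: it picks an application by longest matching URL prefix and shifts that prefix from PATH_INFO to SCRIPT_NAME, which is one possible answer to how a rule affects those two keys.

```python
class PrefixDispatcher:
    """Hypothetical site-map middleware: the longest matching prefix wins."""

    def __init__(self, mapping, default):
        # Try longer (more specific) prefixes before shorter ones.
        self.mapping = sorted(mapping.items(),
                              key=lambda item: len(item[0]),
                              reverse=True)
        self.default = default

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '')
        for prefix, app in self.mapping:
            if path == prefix or path.startswith(prefix + '/'):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME.
                environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + prefix
                environ['PATH_INFO'] = path[len(prefix):]
                return app(environ, start_response)
        return self.default(environ, start_response)


def make_echo(name):
    # Stand-in for a real application or pipeline mounted at a prefix.
    def app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        text = '%s %s %s' % (name, environ['SCRIPT_NAME'], environ['PATH_INFO'])
        return [text.encode('utf-8')]
    return app


def not_found(environ, start_response):
    start_response('404 Not Found', [('Content-Type', 'text/plain')])
    return [b'not found']


site = PrefixDispatcher({'/viewcvs': make_echo('viewcvs'),
                         '/blog': make_echo('blog')},
                        not_found)
```

Each mounted value could itself be a full pipeline, so middleware shared by everything under a prefix would simply wrap the app registered for that prefix.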
Re: [Web-SIG] Standardized configuration
On Fri, 2005-07-22 at 17:26 -0500, Ian Bicking wrote:

To do this, we use a ConfigParser-format config file named 'myapplication.conf' that looks like this::

    [application:sample1]
    config = sample1.conf
    factory = wsgiconfig.tests.sample_components.factory1

    [application:sample2]
    config = sample2.conf
    factory = wsgiconfig.tests.sample_components.factory2

    [pipeline]
    apps = sample1 sample2

I think it's confusing to call both these applications. I think middleware or filter would be better. I think people understand filter far better, so I'm inclined to use that. So...

The reason I called them applications instead of filters is because all of them implement the WSGI application API (they all implement a callable that accepts two parameters, environ and start_response). Some happen to be gateways/filters/middleware/whatever but at least one is just an application and does no delegation. In my example above, sample2 is not a filter, it is the end-point application. sample1 is a filter, but it's of course also an application too. Would you maybe rather make it more explicit that some apps are also gateways, e.g.:

    [application:bleeb]
    config = bleeb.conf
    factory = bleeb.factory

    [filter:blaz]
    config = blaz.conf
    factory = blaz.factory

? I don't know that there's any way we could make use of the distinction between the two types in the configurator other than disallowing people to place an application before a filter in a pipeline through validation. Is there something else you had in mind?

    [application:sample2]
    # What is this relative to? I hate both absolute paths and
    # paths relative to pwd equally...
    config = sample1.conf
    factory = wsgiconfig...

This was from a doctest I wrote so I could rely on relative paths, sorry. You're right. U...
we could probably use the environment as defaults to ConfigParser interpolation and set whatever we need before the configurator is run:

    $ export APP_ROOT=/home/chrism/myapplication
    $ ./wsgi-configurator.py myapplication.conf

And in myapplication.conf:

    [application:sample1]
    config = %(APP_ROOT)s/sample1.conf
    factory = myapp.sample1.factory

That would probably be the least-effort and most flexible thing to do and doesn't mandate any particular directory structure. Of course, we could provide a convention for a recommended directory structure, but this gives us an out from being painted in to that in specific cases.

    [pipeline]
    # The app is unique and special...?
    app = sample2
    filters = sample1

Well, that's just a first refactoring; I'm having other inclinations...

I'm not sure whether this is just a stylistic thing or if there's a reason you want to treat the endpoint app specially. By definition, in my implementation, the endpoint app is just the last app mentioned in the pipeline.

Potential points of contention

- The WSGI configurator assumes that you are willing to write WSGI component factories which accept a filename as a config file. This factory returns *another* factory (typically a class) that accepts the next application in the pipeline chain and returns a WSGI application instance. This pattern is necessary to support argument currying across a declaratively configured pipeline, because the WSGI spec doesn't allow for it. This is more contract than currently exists in the WSGI specification but it would be trivial to change existing WSGI components to adapt to this pattern. Or we could adopt a pattern/convention that removed one of the factories, passing both the next application and the config file into a single factory function. Whatever. In any case, in order to do declarative pipeline configuration, some convention will need to be adopted.
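The two conventions described in the point above can be put side by side. This is a sketch under stated assumptions: `ProfileFilter`, the `example.limit` environ key, and both factory names are hypothetical, and a plain dict stands in for the config file the configurator would pass.

```python
class ProfileFilter:
    """Hypothetical middleware that records one setting in the environ."""

    def __init__(self, app, limit):
        self.app = app
        self.limit = limit

    def __call__(self, environ, start_response):
        environ['example.limit'] = self.limit
        return self.app(environ, start_response)


def two_stage_factory(config):
    # Stage one: capture configuration (a dict stands in for a config file).
    limit = int(config.get('limit', 40))

    def wrap(next_app):
        # Stage two: bind the next application in the pipeline.
        return ProfileFilter(next_app, limit)

    return wrap


def single_factory(next_app, config):
    # Alternative convention: one call takes both the app and the config.
    return ProfileFilter(next_app, int(config.get('limit', 40)))


def endpoint(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [str(environ['example.limit']).encode('ascii')]


curried = two_stage_factory({'limit': '50'})(endpoint)
direct = single_factory(endpoint, {'limit': '50'})
```

Both produce the same wrapped application; the two-stage form only matters when configuration is read before the next application in the chain is known.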
The convention I'm advocating above seems to already have been adopted for the current crop of middleware components (using a factory which accepts the application as the first argument).

I hate the proliferation of configuration files this implies. I consider the filters an implementation detail; if they each have partitioned configuration then they become a highly exposed piece of the architecture. It's also a lot of management overhead. Typical middleware takes 0-5 configuration parameters. For instance, paste.profilemiddleware is perfectly usable with no configuration at all, and only has two parameters.

True. The config file param should be optional. Apps might use the environment to configure themselves.

But this is reasonably easy to resolve -- there's a perfectly good configuration section sitting there, waiting to be used:

    [filter:profile]
    factory = paste.profilemiddleware.ProfileMiddleware
    # Show top 50 functions:
    limit = 50

This in no way precludes 'config', which is just a special case of this general configuration. The only real problem is a possible conflict if we
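Reading a section's own options as configuration, combined with the APP_ROOT interpolation suggested earlier in this message, might look like the sketch below. It uses Python 3's `configparser`; the `log` option and the `read_filter_options` helper are assumptions made for illustration. Only APP_ROOT is passed as a default rather than the whole environment, since arbitrary environment values containing '%' would be rejected by the interpolation syntax.

```python
import configparser

CONFIG_TEXT = """
[filter:profile]
factory = paste.profilemiddleware.ProfileMiddleware
log = %(APP_ROOT)s/profile.log
limit = 50
"""


def read_filter_options(text, section, app_root):
    """Return (factory_name, options) for one section of the config text."""
    # Seed interpolation with a single default taken from the caller.
    parser = configparser.ConfigParser(defaults={'APP_ROOT': app_root})
    parser.read_string(text)
    # items() returns interpolated values and includes the defaults.
    options = dict(parser.items(section))
    # Option names are lowercased by the parser, so the default is 'app_root'.
    options.pop('app_root', None)
    factory_name = options.pop('factory')
    return factory_name, options
```

Everything left in `options` could then be handed to the factory as keyword arguments. All values arrive as strings, so the factory would still need to convert `limit` itself.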
Re: [Web-SIG] Standardized configuration
I've had a stab at creating a simple WSGI deployment implementation. I use the term WSGI component in here as shorthand to indicate all types of WSGI implementations (server, application, gateway).

The primary deployment concern is to create a way to specify the configuration of an instance of a WSGI component, preferably within a declarative configuration file. A secondary deployment concern is to create a way to wire up components together into a specific deployable pipeline.

A strawman implementation solves both issues via the configurator, which would be presumed to live in wsgiref. Currently it lives in a package named wsgiconfig on my laptop. This module follows.

    """ Configurator for establishing a WSGI pipeline """

    from ConfigParser import ConfigParser
    import types

    def configure(path):
        # Accept either a filename or an open file-like object.
        config = ConfigParser()
        if isinstance(path, types.StringTypes):
            config.readfp(open(path))
        else:
            config.readfp(path)

        # Only 'application:<name>' sections and 'pipeline' are allowed.
        appsections = []
        for name in config.sections():
            if name.startswith('application:'):
                appsections.append(name)
            elif name == 'pipeline':
                pass
            else:
                raise ValueError('%s is not a valid section name' % name)

        app_defs = {}
        for appsection in appsections:
            app_config_file = config.get(appsection, 'config')
            app_factory_name = config.get(appsection, 'factory')
            app_name = appsection.split('application:')[1]
            if app_config_file is None:
                raise ValueError('application section %s requires a config '
                                 'option' % appsection)
            if app_factory_name is None:
                raise ValueError('application %s requires a factory'
                                 ' option' % app_name)
            app_defs[app_name] = {'config':app_config_file,
                                  'factory':app_factory_name}

        if not config.has_section('pipeline'):
            raise ValueError('must have a pipeline section in config')

        pipeline_str = config.get('pipeline', 'apps')
        if pipeline_str is None:
            raise ValueError('must have an apps definition in the '
                             'pipeline section')

        pipeline_def = pipeline_str.split()

        # Build the chain from the end point backwards.
        next = None

        while pipeline_def:
            app_name = pipeline_def.pop()
            app_def = app_defs.get(app_name)
            if app_def is None:
                raise ValueError('appname %s is defined in pipeline '
                                 '%s but no application is defined '
                                 'with that name' % (app_name, pipeline_str))
            factory_name = app_def['factory']
            factory = import_by_name(factory_name)
            config_file = app_def['config']
            app_factory = factory(config_file)
            app = app_factory(next)
            next = app

        if not next:
            raise ValueError('no apps defined in pipeline')
        return next

    def import_by_name(name):
        # Resolve a dotted name to a module or an attribute of a module.
        if not "." in name:
            raise ValueError("unloadable name: " + repr(name))
        components = name.split('.')
        start = components[0]
        g = globals()
        package = __import__(start, g, g)
        modulenames = [start]
        for component in components[1:]:
            modulenames.append(component)
            try:
                package = getattr(package, component)
            except AttributeError:
                n = '.'.join(modulenames)
                package = __import__(n, g, g, component)
        return package

We configure a pipeline based on a config file, which creates and chains two sample WSGI applications together. To do this, we use a ConfigParser-format config file named 'myapplication.conf' that looks like this::

    [application:sample1]
    config = sample1.conf
    factory = wsgiconfig.tests.sample_components.factory1

    [application:sample2]
    config = sample2.conf
    factory = wsgiconfig.tests.sample_components.factory2

    [pipeline]
    apps = sample1 sample2

The configurator exposes a function, configure, that accepts a single argument.

    from wsgiconfig.configurator import configure
    appchain = configure('myapplication.conf')

The sample_components module referred to in the 'myapplication.conf' file application definitions might look like this::

    class sample1:
        """ middleware """
        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            environ['sample1'] = True
            return self.app(environ, start_response)

    class sample2:
        """ end-point app """
        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            environ['sample2'] = True
            return ['return value
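For readers on Python 3, the two core steps of the configurator (resolving a dotted factory name, then chaining factories from the end point backwards) can be sketched more compactly with `importlib`. This is an illustrative rewrite rather than the posted module: `build_pipeline` is a hypothetical helper, and this `import_by_name` only handles names of the form `module.attribute`, not a bare submodule path.

```python
import importlib


def import_by_name(name):
    """Resolve a dotted name such as 'package.module.attr' to an object."""
    if '.' not in name:
        raise ValueError('unloadable name: %r' % (name,))
    module_name, attr = name.rsplit('.', 1)
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def build_pipeline(factories):
    """Chain factories right to left; the last factory builds the end point.

    Each factory is called with the application that follows it in the
    pipeline (None for the end point) and returns a WSGI callable.
    """
    app = None
    for factory in reversed(factories):
        app = factory(app)
    if app is None:
        raise ValueError('no apps defined in pipeline')
    return app
```

A configurator would presumably map each name in the `apps` option to a factory with `import_by_name` and pass the resulting list to `build_pipeline`.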
Re: [Web-SIG] Standardized configuration
On Mon, 2005-07-18 at 22:49 -0500, Ian Bicking wrote:

In addition to the examples I gave in response to Graham, I wrote a document on this a while ago: http://pythonpaste.org/docs/url-parsing-with-wsgi.html The hard part about this is configuration; it's easy to configure a non-branching chain of middleware. Once it branches the configuration becomes hard (like programming-hard; which isn't *hard*, but it quickly stops feeling like configuration).

Yep. I think I'm getting it. For example, I see that Paste's URLParser seems to *construct* applications if they don't already exist based on the URL. And I assume that these applications could themselves be middleware. I don't think that is configurable declaratively if you want to decide which app to use based on arbitrary request parameters. But if we already had the config for each app instance that URLParser wanted to consult laying around as files on disk, wouldn't it be just as easy to construct these app objects eagerly at startup time? Then URLParser could choose an already-configured app based on some sort of configuration file in the URLParser component itself. The apps themselves may be pipelines, too, I realize that, but that is still configurable without coding.

Maybe there'd be some concern about needing to stop the process in order to add new applications. That's a use case I hadn't really considered. I suspect this could be done with a signal handler, though, which could tell the URLParser to reload its config file instead of potentially locating and creating a new application within every request. This would make URLParser a kind of decision middleware, but it would choose from a static set of existing applications (or pipelines) for the lifetime of the process as opposed to constructing them lazily.

OTOH, I'm not sure that I want my framework to find an app for me.
I'd like to be able to define pipelines that include my app, but I'd typically just want to statically declare it as the end point of a pipeline composed of service middleware. I should look at Paste a little more to see if it has the same philosophy or if I'm misunderstanding you.

Mostly I wanted to avoid lots of magical incantations for the simple case. If you are used to Webware, well it has a very straight-forward way of finding your application -- you give it a directory name. If Quixote or CherryPy, you give it a root object. Maybe Zope would take a ZEO connection string, and so on.

I think I understand now. In general, I think I'd rather create instance locations of WSGI applications (which would essentially consist of a config file on disk plus any state info required by the app), configure and construct Python objects out of those instances eagerly at startup time and, if decisions need to be made about which app to use, just choose between already-constructed apps in decision middleware that has its own declarative configuration. This is mostly because I want the configuration info to live within the application/middleware instance and have some other starter import those configurations from application/middleware instance locations on the filesystem. The starter would construct required instances as Python objects, and chain them together arbitrarily based on some other pipeline configuration file that lives with the starter. The first part of that (construct required instances) is described in a post I made to this list yesterday.

This is probably because I'd like there to be one well-understood way to declaratively configure pipelines as opposed to each piece of middleware potentially needing to manage app construction and having its own configuration to do so. I don't know if this is reasonable for simpler requirements. This is more of a formal deployment spec idea and of course is likely flawed in some subtle way I don't understand yet.
I'm pretty sure you're not advocating it, but in case you are, I'm not sure it adds as much value as it removes to be able to have a dynamic middleware chain whereby new middleware elements can be added on the fly to a pipeline after a request has begun. That is *very* late binding to me and it's impossible to configure declaratively. I'm comfortable with a little of both. I don't even know *how* I'd stop dynamic middleware. For instance, one of the methods I added to Wareweb recently allows any servlet to forward to any WSGI application; but from the outside the servlet looks like a normal WSGI application just like before. It's obviously fine if applications themselves want to do this. I'm not sure that it would be possible to create a deployment spec that canonized *how* to do it because as you mentioned it's not really a configuration task, it's a programming task. I agree! I'm a bit confused because one of the canonical examples of how WSGI middleware is useful seems to be the example of implementing a framework-agnostic
Re: [Web-SIG] Standardized configuration
On Sun, 2005-07-17 at 03:16 -0500, Ian Bicking wrote:

This is what Paste does in configuration, like:

    middleware.extend([
        SessionMiddleware, IdentificationMiddleware,
        AuthenticationMiddleware, ChallengeMiddleware])

This kind of middleware takes a single argument, which is the application it will wrap. In practice, this means all the other parameters go into lazily-read configuration.

I'm finding it hard to imagine a reason to have another kind of middleware. Well, actually that's not true. In noodling about this, I did think it would be kind of neat in a twisted way to have decision middleware like:

    class DecisionMiddleware:
        def __init__(self, apps):
            self.apps = apps

        def __call__(self, environ, start_response):
            app = self.choose(environ)
            for chunk in app(environ, start_response):
                yield chunk

        def choose(self, environ):
            # some_decision_function is left to the imagination
            return some_decision_function(self.apps, environ)

I can imagine using this pattern as a decision point for a WSGI pipeline serving multiple application end-points (perhaps based on URL matching of the PATH_INFO in environ). But by and large, most middleware components seem to be just wrappers for the next application in the chain. There seem to be two types of middleware that take a single application object as a parameter to their constructor. There is decorator middleware where you want to add something to the environment for an application to find later and action middleware that does some rewriting of the body or the response headers before the response is sent back to the client. Some of this kind of middleware does both.

You can also define a framework (a plugin to Paste), which in addition to finding an app can also add middleware; basically embodying all the middleware that is typical for a framework.

This appears to be what I'm trying to do too, which is why I'm intrigued by Paste. OTOH, I'm not sure that I want my framework to find an app for me.
I'd like to be able to define pipelines that include my app, but I'd typically just want to statically declare it as the end point of a pipeline composed of service middleware. I should look at Paste a little more to see if it has the same philosophy or if I'm misunderstanding you. Paste is really a deployment configuration. Well, that as well as stuff to deploy. And two frameworks. And whatever else I feel a need or desire to throw in there. Yeah. FWIW, as someone who has recently taken a brief look at Paste, I think it would be helpful (at least for newbies) to partition out the bits of Paste which are meant to be deployment configuration from the bits that are meant to be deployed. Zope 2 fell into the same trap early on, and never recovered. For example, ZPublisher (nee Bobo) was always meant to be able to be useful outside of Zope, but in practice it never happened because nobody could figure out how to disentangle it from its ever-increasing dependencies on other software only found in a Zope checkout. In the end, nobody even remembered what its dependencies were *supposed* to be. If you ask ten people, you'd get ten different answers. I also think that the rigor of separating out different components helps to make the software stronger and more easily understood in bite-sized pieces. Unfortunately, separating them makes configuration tough, but I think that's what we're trying to find an answer about how to do the right way here. Note also that parts of the pipeline are very much late bound. For instance, the way I implemented Webware (and Wareweb) each servlet is a WSGI application. So while there's one URLParser application, the application that actually handles the request differs per request. If you start hanging more complete applications (that might have their own middleware) at different URLs, then this happens more generally. 
Well, if you put the decider in middleware itself, all of the middleware components in each pipeline could still be at least constructed early. I'm pretty sure this doesn't really strictly qualify as early binding but it's not terribly dynamic either. It also makes configuration pretty straightforward. At least I can imagine a declarative syntax for configuring pipelines this way. I'm pretty sure you're not advocating it, but in case you are, I'm not sure it adds as much value as it removes to be able to have a dynamic middleware chain whereby new middleware elements can be added on the fly to a pipeline after a request has begun. That is *very* late binding to me and it's impossible to configure declaratively. But some elements of the pipeline at this level of factoring do need to have dependencies on availability and pipeline placement of the other elements. In this example, proper operation of the authentication component depends on the availability and pipeline placement of the identification component. Likewise, the identification component may
[Web-SIG] Standardized configuration
I've also been putting a bit of thought into middleware configuration, although maybe in a different direction. I'm not too concerned yet about being able to introspect the configuration of an individual component. Maybe that's because I haven't thought about the problem enough to be concerned about it. In the meantime, though, I *am* concerned about being able to configure a middleware pipeline easily and have it work. I've been attempting to divine a declarative way to configure a pipeline of WSGI middleware components. This is simple enough through code, except that at least in terms of how I'm attempting to factor my middleware, some components in the pipeline may have dependencies on other pipeline components. For example, it would be useful in some circumstances to create separate WSGI components for user identification and user authorization. The process of identification -- obtaining user credentials from a request -- and user authorization -- ensuring that the user is who he says he is by comparing the credentials against a data source -- are really pretty much distinct operations. There might also be a challenge component which forces a login dialog. In practice, I don't know if this is a truly useful separation of concerns that need to be implemented in terms of separate components in the middleware pipeline (I see that paste.login conflates them), it's just an example. But at very least it would keep each component simpler if the concerns were factored out into separate pieces. But in the example I present, the authentication component depends entirely on the result of the identification component. It would be simple enough to glom them together by using a distinct environment key for the identification component results and have the authentication component look for that key later in the middleware result chain, but then it feels like you might as well have written the whole process within one middleware component because the coupling is pretty strong. 
I have a feeling that adapters fit in here somewhere, but I haven't really puzzled that out yet. I'm sure this has been discussed somewhere in the lifetime of WSGI but I can't find much in this list's archives.

Lately I've been thinking about the role of Paste and WSGI and whatnot. Much of what makes a Paste component Pastey is configuration; otherwise the bits are just independent pieces of middleware, WSGI applications, etc. So, potentially if we can agree on configuration, we can start using each other's middleware more usefully.

I think we should avoid questions of configuration file syntax for now. Let's instead simply consider configuration consumers. A standard would consist of:

* A WSGI environment key (e.g., 'webapp01.config')
* A standard for what goes in that key (e.g., a dictionary object)
* A reference implementation of the middleware
* Maybe a non-WSGI-environment way to access the configuration (like paste.CONFIG, which is a global object that dispatches to per-request configuration objects) -- in practice this is really really useful, as you don't have to pass the configuration object around.

There are some other things we have to consider, as configuration syntaxes do affect the configuration objects significantly. So, the standard for what goes in the key has to take into consideration some possible configuration syntaxes.

The obvious starting place is a dictionary-like object. I would suggest that the keys should be valid Python identifiers. Not all syntaxes require this, but some do. This restriction simply means that configuration consumers should try to consume Python identifiers.

There's also a question about name conflicts (two consumers that are looking for the same key), and whether nested configuration should be preferred, and in what style.

Note that the standard we decide on here doesn't have to be the only way the object can be accessed.
For instance, you could make your configuration available through 'myframework.config', and create a compliant wrapper that lives in 'webapp01.config', perhaps even doing different kinds of mapping to fix convention differences.

There's also a question about what types of objects we can expect in the configuration. Some input styles (e.g., INI and command line) only produce strings. I think consumers should treat strings (or maybe a special string subclass) specially, performing conversions as necessary (e.g., 'yes' -> True).

Thoughts?
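A configuration consumer along the lines proposed above could be sketched as follows. The key name 'webapp01.config' comes from the message; the `ConfigMiddleware` class, the `as_bool` helper, the `debug` option, and the set of accepted boolean spellings are all assumptions made for illustration, not a reference implementation.

```python
CONFIG_KEY = 'webapp01.config'   # example key name taken from the message

# Assumed spellings; a real standard would have to pin these down.
_TRUE = {'yes', 'true', 'on', '1'}
_FALSE = {'no', 'false', 'off', '0'}


def as_bool(value):
    """Convert a string from an INI file or command line to a bool."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    raise ValueError('not a boolean: %r' % (value,))


class ConfigMiddleware:
    """Hypothetical middleware that publishes a dict under CONFIG_KEY."""

    def __init__(self, app, config):
        self.app = app
        self.config = dict(config)

    def __call__(self, environ, start_response):
        environ[CONFIG_KEY] = self.config
        return self.app(environ, start_response)


def debug_app(environ, start_response):
    # Consumer side: look up an identifier-style key and convert the string.
    config = environ.get(CONFIG_KEY, {})
    debug = as_bool(config.get('debug', 'no'))
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'debug on' if debug else b'debug off']


app = ConfigMiddleware(debug_app, {'debug': 'yes'})
```

The consumer does the conversion itself, so the same application works whether the value came from an INI file as 'yes' or from Python code as True.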
Re: [Web-SIG] Standardized configuration
On Sat, 2005-07-16 at 23:29 -0500, Ian Bicking wrote:

There's nothing in WSGI to facilitate introspection. Sometimes that seems annoying, though I suspect lots of headaches are removed because of it, and I haven't found it to be a stopper yet. The issue I'm interested in is just how to deliver configuration to middleware.

Whew, I hoped you'd respond. ;-) It appears that I haven't gotten as far as to want introspection into the implementation or configuration of a middleware component. Instead, I want the ability to declaratively construct a pipeline out of largely opaque and potentially interdependent (but loosely coupled) WSGI middleware components, which is another problem entirely. It seemed cogent, so I just somewhat belligerently coopted this thread, sorry!

Because middleware can't be introspected (generally), this makes things like configuration schemas very hard to implement. It all needs to be late-bound.

The pipeline itself isn't really late bound. For instance, if I was to create a WSGI middleware pipeline something like this:

    server <--> session <--> identification <--> authentication <-->
    <--> challenge <--> application

... session, identification, authentication, and challenge are middleware components (you'll need to imagine their implementations). And within a module that started a server, you might end up doing something like:

    def configure_pipeline(app):
        return SessionMiddleware(
                 IdentificationMiddleware(
                   AuthenticationMiddleware(
                     ChallengeMiddleware(app))))

    if __name__ == '__main__':
        app = Application()
        pipeline = configure_pipeline(app)
        server = Server(pipeline)
        server.serve()

The pipeline is static. When a request comes in, the pipeline itself is already constructed. I don't really want a way to prevent improper pipeline construction at startup time (right now anyway), because failures due to missing dependencies will be fairly obvious.
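To make the dependency between two of those imagined components concrete, here is a small sketch. Everything in it is hypothetical: the 'example.*' environ keys, the in-memory user table, and the way credentials are read (a real identification component would parse a header or a form). It shows the identification component leaving its result under an agreed environ key, and the authentication component failing loudly when that key is absent.

```python
IDENT_KEY = 'example.identity'   # hypothetical environ key shared by both


class IdentificationMiddleware:
    """Extracts credentials from the request and records them."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Assumption: credentials arrive in these made-up environ keys.
        environ[IDENT_KEY] = (environ.get('example.login'),
                              environ.get('example.password'))
        return self.app(environ, start_response)


class AuthenticationMiddleware:
    """Checks recorded credentials against a data source (a dict here)."""

    def __init__(self, app, users):
        self.app = app
        self.users = users

    def __call__(self, environ, start_response):
        # A KeyError here means identification is missing or misplaced.
        login, password = environ[IDENT_KEY]
        if login not in self.users or self.users[login] != password:
            start_response('401 Unauthorized',
                           [('Content-Type', 'text/plain')])
            return [b'unauthorized']
        environ['REMOTE_USER'] = login
        return self.app(environ, start_response)


def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [('hello ' + environ['REMOTE_USER']).encode('utf-8')]


pipeline = IdentificationMiddleware(
    AuthenticationMiddleware(application, {'chrism': 'secret'}))
```

The only coupling between the two components is the key name, which is also the thing a declarative configuration could not check without some extra convention.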
But some elements of the pipeline at this level of factoring do need to have dependencies on availability and pipeline placement of the other elements. In this example, proper operation of the authentication component depends on the availability and pipeline placement of the identification component. Likewise, the identification component may depend on values that need to be retrieved from the session component. I've just seen Phillip's post where he implies that this kind of fine-grained component factoring wasn't really the initial purpose of WSGI middleware. That's kind of a bummer. ;-) Factoring middleware components in this way seems to provide clear demarcation points for reuse and maintenance. For example, I imagined a declarative security module that might be factored as a piece of middleware here: http://www.plope.com/Members/chrism/decsec_proposal . Of course, this sort of thing doesn't *need* to be middleware. But making it middleware feels very right to me in terms of being able to deglom nice features inspired by Zope and other frameworks into pieces that are easy to recombine as necessary. Implementations as WSGI middleware seems a nice way to move these kinds of features out of our respective applications and into more application-agnostic pieces that are very loosely coupled, but perhaps I'm taking it too far. For example, it would be useful in some circumstances to create separate WSGI components for user identification and user authorization. The process of identification -- obtaining user credentials from a request -- and user authorization -- ensuring that the user is who he says he is by comparing the credentials against a data source -- are really pretty much distinct operations. There might also be a challenge component which forces a login dialog. I've always thought that a 401 response is a good way of indicating that, but not everyone agrees. 
(The idea being that the middleware catches the 401 and possibly translates it into a redirect or something.)

Yep. That'd be a fine signaling mechanism.

In practice, I don't know if this is a truly useful separation of concerns that need to be implemented in terms of separate components in the middleware pipeline (I see that paste.login conflates them), it's just an example.

Do you mean identification and authentication (you mention authorization above)?

Aggh. Yes, I meant to write authentication, sorry.

I think authorization is different, and is conflated in paste.login, but I don't have many use cases where it's a useful distinction. I guess there's a number of ways of getting a username and password; and to some degree the authenticator object works at that level of abstraction. And there's a couple other ways of authenticating a user as well (public keys, IP address, etc). I've generally used a user manager object for this kind of abstraction, with subclassing for different kinds of generality (e.g., the basic abstract