from:"Chris McDonough"

Re: [Web-SIG] A 'shutdown' function in WSGI

2012-02-22 Thread Chris McDonough

On Wed, 2012-02-22 at 09:06 +1100, Graham Dumpleton wrote:
 If you want to be able to control a thread like that from an atexit
 callback, you need to create the thread as daemonised. Ie.
 setDaemon(True) call on thread.
 
 By default a thread inherits the daemon flag from its parent. In
 command-line Python, a thread created from the main thread will not
 be daemonised, which is why the thread is waited upon at shutdown
 before the atexit callbacks are called.
 
 If you ran the same code in mod_wsgi, my memory is that the thread
 will actually inherit as being daemonised because the request handler
 threads in mod_wsgi, from which the import is triggered, are
 notionally daemonised.
 
 Thus the code should work in mod_wsgi. Even so, to be portable, if
 wanting to manipulate thread from atexit, make it daemonised.
 
 Example of background threads in mod_wsgi at:
 
 http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode#Monitoring_For_Code_Changes
 
 shows use of setDaemon().
 
 Graham
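
A minimal sketch of the pattern Graham describes, assuming a hypothetical background worker. The names stop_event, worker and shutdown are illustrative only and are not taken from mod_wsgi:

```python
import atexit
import threading

# Hypothetical signal used to ask the background thread to stop.
stop_event = threading.Event()

def worker():
    # Wake up every 0.1s until asked to stop; real work would go here.
    while not stop_event.wait(0.1):
        pass

thread = threading.Thread(target=worker)
# Mark the thread as a daemon (same effect as setDaemon(True)) so the
# interpreter does not block on it before the atexit callbacks run.
thread.daemon = True
thread.start()

def shutdown():
    # Called from atexit: signal the thread, then wait briefly for it.
    stop_event.set()
    thread.join(timeout=2.0)

atexit.register(shutdown)
```

Without the daemon flag, a command-line interpreter would wait for the thread before running the atexit callbacks, so the callback could never be the thing that stops it.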

I've read all the messages in this thread and the traffic on the bug
entry at http://bugs.python.org/issue14073 but I'm still not sure what
to tell people who want to invoke code at shutdown.

Do we tell them to use atexit?  If so, are we saying that atexit is
sufficient for all user-defined shutdown code that needs to run save for
code that needs to stop threads?

Is it sufficient to define shutdown as when the process associated
with the application exits?  It still seems to not necessarily be
directly correlated.

- C


___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com

Re: [Web-SIG] A 'shutdown' function in WSGI

2012-02-20 Thread Chris McDonough

On Mon, 2012-02-20 at 17:39 -0500, PJ Eby wrote:
 The standard way to do this would be to define an optional server
 extension API supplied in the environ; for example, a
 'x-wsgiorg.register_shutdown' function.

Unlikely, AFAICT, as shutdown may happen when no request is active.
Even if this somehow happened to not be the case, asking the application
to put it in the environ is not useful, as the environ can't really be
relied on to retain values up the call stack.

- C


   The wsgi.org wiki used to be the place to propose these sorts of
 things for standardization, but it appears to no longer be a wiki, so
 the mailing list is probably a good place to discuss such a proposal.
 
 On Mon, Feb 20, 2012 at 2:30 PM, Tarek Ziadé ziade.ta...@gmail.com
 wrote:
 oops my examples were broken, should be:
 
 def hello_world_app(environ, start_response):
     status = '200 OK'  # HTTP Status
     headers = [('Content-type', 'text/plain')]
     start_response(status, headers)
     return ["Hello World"]
 
 def shutdown():  # or maybe something else as an argument, I don't know
     do_some_cleanup()
 
 
 
 and:
 
 $ gunicorn myapp:hello_world_app myapp:shutdown
 
 
 
 Cheers
 Tarek
 

Re: [Web-SIG] A 'shutdown' function in WSGI

2012-02-20 Thread Chris McDonough

On Mon, 2012-02-20 at 20:54 -0500, PJ Eby wrote:
 2012/2/20 Chris McDonough chr...@plope.com
 On Mon, 2012-02-20 at 17:39 -0500, PJ Eby wrote:
  The standard way to do this would be to define an optional
 server
  extension API supplied in the environ; for example, a
  'x-wsgiorg.register_shutdown' function.
 
 
 Unlikely, AFAICT, as shutdown may happen when no request is
 active.
 Even if this somehow happened to not be the case, asking the
 application
 to put it in the environ is not useful, as the environ can't
 really be
 relied on to retain values up the call stack.
 
 
 Optional server extension APIs are things that the server puts in
 the environ, not things the app puts there.  That's why it's
 'register_shutdown', e.g.
 environ['x-wsgiorg.register_shutdown'](shutdown_function).  

I get it now, but I still don't think it's the right thing.  Servers
can shut down without ever having served a request.
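
To make the objection concrete, here is a rough sketch of the extension pattern under discussion. ToyServer is a hypothetical stand-in and not a real WSGI server; only the 'x-wsgiorg.register_shutdown' key comes from the proposal quoted above:

```python
class ToyServer(object):
    """Hypothetical server that offers a register_shutdown extension."""

    def __init__(self, app):
        self.app = app
        self.shutdown_callbacks = []

    def handle_request(self):
        # The server, not the application, puts the extension in the environ.
        environ = {
            'REQUEST_METHOD': 'GET',
            'PATH_INFO': '/',
            'x-wsgiorg.register_shutdown': self.shutdown_callbacks.append,
        }
        statuses = []
        def start_response(status, headers):
            statuses.append(status)
        body = list(self.app(environ, start_response))
        return statuses[0], body

    def shutdown(self):
        for callback in self.shutdown_callbacks:
            callback()

cleaned_up = []   # records that cleanup ran
registered = []   # guards against registering more than once

def cleanup():
    cleaned_up.append(True)

def app(environ, start_response):
    register = environ.get('x-wsgiorg.register_shutdown')
    if register is not None and not registered:
        register(cleanup)
        registered.append(True)
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello World']
```

If shutdown() is called on a ToyServer that never handled a request, cleanup never runs, because the application has had no chance to see the environ and register it.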

- C




[Web-SIG] PEP3333 and PATH_INFO

2012-01-03 Thread Chris McDonough

Perennial topic, it seems, from the archives.

As far as I can tell from PEP 3333, every WSGI application that wants to
run on both Python 2 and Python 3 and which uses PATH_INFO will need to
define a helper function something like this:


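import sys

def decode_path_info(environ, encoding='utf-8'):
    PY3 = sys.version_info[0] == 3
    path_info = environ['PATH_INFO']
    if PY3:
        # Native str on Python 3: recover the raw bytes via latin-1 first.
        return path_info.encode('latin-1').decode(encoding)
    else:
        # Byte string on Python 2: decode it directly.
        return path_info.decode(encoding)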


Is there a more elegant way to handle this?
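
As a worked example of why the Python 3 branch needs the latin-1 round trip, here is a small sketch. It assumes a server that, per PEP 3333, hands over the raw request bytes decoded as latin-1; the sample path is made up:

```python
def decode_path_info(environ, encoding='utf-8'):
    # Python 3 branch of the helper above.
    return environ['PATH_INFO'].encode('latin-1').decode(encoding)

# Bytes as they arrive on the wire for the path "/café" in UTF-8.
wire_bytes = u'/caf\u00e9'.encode('utf-8')

# What a PEP 3333 server on Python 3 is assumed to put in the environ:
# the same bytes, decoded as latin-1, so each byte becomes one character.
environ = {'PATH_INFO': wire_bytes.decode('latin-1')}

decoded = decode_path_info(environ)
```

The environ value is a two-character mojibake for the final letter until it is re-encoded as latin-1 and decoded with the real encoding.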

- C



[Web-SIG] wsgi server...

2011-12-26 Thread Chris McDonough

Does anyone know of a pure-Python WSGI server that:

- Is distributed independently from a web framework or larger whole.

- Runs on UNIX and Windows.

- Runs on both Python 2 and Python 3.

- Has good test coverage.

- Is useful in production.

(I sent this already to the Pylons-discuss maillist and got some good
responses, so not ignoring those, just want to ask a wider audience)

Thanks!

- C


Re: [Web-SIG] PEP 444 != WSGI 2.0

2011-01-02 Thread Chris McDonough

On Sun, 2011-01-02 at 09:21 -0800, Guido van Rossum wrote:
 Graham, I hope that you can stop being grumpy about the process that
 is being followed and start using your passion to write up a critique
 of the technical merits of Alice's draft. You don't have to attack the
 whole draft at once -- you can start by picking one or two important
 issues and try to guide a discussion here on web-sig to tease out the
 best solutions.  Please  understand that given the many different ways
 people use and implement WSGI there may be no perfect solution within
 reach -- writing a successful standard is the art of the compromise.
 (If you still think the process going forward should be different,
 please write me off-list with your concerns.)
 
 Everyone else on this list, please make a new year's resolution to
 help the WSGI 2.0 standard become a reality in 2011.

I think Graham mostly has an issue with this thing being called WSGI
2.

FTR, avoiding naming arguments is why I titled the original PEP Web3.
I knew that if I didn't (even though personally I couldn't care less if
it was called Buick or McNugget), people would expend effort arguing
about the name rather than concentrate on the process of creating a new
standard.  They did anyway of course; many people argued publicly
wishing to rename Web3 to WSGI2.  On balance, though, I think giving the
standard a neutral name before it's widely accepted as a WSGI
successor was (and still is) a good idea, if only as a conflict
avoidance strategy. ;-)

That said, I have no opinion on the technical merits of the new PEP 444
draft; I've resigned myself to using derivatives of PEP 3333 forever.
It's good enough.  Most of the really interesting stuff seems to happen
at higher levels anyway, and the benefit of a new standard doesn't
outweigh the angst caused by trying to reach another compromise.  I'd
suggest we just embrace it, adding minor tweaks as necessary, until we
reach some sort of technical impasse it doesn't address.

- C



Re: [Web-SIG] PEP 444

2010-11-21 Thread Chris McDonough

PEP 444 has no champion currently.  Both Armin and I have basically left
it behind.  It would be great if you wanted to be its champion.

- C

On Sun, 2010-11-21 at 03:12 -0800, Alice Bevan-McGregor wrote:
 (A version of this is available at http://web-core.org/2.0/pep-0444/ where
 links are links and code may be easier to read.)
 
 PEP 444 is quite exciting to me.  So much so that I’ve been spending a few 
 days writing a high-performance (C10K, 10Krsec) Py2.6+/3.1+ HTTP/1.1 server 
 which implements much of the proposed standard.  The server is functional 
 (less web3.input at the time of this writing), but differs from PEP 444 in 
 several ways.  It also adds several features I feel should be part of the 
 spec.
 
 Source for the server is available on GitHub:
 
   https://github.com/pulp/marrow.server.http
 
 I have made several notes about the PEP 444 specification during 
 implementation of the above, and concern over some implementation details:
 
 First, async is poorly defined:
 
  If the origin server advertises that it has the web3.async capability, a 
  Web3 application callable used by the server is permitted to return a 
  callable that accepts no arguments. When it does so, this callable is to be 
  called periodically by the origin server until it returns a non-None 
  response, which must be a normal Web3 response tuple.
 
 Polling is not true async.  I believe that it should be up to the server to 
 define how async is utilized, and that the specification should be clarified 
 on this point.  (“Called periodically” is too vague.)  “Callable” should 
 likely be redefined as “generator” (a callable that yields) as most 
 applications require holding on to state and wrapping everything in 
 functools.partial() is somewhat ugly.  Utilizing generators would improve 
 support for existing Python async frameworks, and allow four modes of 
 operation: yield None (no response, keep waiting), yield response_tuple 
 (standard response), return / raise StopIteration (close the async 
 connection) and allow for data to be passed back to the async callable by the 
 higher-level async framework.
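
A rough sketch of the generator-based variant proposed above, for illustration only. The drive() loop and slow_app are hypothetical, and the (status, headers, body) tuple order is an assumption borrowed from the filter example in this message rather than from the PEP:

```python
def slow_app(environ):
    # Hypothetical application: not ready on the first two resumptions.
    for _ in range(2):
        yield None  # no response yet, keep waiting
    yield (b'200 OK', [(b'Content-Type', b'text/plain')], [b'done'])

def silent_app(environ):
    # Finishes without ever yielding a response.
    return
    yield  # unreachable; only makes this function a generator

def drive(app, environ):
    """Toy server loop: resume the generator until it yields a response."""
    for result in app(environ):
        if result is not None:
            return result
        # A real server would return to its event loop here instead of
        # resuming immediately.
    # The generator ended without a response: close the connection.
    return None
```

Because the application is a generator, its local state survives between resumptions, which is the point made above about avoiding functools.partial().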
 
 Second, WSGI middleware, while impressive in capability, are somewhat… 
 heavy-weight.  Heavily nesting function calls is wasteful of CPU and RAM, 
 especially if the middleware decides it can’t operate, for example, GZip 
 compression disabling itself for non-text/ mimetypes.  The majority of WSGI 
 middleware can, and probably should be, implemented as linear ingress or 
 egress filters.  For example, on-disk static file serving could be an ingress 
 filter, and GZip compression an egress filter.  m.s.http supports this 
 filtering and demonstrates one API for such.  Also, I am in the process of 
 writing an example egress CompressionFilter.
 
 An example API and filter use implementation: (paraphrased from 
 marrow.server.http)
 
  # No filters, near 0 overhead.
  for filter_ in ingress_filters:
      # Can mutate the environment.
      result = filter_(env)
  
      # Allow the filter to return a response rather than continuing.
      if result:
          # result is a status, headers, body_iter tuple
          return result[0], result[1], result[2]
  
  status, headers, body = application(env)
  
  for filter_ in egress_filters:
      # Can mutate the environment, status, headers, body, or
      # return completely new status, headers, and body.
      status, headers, body = filter_(env, status, headers, body)
  
  return status, headers, body
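
The fragment above can be wrapped into something runnable along these lines. This is only a sketch: run_filters, block_admin and add_filtered_header are hypothetical names and are not part of marrow.server.http:

```python
def run_filters(env, application, ingress_filters=(), egress_filters=()):
    for filter_ in ingress_filters:
        # An ingress filter may short-circuit with a complete response.
        result = filter_(env)
        if result:
            return result[0], result[1], result[2]

    status, headers, body = application(env)

    for filter_ in egress_filters:
        # Each egress filter sees, and may replace, the response.
        status, headers, body = filter_(env, status, headers, body)

    return status, headers, body

def block_admin(env):
    # Hypothetical ingress filter: refuse anything under /admin.
    if env.get('PATH_INFO', '').startswith('/admin'):
        return '403 Forbidden', [], [b'no']
    return None

def add_filtered_header(env, status, headers, body):
    # Hypothetical egress filter: append one response header.
    return status, headers + [('X-Filtered', 'yes')], body

def application(env):
    return '200 OK', [('Content-Type', 'text/plain')], [b'hello']
```

Note that a short-circuiting ingress filter skips the egress filters entirely in this sketch, as it does in the paraphrased fragment.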
 
 The environment has some minor issues.  I’ll write up my changes in RFC-style:
 
 SERVER_NAME is REQUIRED and MUST contain the DNS name of the server OR 
 virtual server name for the web server if available OR an empty bytestring if 
 DNS resolution is unavailable.  SERVER_ADDR is REQUIRED and MUST contain the 
 web server’s bound IP address.  URL reconstruction SHOULD use HTTP_HOST if 
 available, SERVER_NAME if there is no HTTP_HOST, and fall back on SERVER_ADDR 
 if SERVER_NAME is an empty bytestring.
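
The fallback order described above could be sketched as follows. The function name reconstruct_host is made up for illustration, and the environ values are bytes as in the proposal:

```python
def reconstruct_host(environ):
    # Prefer the Host header sent by the client, when there is one.
    host = environ.get('HTTP_HOST')
    if host:
        return host
    # SERVER_NAME is an empty bytestring when DNS resolution is
    # unavailable, in which case the bound address is used instead.
    return environ['SERVER_NAME'] or environ['SERVER_ADDR']
```

Since both SERVER_NAME and SERVER_ADDR are REQUIRED under this proposal, plain indexing is enough and only HTTP_HOST needs .get().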
 
 CONTENT_LENGTH is REQUIRED and MUST be None if not defined by the client.  
 Testing explicitly for None is more efficient than armoring against missing 
 values; also, explicit is better than implicit.  (Paste’s WSGI1 server 
 defines CONTENT_LENGTH as 0, but this implies the client explicitly declared 
 it as zero, which is not the case.)
 
 FRAGMENT and PARAMETERS are REQUIRED and are parsed out of the URL in the 
 same way as the QUERY_STRING. FRAGMENT is the text after a hash mark (a.k.a. 
 “anchor” to browsers, e.g. /foo#bar). PARAMETERS come before QUERY_STRING, 
 and after PATH_INFO separated by a semicolon, e.g. /foo;bar?baz.  Both values 
 MUST be empty bytestrings if not present in the URL. (Rarely used — I’ve only 
 seen it in Java and ColdFusion applications — but still useful.)
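
For reference, the standard library already splits a URL into these pieces. The sketch below uses Python 3's urllib.parse.urlparse on a made-up URL and works on text rather than the bytestrings the proposal calls for:

```python
from urllib.parse import urlparse

# Hypothetical request URL with parameters, a query string and a fragment.
parts = urlparse('/foo;bar?baz#frag')

path_info = parts.path       # text before the semicolon
parameters = parts.params    # text between ';' and '?'
query_string = parts.query   # text between '?' and '#'
fragment = parts.fragment    # text after '#'
```

Browsers normally do not send the fragment to the server, so in practice FRAGMENT would usually be empty.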
 
 Points of contention:
 
 Changing the namespace seems needless.  Using the wsgi.* namespace with a 
 wsgi.version of (2, 0) will allow applications to easily

Re: [Web-SIG] PEP 444

2010-11-21 Thread Chris McDonough

On Sun, 2010-11-21 at 09:32 -0800, Alice Bevan-McGregor wrote:
  PEP 444 has no champion currently.  Both Armin and I have basically left it 
  behind.  It would be great if you wanted to be its champion.
 
 Done.
 
 As I already have a functional, performant HTTP server[1] and example 
 filter[2] (compression) utilizing a slightly modified version of PEP 444, and 
 hope to be giving a presentation on its design and related utilities[3] early 
 next year, I’d love to have the opportunity to directly shape its future.  My 
 server may be a bit large to be a reference implementation, but until it has 
 its first user I have the benefit of being able to experiment whole-heartedly 
 with features and proposals.
 
 Since Python 3 was released I haven’t heard of much forward-progress in 
 getting web frameworks compatible.  The largest complaint I’ve heard is that 
 there are too few things already ported, which is a chicken and the egg 
 problem.  This is one scenario where re-inventing the wheel may be the only 
 way to see forward movement.  So far, I seem to be buckling down and Getting 
 Things Done™ in this regard.
 
 How would I go about getting access to the PEP in order to fix the issues 
 I’ve been catching up on?  (I’ve been reading through quite a bit of old 
 mailing list traffic these last few hours in-between writing docs and unit 
 tests for the compression egress filter.)

Georg Brandl has thus far been updating the canonical PEP on python.org.
I don't know how you get access to that.  My working copy is at
https://github.com/mcdonc/web3 .

- C



Re: [Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?

2010-10-24 Thread Chris McDonough

On Sun, 2010-10-24 at 17:16 +0200, Georg Brandl wrote:
 Am 24.10.2010 16:40, schrieb Chris McDonough:
  On Sun, 2010-10-24 at 10:17 +0300, Armin Ronacher wrote:
  
  I have to admit that my interest in Python 3 is not very high and I am 
  most likely not the most reliable person when it comes to driving PEP 444 
  :)
  
  We should probably withdraw the PEP, then (unless someone else wants to
  step up and champion it), because neither am I.
 
 Don't give it up yet -- Deferring is probably the better option.

TBH, unless someone has immediate interest in championing it, I'd rather
just withdraw it and let someone else resubmit it (or something like it)
later if they want.  It's just going to cause confusion if it's left in
a zombie state without a champion.

- C



Re: [Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?

2010-10-22 Thread Chris McDonough

For what it's worth, I'm happy with the changes made to WSGI 1 that
produced PEP 3333.

I'm unlikely to champion PEP 444 going forward.  It has already served
its primary duty to me personally (which was to catalyze the
formalization of some specification that is Python 3 inclusive).

However, Armin may feel differently about it, so this doesn't constitute
a withdrawal of PEP 444.  I'm instead just signaling my own personal
attitude: don't really care as much now that there's something out
there.

On Fri, 2010-10-22 at 10:35 +1100, Graham Dumpleton wrote:
 Any one care to comment on my blog post?
 
   http://blog.dscpl.com.au/2010/10/is-pep--final-solution-for-wsgi-on.html
 
 As far as web framework developers commenting, Armin at:
 
   
 http://www.reddit.com/r/Python/comments/du7bf/is_pep__the_final_solution_for_wsgi_on_python/
 
 has said:
 
   Hopefully not. WSGI could do better and there is a proposal for
 that (444).
 
 So, looks he is very cool on the idea.
 
 No other developers of actual web frameworks have commented at all on
 PEP 3333 from what I can see.
 
 Graham

Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-21 Thread Chris McDonough

I have some pending changes to the PEP 444 spec (the working copy is at
http://github.com/mcdonc/web3/blob/master/pep-0444.rst but please don't
consider that canonical in any sense, it will change before an official
republication of the proposal).  The modifications fold in most of what
we've talked about on the list, or at least acknowledge the issues; a
change log is contained near the top.

However, I'm currently trying to work through what to do about
offering up quoted PATH_INFO and SCRIPT_NAME values (quoted in the
sense that, at least on platforms that support it, these would be the
original values before being run through urllib.unquote).

The current published proposal on Python.org indicates that these would
go into web3.path_info and web3.script_name but nobody seems to much
like that because it would make things like path_info_pop hard (the
code would need to keep two data structures in sync, and would need to
be pretty magical in the face of %2F markers).

The pending, unpublished proposal turns SCRIPT_NAME and PATH_INFO into
*quoted* values, and adds a ``web3.path_requoted`` flag for debugging
purposes, which will be True if the SCRIPT_NAME and/or PATH_INFO needed
to be recomposed and requoted (eg. on CGI platforms).  But private
conversations lead me to believe that not many folks will like this
either, because it commandeers CGI names that are well-understood to be
unquoted.
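
To illustrate what quoted values buy, here is a sketch of a path_info_pop that keeps a single pair of quoted variables in sync. It is not the Paste implementation, and it uses text instead of bytes for brevity:

```python
from urllib.parse import unquote

def path_info_pop(environ):
    """Move the first PATH_INFO segment onto SCRIPT_NAME (both quoted)."""
    path = environ['PATH_INFO']
    if not path.startswith('/'):
        return None
    # Split on the first literal slash only; %2F stays inside its segment.
    segment, sep, rest = path[1:].partition('/')
    environ['SCRIPT_NAME'] += '/' + segment
    environ['PATH_INFO'] = sep + rest
    # Unquote only the popped segment, and only for the caller.
    return unquote(segment)

# Hypothetical request whose first segment contains an encoded slash.
environ = {'SCRIPT_NAME': '', 'PATH_INFO': '/a%2Fb/c'}
first = path_info_pop(environ)
```

With already-unquoted values the path would read /a/b/c, and the pop would wrongly return "a" instead of "a/b".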

The only sensible way to break the deadlock seems to be to not use any
CGI names in the specification at all, so as not to break people's
expectations.  I know that when I change it to not use any CGI names, it
will be received poorly, but I can't think of a better idea.

- C

On Wed, 2010-09-15 at 19:03 -0400, Chris McDonough wrote:
 A PEP was submitted and accepted today for a WSGI successor protocol
 named Web3:
 
 http://python.org/dev/peps/pep-0444/
 
 I'd encourage other folks to suggest improvements to that spec or to
 submit a competing spec, so we can get WSGI-on-Python3 settled soon.
 
 - C
 
 

Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-19 Thread Chris McDonough

On Thu, 2010-09-16 at 05:29 +0200, Roberto De Ioris wrote:
 About the *.file_wrapper removal, i suggest
 a PSGI-like approach where 'body' can contain a File Object.
 
 def file_app(environ):
     fd = open('/tmp/pippo.txt', 'r')
     status = b'200 OK'
     headers = [(b'Content-type', b'text/plain')]
     body = fd
     return body, status, headers

I don't see why this couldn't work as long as middleware didn't convert
the body into something not-file-like.  But it is really an
implementation detail of the origin server (it might specialize when the
body is a file), and doesn't really need to be in the spec.

 or
 
 def file_app(environ):
     fd = open('/tmp/pippo.txt', 'r')
     status = b'200 OK'
     headers = [(b'Content-type', b'text/plain')]
     body = [b'Header', fd, b'Footer']
     return body, status, headers

This won't work, as the body is required to be an iterable which
yields bytes, and cannot be an iterable which yields either bytes or
other iterables (it must be a flat sequence).
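
One way to get the same effect with a flat body is to wrap the file in a generator. This is a sketch only; io.BytesIO stands in for the real file so the example is self-contained, and the chunk size is arbitrary:

```python
import io

def file_app(environ):
    # Stand-in for open('/tmp/pippo.txt', 'rb').
    fd = io.BytesIO(b'file contents')

    def body():
        yield b'Header'
        # Read the file in chunks until read() returns an empty bytestring.
        for chunk in iter(lambda: fd.read(8192), b''):
            yield chunk
        yield b'Footer'

    status = b'200 OK'
    headers = [(b'Content-type', b'text/plain')]
    return body(), status, headers
```

Every item the generator yields is a bytes instance, so middleware never has to check for nested iterables.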

- C



Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-19 Thread Chris McDonough

On Sun, 2010-09-19 at 21:52 -0400, Chris McDonough wrote:

 I'm -0 on the server trying to guess the Content-Length header.  It just
 doesn't seem like much of a burden to place on an application and it's
 easier to specify that an application must do this than it is to specify
 how a server should behave in the face of a missing Content-Length.  I
 also believe Graham has argued against making the server guess, I
 presume this causes him some pain somehow (probably underspecification
 in WSGI).

Graham's issues with requiring the server to set Content-Length are
detailed here:

http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html



Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-17 Thread Chris McDonough

On Fri, 2010-09-17 at 19:47 +0300, Ionel Maries Cristian wrote:
 I don't like this proposal at all. Besides having to go through the
 bytes craziness the design is pretty backwards for middleware and
 asynchronous applications.

We've acknowledged in other messages to this thread that the web3.async
red herring is speculative, and Armin has indicated that if he does not
find a champion willing to create a reference implementation for it
today that it will be taken out.  This doesn't help async people, but it
also doesn't harm them (no difference from WSGI really).  Personally, I
hope nobody steps up and we just rip it out. ;-)

I'm not sure why you characterize using bytes as "bytes craziness".  We
have been using strings as byte sequences in WSGI for over five years.
Python itself draws an equivalence between the Python 3 bytes type and
Python 2 str (bytes is aliased to str under Python 2).  I'm not
really sure why we shouldn't take advantage of that equivalence, and why
people are so enamored of treating envvar values, headers, and such as
text other than the brokenness of the Python 3 stdlib urllib stuff.  

IMO, WSGI/Web3 isn't really a programming platform (or at least if it
is, it is destined to be a pretty crappy one), it's just a connection
protocol, so any "it's more typing" or "it's ugly" argument seems pretty
thin to me.  I'd personally rather have it be more general and less easy
to use than potentially broken in some corner case circumstance.

- C



Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-16 Thread Chris McDonough

On Thu, 2010-09-16 at 12:01 -0500, Ian Bicking wrote:
 Well, reiterating some things I've said before:
 
 * This is clearly just WSGI slightly reworked, why the new name?

The PEP says Web3 is clearly a WSGI derivative; it only uses a
different name than WSGI in order to indicate that it is not in any
way backwards compatible.

I don't really care what the name is.  My experience in various
communities suggests that naming the new totally-bw-incompat thing the
same as the old thing weakens both the new thing and the old thing,
but.. whatever.  I just don't care much.

 * Why byte values in the environ?  No one has offered any real reason
 they are better than native strings.  I keep asking people to offer a
 reason, *and no one ever does*.  It's just hyperbole and distraction.
 Frankly I'm feeling annoyed.  So far my experience makes me believe
 using native strings will make it easier to port and support libraries
 across 2 and 3.

I'm sorry you're annoyed.  I chose bytes here mainly out of ignorance
and fear. This is an extremely low level protocol, and I just literally
don't know how we can sanely convert environ values to Unicode without
some loss of control or potential for incorrect decoding without having
server encoding configuration.  You say it's easy and straightforward,
and that's fine.  I just haven't internalized enough specification to
know.

I'd very much encourage folks who want to use native strings to create
another PEP: it's just a lot easier to argue about one thing than it
is to argue endlessly in snippets on blogs and epic maillist threads.  I
couldn't care less if this *particular* PEP is selected, to be honest.
Let's just get it over with in a process where there's at least some
chance of resolution.

 * It makes sense to me that the error stream should accept both bytes
 and unicode, and should do a best effort to handle either.  Getting
 encoding errors or type errors when logging an error is very
 distracting.

Sounds good.

 * Instead of focusing on Response(*response_tuple), I'd rather just
 rely on something like Response.from_wsgi(response_tuple).  Body first
 feels very unnatural.

Others have said same, also good.

 * Regarding long response headers, I think we should ignore the HTTP
 spec.  You can put 4k in a Set-Cookie header, such headers aren't
 easily or safely folded... I think the line length constraint in the
 HTTP spec isn't a constraint we need to pay attention to.

OK.

- C



Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-16 Thread Chris McDonough

On Thu, 2010-09-16 at 14:04 -0400, P.J. Eby wrote:
 At 10:35 AM 9/16/2010 -0700, Guido van Rossum wrote:
 No comments on the rest except to note that at this point it looks
 unlikely that we can make everyone happy (or even get an agreement to
 adopt what would be the long-term technically optimal solution --
 AFAICT there is no agreement on what that solution would be, if one
 weren't to take porting Python 2 code into account). IOW
 something/somebody has gotta give.
 
 Indeed.  This entire discussion has pushed me strongly in favor of 
 doing a super-minimalist update to PEP 333 with the following points:

Right on, write it all down! ;-)

- C



Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-15 Thread Chris McDonough

On Wed, 2010-09-15 at 20:05 -0400, P.J. Eby wrote:
 At 07:03 PM 9/15/2010 -0400, Chris McDonough wrote:
 A PEP was submitted and accepted today for a WSGI successor protocol
 named Web3:
 
 http://python.org/dev/peps/pep-0444/
 
 I'd encourage other folks to suggest improvements to that spec or to
 submit a competing spec, so we can get WSGI-on-Python3 settled soon.
 
 The first thing I notice is that web3.async appears to force all 
 existing middleware to delete it from the environment if it wishes to 
 remain compatible, unless it adapts to support receiving callables itself.

We can ditch everything concerning web3.async as far as I'm concerned.
Ian has told me that this feature won't be liked by the async people
anyway, as it doesn't have a trigger mechanism.

 On further reading I see you have something about middleware 
 disabling itself if it doesn't support async execution, but this 
 doesn't make any sense to me: if it can't support async execution, 
 why wouldn't it just delete web3.async from the environ, forcing its 
 wrapped app to be synchronous instead?
 
 I'm also not a fan of the bytes environ, or the new 
 path_info/script_name variables; note that the spec's sample CGI 
 implementation does not itself provide the new variables, and that 
 middleware must be explicitly written to handle the case where there 
 is duplication.

I'm not concerned about which environment variables have it, but I would
definitely like to be able to get at the original (non-%2F-decoded)
path info somewhere.  I'd be fine if PATH_INFO was just that, and get
rid of web3.path_info.  web3.script_name is probably just a mistake
entirely.

 My main fear with this spec is that people will assume they can just 
 make a few superficial changes to run WSGI code on it, when in fact 
 it is deeply incompatible where middleware is concerned.  In fact, 
 AFAICT, it seems like it will be *harder* to write correct web3 
 middleware than it is to write correct WSGI middleware now.

I'm very willing to drop web3.async entirely.  It seems reasonable to do
so.  I should have done so before I mailed the spec, as I knew it would
be unpopular.

 This seems like a step backward, since the whole idea behind dropping 
 start_response() was to make correct middleware *easier* to write.
 
 Any time a spec makes something optional or allows More Than One Way 
 To Do It, it immediately doubles the minimum code required to 
 implement that portion of the spec in compliant middleware.  This 
 spec has two optionalities: web3.async, and the optional 
 path_info/script_name, so the return handling of every piece of 
 middleware is doubled (or else  environ['web3.async'] = False must 
 be added at the top), and any code that modifies paths must similarly 
 ditch the special variables or do double work to update them.

No worries, let's get rid of both, with the caveat that it's pretty
essential (to me anyway) to be able to get at the non-%2F-decoded path
somewhere.  The most sensible thing to me would be to put it in
PATH_INFO.

As far as bytes vs. strings, whatever, we have to pick one.  Bytes makes
more sense to me.  I'll leave it to the native-string and/or unicode
people to create their own spec.

- C




Re: [Web-SIG] [Python-Dev] Add PEP 444, Python Web3 Interface.

2010-09-15 Thread Chris McDonough

It's, e.g.

b'8080'

.. instead of the integer value 8080.

Apparently the type of this value was not spelled out sufficiently in
the WSGI spec and string values and integer values were used
interchangeably, making it harder to join them with the other values in
the environ (a common thing to want to do).  Bytes instances are
attractive, as the rest of the values are also bytes, so they can be
joined together easily.
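
A small sketch of the joining case on Python 3. The host name and port are example values only:

```python
# Example environ values; under Web3 both are bytes instances.
environ = {'SERVER_NAME': b'example.com', 'SERVER_PORT': b'8080'}

# Because both values are bytes, they join without any conversion.
netloc = b':'.join([environ['SERVER_NAME'], environ['SERVER_PORT']])

# With an integer port the same join fails on Python 3.
try:
    b':'.join([environ['SERVER_NAME'], 8080])
    joined_with_int = True
except TypeError:
    joined_with_int = False
```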

(I also redirected this to web-sig at the request of PJE).

- C

On Wed, 2010-09-15 at 17:02 -0700, John Nagle wrote:
 On 9/15/2010 4:44 PM, python-dev-requ...@python.org wrote:
  ``SERVER_PORT`` must be a bytes instance (not an integer).
 
 What's that supposed to mean?  What goes in the bytes
 instance?  A character string in some format?  A long binary
 number?  If the latter, with which byte ordering?  What
 problem does this solve?
 
   John Nagle
 
 ___
 Python-Dev mailing list
 python-...@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: 
 http://mail.python.org/mailman/options/python-dev/lists%40plope.com
 



Re: [Web-SIG] WSGI for Python 3

2010-07-17 Thread Chris McDonough

On Fri, 2010-07-16 at 23:38 -0500, Ian Bicking wrote:
 On Fri, Jul 16, 2010 at 9:43 PM, Chris McDonough chr...@plope.com
 wrote:
 
  Nah, not nearly that hard:
 
  path_info = urllib.parse.unquote_to_bytes(environ['wsgi.raw_path_info']).decode('UTF-8')
 
  I don't see the problem?  If you want to distinguish %2f from /, then
  you'll do it slightly differently, like:
 
  path_parts = [
      urllib.parse.unquote_to_bytes(p).decode('UTF-8')
      for p in environ['wsgi.raw_path_info'].split('/')]
 
  This second recipe is impossible to do currently with WSGI.
 
  So... before jumping to conclusions, what's the hard part with using
  text?
 
 It's extremely hard to swallow Python 3's current disregard for the
 primacy of bytes at I/O boundaries.  I'm trying, but I can't help but
 feel that the existence of an API like unquote_to_bytes is more
 symptom treatment than solution.  Of course something that unquotes a
 URL segment unquotes it into bytes; it's the only sane default because
 URL segments found in URLs on the internet are bytes.
 
 Yes, URL quoted strings should decode to bytes, though arguably it is
 reasonable to also use the very reasonable UTF-8 default that
 urllib.parse.quote/unquote uses.  So it's really just a question of
 names, should be quote_to_string or quote_to_bytes that name.  Which
 honestly... whatever.

After some careful consideration, I realize I'm only able to offer stop
energy regarding the WSGI-as-text proposal, so I'll bow out of any
maillist conversation about it for now.

- C






Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Chris McDonough

On Fri, 2010-07-16 at 17:11 -0500, Ian Bicking wrote:
 On Fri, Jul 16, 2010 at 5:08 PM, Chris McDonough chr...@plope.com
 wrote:
 On Fri, 2010-07-16 at 17:47 -0400, Tres Seaver wrote:
 
   In the past when we've gotten down to specifics, the only holdup has been
   SCRIPT_NAME/PATH_INFO, hence my suggestion to eliminate those.
 
  I think I favor PJE's suggestion:  let WSGI deal only in bytes.
 
 I'd prefer that WSGI 2 was defined in terms of a bytes with benefits
 type (Python 2's ``str`` with an optional encoding attribute as a hint
 for cast to unicode str) instead of Python 3-style bytes.
 
 But if I had to make the Hobson's choice between Python 3 style bytes
 and Python 3 style str, I'd choose bytes.  If I then needed to write
 middleware or applications, I'd use WebOb or an equivalent library to
 enable a policy which converted those bytes to strings on my behalf.
 Making it easy to write raw middleware or applications without using
 such a library doesn't seem as compelling a goal as being able to easily
 write one which allowed me direct control at the raw level.
 
 What are the concrete problems you envision with text request headers,
 text (URL-quoted) path, and text response status and headers?

Documentation is the main reason.  For example, the documentation for
making sense of path_info segments in a WSGI that used unicodey-strings
would, as I understand it, read something like this:


The PATH_INFO environment variable is a string.  To decode it,

- First, split it on slashes::

    segments = PATH_INFO.split('/')

- Then turn each segment into bytes::

    bytes_segments = [ bytes(x, encoding='latin-1') for x in segments ]

- Then, de-encode each segment's urlencoded portions::

    urldecoded_segments = [ urllib.unquote(x) for x in bytes_segments ]

- Then re-encode each urldecoded segment into the encoding expected
  by your application::

    app_segments = [ str(x, encoding='utf-8') for x in
                     urldecoded_segments ]

.. note:: We encode to latin-1 above because WSGI tunnels the bytes
   representing the PATH_INFO by way of a string type which contains bytes
   as characters.


That looks pretty apologetic to me, and to be honest, I'm not even sure
it will work reliably in the face of existing/legacy applications which
have emitted URLs that are not url-encoded properly if those old URLs
need to be supported.   http://bugs.python.org/issue8136 contains a
variation on this theme.
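
For reference, a runnable Python 3 sketch of the text-based recipe. It assumes the server tunnels the raw path bytes through a str by decoding them as latin-1. The function name and sample path are invented here, and urllib.parse.unquote_to_bytes stands in for the urllib.unquote call in the recipe:

```python
from urllib.parse import unquote_to_bytes

def decode_text_path(path_info, app_encoding='utf-8'):
    """Recover application-level segments from a latin-1 tunneled str."""
    segments = path_info.split('/')
    # Undo the tunneling: each character maps back to exactly one byte.
    bytes_segments = [s.encode('latin-1') for s in segments]
    # Resolve %XX escapes while still working at the byte level.
    urldecoded = [unquote_to_bytes(b) for b in bytes_segments]
    # Finally interpret the bytes in the encoding the application expects.
    return [b.decode(app_encoding) for b in urldecoded]

# Simulate what such a server would put in the environ for a path that
# contains raw UTF-8 bytes plus an encoded slash.
tunneled = b'/caf\xc3\xa9/a%2Fb'.decode('latin-1')
text_segments = decode_text_path(tunneled)
print(text_segments)  # ['', 'café', 'a/b']
```

If a legacy URL carries bytes that are not valid in the application encoding, the final decode raises UnicodeDecodeError, which is the reliability worry raised above.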

I'd much rather be able to say:


The PATH_INFO environment variable is a ``bytes-with-benefits`` type.
To decode it:

- First, split it on slashes::

    segments = PATH_INFO.split('/')

- Then, de-encode each segment's urlencoded portions::

    urldecoded_segments = [ urllib.unquote(x) for x in segments ]

- Then re-encode each urldecoded segment into the encoding expected
  by your application::

    app_segments = [ str(x, encoding='utf-8') for x in
                     urldecoded_segments ]
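
With plain Python 3 bytes (the hypothetical bytes-with-benefits type does not exist, so this is only an approximation), the shorter recipe might look like the sketch below. The function name and sample path are assumptions, and the separator has to be written as b'/' rather than '/':

```python
from urllib.parse import unquote_to_bytes

def decode_bytes_path(path_info, app_encoding='utf-8'):
    """Split a bytes PATH_INFO and decode each segment for the app."""
    segments = path_info.split(b'/')
    return [unquote_to_bytes(s).decode(app_encoding) for s in segments]

bytes_segments_out = decode_bytes_path(b'/caf%C3%A9/a%2Fb')
print(bytes_segments_out)  # ['', 'café', 'a/b']
```

There is no latin-1 round trip here: the value is already bytes, so the only decode is the one the application chooses.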


Let me know if I'm missing something.

- C




Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Chris McDonough

On Sat, 2010-07-17 at 01:33 +0200, Armin Ronacher wrote:
 Hi,
 
 On 7/17/10 1:20 AM, Chris McDonough wrote:
   Let me know if I'm missing something.
 The only thing you miss is that the bytes type of Python 3 is badly 
 supported in the stdlib (not an issue if we reimplement everything in 
 our libraries, not an issue for me) and that the bytes type has no 
 string formatting, which makes us do the encode/decode dance in our own 
 implementation of some of the missing stdlib functions.

This is why the docs mention bytes with benefits instead (like the
Python 2 str type). The existence of such a type would be the result
of us lobbying for its inclusion into some future Python 3, or at least
the result of lobbying for a String ABC that would allow us to define
our own.

But.. yeah.  Stdlib support for bytes.  Dunno.   What I really don't
want to do is implement a WSGI spec in terms of Unicodey strings just
because the webby stuff in the stdlib cannot deal with bytes.  Those
stdlib implementations should be changed to deal with bytes-ish things
instead.  I actually think fixing the stdlib will end up being a driver
for the bytes with benefits type.  Supporting such a type in the
implementation of stdlib functions is clearly the right way to fix it in
lots of cases, because they will be able to deal with BwB and
Unicodey-strings in exactly the same way.

In the meantime, I think using bytes is the only sane thing to do in
some interim specification, because moving from a spec which is
bytes-oriented to a spec that is text-oriented now will leave us in the
embarrassing position of needing to create yet another bytes-oriented
spec later (as, well, I/O is bytes), when Python 3 matures and realizes
it needs such a hybrid type.

- C



[Web-SIG] http://wiki.python.org/moin/WebFrameworks

2009-11-26 Thread Chris McDonough

http://wiki.python.org/moin/WebFrameworks seems to be the place where folks are 
registering their respective web frameworks.


I'd like to move some of the frameworks which are currently in the various 
categories which haven't been active in a few years.  In particular, I'd like 
to move any framework which hasn't had a release since the beginning of 2008 
(arbitrary) into the Discontinued / Inactive framework category.  I'd be 
willing to do the work to make sure I wasn't moving one that actually *did* 
have releases past that but just hadn't updated the page.


Any dissent?

- C


Re: [Web-SIG] [Paste] WebOb API

2009-10-29 Thread Chris McDonough


Ian Bicking wrote:

Also I'm planning on introducing a BaseRequest (and *maybe*
BaseResponse) class, that removes some functionality.  Specifically
for Repoze they'd like to remove __getattr__ and __setattr__ (which
has some performance implications),


FTR, after thinking about it, I'm not even sure BaseRequest is necessary for 
this purpose.  This seems to work too (at least it gets previously visible 
setattr/getattr stuff out of the profiling info):


class Request(WebobRequest):
    __setattr__ = object.__setattr__
    __getattr__ = object.__getattribute__
    __delattr__ = object.__delattr__
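
A self-contained sketch of the same trick, using a made-up stand-in base class instead of the real WebOb request. The stand-in's habit of storing ad hoc attributes in the environ under an 'adhoc.' prefix is an assumption for illustration only, not WebOb's actual behaviour:

```python
class StandInRequest(object):
    """Hypothetical base class that routes unknown attributes into environ."""

    def __init__(self, environ):
        object.__setattr__(self, 'environ', environ)

    def __getattr__(self, name):
        try:
            return self.environ['adhoc.' + name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self.environ['adhoc.' + name] = value

    def __delattr__(self, name):
        del self.environ['adhoc.' + name]


class PlainRequest(StandInRequest):
    # Restore the default object behaviour, bypassing the custom hooks.
    __setattr__ = object.__setattr__
    __getattr__ = object.__getattribute__
    __delattr__ = object.__delattr__


hooked = StandInRequest({})
hooked.user = 'alice'          # lands in the environ dictionary

plain = PlainRequest({})
plain.user = 'alice'           # lands in the instance __dict__

print(hooked.environ)          # {'adhoc.user': 'alice'}
print(plain.environ)           # {}
print(plain.__dict__['user'])  # alice
```

Missing attributes on the subclass still raise AttributeError, because object.__getattribute__ raises it when called as the fallback.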




Re: [Web-SIG] Session events

2009-10-05 Thread Chris McDonough


This is supported at least here:

http://docs.repoze.org/session/usage.html#using-begin-and-end-subscribers



Alastair Bell Turner wrote:

Hi

I've been looking through the range of choices for Python web
[application] frameworks/libraries (Just to have all the bases
covered) for a new build project and standardisation of some small
utilities. There's one feature that I'm not finding and was just
wanting to check on before considering the joys of rolling my own: I'm
not finding any support for user session events, I'm particularly
interested in being able to register a handler on session expiry or
cleanup. I've mainly been looking at the lighter weight frameworks
since my requirement for the new build is mainly aggregate and list
operations, so the least suitable load for ORMs.

Have I missed the feature session event somewhere?

Thanks

Alastair Bell Turner

Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-21 Thread Chris McDonough


OK, after some consideration, I think I'm sold.

Answering my own original question about why unicode seems to make sense as 
values in the WSGI environment even without consideration for Python 3 
compatibility:  *something* needs to do this translation.  Currently I 
personally rely on WebOb to do a lot of this translation.  I can't think of a 
good reason that implementations at the level of WebOb would each need to do 
this translation work; pushing the job into WSGI itself seems to make sense 
here.  This is particularly true for PATH_INFO and QUERY_STRING; these days 
it's foolish to assume these values will be entirely composed of low order 
characters, and thus being able to access them as bytes natively isn't very useful.


OTOH, I suspect the Python 3 stdlib is still broken if it requires native 
strings in various places (and prohibits the use of bytes).


James Bennett wrote:

On Sun, Sep 20, 2009 at 11:25 PM, Chris McDonough chr...@plope.com wrote:

WSGI is a fairly low-level protocol aimed at folks who need to interface a
server to the outside world.  The outside world (by its nature) talks bytes.
 I fear that any implied conversion of environment values and iterable
return values to Unicode will actually eventually make things harder than
they are now.  I realize that it would make middleware implementors' lives
harder to need to deal in bytes.  However, at this point, I also believe
that middleware kinda should be hard.  We have way too much middleware that
shouldn't be middleware these days (some written by myself).


Well, ordinarily I'd be inclined to agree: HTTP deals in bytes, so an
interface to HTTP should deal in bytes as well.

The problem, really is that despite being a very low-level interface,
WSGI has a tendency to leak up into much higher-level code, and (IMO)
authors of that high-level code really shouldn't have to waste their
time dealing with details of the underlying low-level gateway.

You've said you don't want to hear Python 3 as the reason, but it
provides some useful examples: in high-level code you'll commonly want
to be doing things like, say, comparing parts of the requested URL
path to known strings or patterns. And that high-level code will
almost certainly use strings, while WSGI, in theory, will be using
bytes. That's just a recipe for disaster; if WSGI mandates bytes, then
bytes will have to start infecting much higher-level code (since
Python 3 -- rightly -- doesn't let you be nearly as promiscuous about
mixing bytes and strings).
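
A short Python 3 sketch of the leakage described here. The path and pattern are invented examples:

```python
import re

path_bytes = b'/blog/2009/09'   # what a bytes-only gateway would hand over
known_path = '/blog/2009/09'    # what high-level code naturally writes

# On Python 3 a bytes value never compares equal to a str value.
same = (path_bytes == known_path)

# A str pattern cannot be applied to a bytes value either.
try:
    re.match(r'/blog/(\d+)', path_bytes)
    pattern_error = None
except TypeError as exc:
    pattern_error = exc

# Decoding once at the boundary lets the rest of the code stay in text.
path_text = path_bytes.decode('utf-8')
year = re.match(r'/blog/(\d+)', path_text).group(1)

print(same)           # False
print(pattern_error)  # a TypeError about mixing str patterns and bytes
print(year)           # 2009
```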

Once I'm at a point where I can use Python 3, I know I'll personally
be looking for some library which will normalize everything for me
before I interact with it, precisely to avoid this sort of leakage; if
WSGI itself would at least *allow* that normalization to happen at the
low level (mandating it is another discussion entirely) I'd feel much
happier about it going forward.





Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-20 Thread Chris McDonough


I'll try to digest some of this, currently I'm pretty clueless.

Personally, I find it a bit hard to get excited about Python 3 as a web 
application deployment platform.  This is of course a personal judgment (I 
don't mean to slight Python 3) but at this point, I think I'll probably be 
writing software that targets 2.X exclusively for at least the next five years.


Given this point of view, it would be extremely helpful if someone could 
explain to people with the same outlook why we should want to deal with Unicode 
strings in any WSGI specification.


WSGI is a fairly low-level protocol aimed at folks who need to interface a 
server to the outside world.  The outside world (by its nature) talks bytes.  I 
fear that any implied conversion of environment values and iterable return 
values to Unicode will actually eventually make things harder than they are 
now.  I realize that it would make middleware implementors' lives harder to need 
to deal in bytes.  However, at this point, I also believe that middleware kinda 
should be hard.  We have way too much middleware that shouldn't be middleware 
these days (some written by myself).


Anyway, for us slower (and maybe wrongly fearful) folks, could someone 
summarize the benefits of having a WSGI specification that requires Unicode. 
Bonus points for an explanation that does not boil down to it will be 
compatible with Python 3.


- C


Armin Ronacher wrote:

Hello everybody,

Thanks to Graham Dumpleton and Robert Brewer there is some serious
progress on WSGI currently.  I proposed a roadmap with some PEP changes
now that need some input.

Summary:

  WSGI 1.0   stays the same as PEP 0333 currently is
  WSGI 1.1   becomes what Ian and I added to PEP 0333
  WSGI 2.0   becomes a unicode powered version of WSGI 1.1
  WSGI 3.0   becomes WSGI 2.0 just without start_response

  WSGI 1.0 and 1.1 are byte based and nearly impossible to use on Python
  3 because of changes in the standard library that no longer work with
  a byte-only approach.


The PEPs themselves are here: http://bitbucket.org/ianb/wsgi-peps/
Neither the wording nor the changes in there are anywhere near final.


Graham wrote down two questions he wants every major framework developer
to be answered.  These should guide the way to new WSGI standards:

1. Do we keep bytes everywhere forever in Python 2.X, or try to
   introduce unicode there at all to at least mirror what changes might
   be made to make WSGI workable in Python 3.X?

2. Do we skip WSGI 1.X completely for Python 3.X and go straight to
   WSGI 2.0 for Python 3.X?

I added a new question I think should be asked too:

3. Do we skip WSGI 2.0 as specified in the PEP and go straight to
   WSGI 3.0 and drop start_response?


The following things became pretty clear when playing around with
various specifications on Python 3:

-  Python 3 no longer implicitly converts between unicode and byte
   strings.  This covers comparisons, the regular expression engine,
   all string functions and many modules in the stdlib.

-  The Python 3 stdlib radically moved to unicode for non unicode things
   as well (the http servers, http clients, url handling etc.)

-  A byte only version of WSGI appears unrealistic on Python 3 because
   it would require server and middleware implementors to reimplement
   parts of the standard library to work on bytes again.

-  unicode support can be added for WSGI on both Python 2.x and Python
   3.x without removing functionality.  Browsers are already doing
   a similar encoding trick as proposed by Graham Dumpleton to handle
   URLs.

-  Python 2.x already accepts unicode strings for many things such as
   URL handling thanks to the fact that unicode and byte strings are
   surprisingly interchangeable.

-  cgi.FieldStorage and some other parts are now totally broken on
   Python 3 and should no longer be used in 3.0 and 3.1 because it
   reads the response body into memory.  This currently affects
   WebOb, Pylons and TurboGears.


I sent this mail to every major framework / WSGI implementor so that we
get input even if you're missing the discussion on web-sig.


Regards,
Armin

Re: [Web-SIG] repoze.bfg web framework 1.0 released

2009-07-06 Thread Chris McDonough


On 7/5/09 10:37 PM, Graham Dumpleton wrote:

The first major release of the BFG web framework (aka repoze.bfg),
version 1.0, is available.  See http://bfg.repoze.org/ for general
information about repoze.bfg.

...

- WSGI-based deployment: PasteDeploy and mod_wsgi compatible.

...

- A comprehensive set of unit tests.  The repoze.bfg package contains
  11K lines of Python code.  8000 lines of that total line count is
  unit test code that tests the remaining 3000 lines.


A question about your testing if you have time. Is this done in a fake
WSGI hosting environment, ie., test harness, or is it able to be run
through WSGI servers such as Paste server, Apache/mod_wsgi, etc, in
some way?


The tests I mentioned in there are mostly unit tests; they don't test any 
particular system configuration functionally.  In particular, none of the tests 
actually invokes a request via a WSGI stack.


But we do use functional testing in projects that use the framework.  For 
example, we use Twill (created by Titus Brown) to make sure things don't break 
at the request/response level in this project:  http://karlproject.org.



Am curious from the point of view that standalone test suites for WSGI
itself to run against WSGI hosting mechanisms don't really exist, so
the test suite for BFG, with the presumption that it would exercise a
lot of WSGI functionality, might be a good regression test for WSGI
servers themselves.


I think maybe some ACID test WSGI application could be built, and then some 
set of functional HTTP-level tests could be run against that application to gain 
confidence in a WSGI app.  This is more or less what we do with Twill on that 
KARL project:  the developers use the Paste#http server, but we actually deploy 
to a mod_wsgi server.  We can (and do) run the Twill tests against both to get 
confidence that the app isn't going to fall over in production.


- C

Re: [Web-SIG] repoze.bfg web framework 1.0 released

2009-07-06 Thread Chris McDonough


On 7/5/09 11:44 PM, Randy Syring wrote:

Chris,

Sounds interesting.  Question: Does it support some
kind of module/plugin architecture that will allow me to develop plug-
in functionality across projects?  What would be called in
Django an app.

For example, I would like to have a news, blog, and calendar
module that I can plug into different applications.  The goal is to
have everything for the module contained in one subdirectory or package
including
any configuration, routing, templates, controllers, model, etc.  So,
something like this:

/modules/news/...
/modules/calendar/...
/modules/blog/...

Or:

packages/
MyProj
NewsComponent
CalendarComponent
BlogComponent



I'm not sure if I can do this topic justice here (many have fallen on the sword 
when approaching it before), but I'll try.


Plugin apps is maybe less a feature of BFG than the stuff that BFG is built on 
top of.  Like Zope, BFG makes use of the Zope Component Architecture under the 
hood.  Unlike Zope, BFG tends to hide the ZCA (conceptually and API-wise) from 
developers, because the ZCA introduces concepts like adapters, interfaces, 
and utilities.  Direct exposure to these concepts in user-visible code evokes 
suspicion in people who just don't have the problems they try to solve.  The 
problems that the ZCA tries to solve usually revolve around code testability and 
reusability, and most people just don't care that much about these things.


So BFG is more like Pylons or Django in this respect: it provides helper APIs 
and places to hang your code so that you can build a single-purpose application 
reasonably easily without making you think in terms of building anything 
reusable.  The final application usually happens to be overrideable and 
extensible, but that's just a byproduct of using BFG, and doesn't really have 
very much to do with building a system out of plugins.


In the meantime, the Zope Component Architecture is a fantastic system on which 
to build a *framework* (as opposed to an application).  This is why BFG is built 
on top of it.  If you are willing to use the ZCA conceptually and API-wise *in 
your application code*, it becomes straightforward to build reusable 
applications like you mention.


So the answer to your original question is probably no.  BFG itself isn't a 
system which allows you to slot arbitrary components into place and have them 
show up somewhere.  It's instead a system (like Zope) in which you can build 
such a thing.  In fact, many of the applications that we (my company, 
Agendaless) build are these kinds of applications, where we tend to want to 
reuse a single application component across many customers or projects.


The trick is this: when you build pluggable applications, there's presumably 
something you're going to want to plug these applications into.  I *think* this 
is the piece that most people are after when they talk about pluggable 
applications; they actually don't care too much about the applications 
themselves (because they'll build them themselves), it's the higher-level thing 
that gets plugged into that is of primary interest.  For better or worse, 
systems like Plone, Drupal, and Joomla are examples of such an application 
framework.  These systems allow you to build small pieces of functionality that 
drop in to some larger system.


We've done lots of Zope and Plone work, and we know the downsides of the plug 
this bit into the larger framework pattern pretty well.  We've found that it's 
useful to have the tools at hand to build miniature versions of such large 
frameworks on hand, so we can quickly come up with a custom solution to some 
problem without fighting the framework (any particular framework) so much. 
BFG plus direct use of the ZCA in application code tends to let us avoid using 
the larger frameworks in favor of rolling our own (more focused, simpler) 
frameworks.


Unfortunately, I don't have any simple example application code to show with 
respect to this pattern, because anything I could show here would be too trivial 
to be useful.  More unfortunately, anything I can point you to that we've built 
using this pattern will probably be too large to understand in any reasonable 
amount of time (e.g. http://karlproject.org).


This has always been the historical problem with trying to promote use of the 
ZCA for application code: until you work on a larger project that uses it 
right, it's just too abstract.  So by the time you actually need it, it's too 
late and you've already invented your own mechanisms to do similar indirections. 
 For those reasons, I think it would be a useful exercise to build some very 
simple system that took app plugins and just exposed them in some very 
concrete way to end users, even if it meant losing some presentation 
flexibility.  Such a system could be created in any web framework, but using the 
ZCA inside the web framework for such a task is a no-brainer to me.


Anyway, even this explanation is too

[Web-SIG] repoze.bfg web framework 1.0 released

2009-07-05 Thread Chris McDonough


Summary
---

The first major release of the BFG web framework (aka repoze.bfg),
version 1.0, is available.  See http://bfg.repoze.org/ for general
information about repoze.bfg.

Details
---

BFG is a Python web framework based on WSGI.  It is inspired by Zope,
Pylons, and Django.  It makes use of a number of Zope technologies
under the hood.

BFG is developed as part of the more general Repoze project
(http://repoze.org).  It is released under the BSD-like license
available from http://repoze.org/license.html .

BFG version 1.0 represents one year of development effort.  The first
release of BFG, version 0.1, was made in July of 2008.  Since then,
roughly 80 pre-1.0 releases have been made.  None of these pre-1.0
releases explicitly promised any backwards compatibility with any
earlier release.

Version 1.0, however, marks the first point at which the repoze.bfg
API has been frozen.  Future releases in the 1.X line guarantee
API-level backward compatibility with 1.0.  A backwards
incompatibility with 1.0 at the API level in any future 1.X version
will be considered a bug.

More Details
---

BFG contains moderate, incremental improvements to patterns found in
earlier-generation web frameworks.  It tries to make real-world web
application development and deployment more fun, more predictable, and
more productive.  To this end, BFG has the following features:

- WSGI-based deployment: PasteDeploy and mod_wsgi compatible.

- Runs under Python 2.4, 2.5, and 2.6.

- Runs on UNIX, Windows, and Google App Engine.

- Full documentation coverage: no feature or API is undocumented.

- A comprehensive set of unit tests.  The repoze.bfg package contains
  11K lines of Python code.  8000 lines of that total line count is
  unit test code that tests the remaining 3000 lines.

- Sparse resource utilization: BFG has a small memory footprint and
  doesn't waste any CPU cycles.

- Doesn't have an unreasonable set of dependencies: easy_install
  -ing repoze.bfg over broadband takes less than a minute.

- Quick startup: a typical BFG application starts up in about a
  second.

- Offers extremely fast XML/HTML and text templating via Chameleon
  (http://chameleon.repoze.org/).

- Persistence-agnostic: use SQLAlchemy, raw SQL, ZODB, CouchDB,
  filesystem files, LDAP, or anything else which suits a particular
  application's needs.

- Provides a variety of starter project templates.  Each template
  makes it possible to quickly start developing a BFG application
  using a particular application stack.

- Offers URL-to-code mapping like Django or Pylons' *URL routing* or
  like Zope's *graph traversal*, or allows a combination of both
  routing and traversal.  This helps make it feel familiar to both
  Zope and Pylons developers.

- Offers debugging modes for common development error conditions (for
  example, when a view cannot be found, or when authorization is being
  inappropriately granted or denied).

- Allows developers to organize their code however they see fit; the
  framework is not opinionated about code structure.

- Allows developers to write code that is easily unit-testable.
  Avoids using thread local data structures which hamper testability.
  Provides helper APIs which make it easy to mock framework components
  such as templates and views.

- Provides an optional declarative context-sensitive authorization
  system.  This system prevents or allows the execution of code based
  on a comparison of credentials possessed by the requestor against
  ACL information stored by a BFG application.

- Behavior of an application built using BFG can be extended or
  overridden arbitrarily by a third-party developer without any
  modification to the original application's source code.  This makes
  BFG a good choice for building frameworks and other extensible
  applications.

- Zope and Plone developers will be comfortable with the terminology
  and concepts used by BFG; they are almost all Zope-derived.

Excruciating Details
---

Quick installation:

  easy_install -i http://dist.repoze.org/bfg/current repoze.bfg

General support and information:

  http://bfg.repoze.org

Tutorials

  http://docs.repoze.org/bfg/current/#tutorials

Sample Applications

  http://docs.repoze.org/bfg/current/#sample-applications

Detailed narrative and API documentation:

  http://docs.repoze.org/bfg/current

Bug tracker:

  http://bfg.repoze.org/trac

Maillist:

  http://lists.repoze.org/listinfo/repoze-dev

IRC support:

  irc://irc.freenode.net#repoze

repoze.bfg is developed primarily by Agendaless Consulting
(http://agendaless.com) and a team of contributors.

Special thanks to these people, without whom this release would not
have been possible:

Malthe Borch, Carlos de la Guardia, Chris Rossi, Shane Hathaway, Tom
Moroz, Yalan Teng, Jason Lantz, Todd Koym, Jessica Geist, Hanno
Schlichting, Reed O'Brien, Sebastien Douche, Ian Bicking, Jim Fulton,
Martijn Faassen, Ben Bangert, Fernando Correa

Re: [Web-SIG] Prototype of wsgi.input.readline().

2008-01-30 Thread Chris McDonough

Graham Dumpleton wrote:
 As I think we all know, no one implements readline() for wsgi.input as
 defined in the WSGI specification. The reason for this is that stuff
 like cgi.FieldStorage would refuse to work and would just generate an
 exception. This is because cgi.FieldStorage expects to pass an
 argument to readline().

I haven't been keeping up on the issues this has caused wrt WSGI, but note that 
the reason that cgi.FieldStorage passes a size argument to readline is in order 
to prevent memory exhaustion when reading files that don't have any linebreaks 
(denial of service).  See http://bugs.python.org/issue1112549 .
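
A minimal sketch of the idea, using io.BytesIO as a stand-in for wsgi.input. The helper name and the 1024-byte limit are arbitrary choices for illustration:

```python
import io

def read_lines_bounded(stream, max_line=1024):
    """Yield chunks of at most max_line bytes, splitting overlong lines."""
    while True:
        chunk = stream.readline(max_line)
        if not chunk:
            break
        yield chunk

# A body with one normal line followed by 5000 bytes and no linebreak.
payload = b'short\n' + b'x' * 5000
chunks = list(read_lines_bounded(io.BytesIO(payload), max_line=1024))

print(len(chunks))                  # 6
print(max(len(c) for c in chunks))  # 1024
```

Without the size argument, the single readline() call for the second line would have to buffer all 5000 bytes (or however many an attacker chooses to send) before returning.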

 
 So, although this is linked in the issues list for possible amendments
 to WSGI specification, there hasn't that I recall been a discussion on
 how readline() would be defined in any amendment or future version.
 
 In particular, would the specification be changed to either:
 
 1. readline(size) where size argument is mandatory, or:
 
 2. readline(size=-1) where size argument is optional.
 
 If the size argument is made mandatory, then it would parallel how
 read() function is defined, but this in itself would mean
 cgi.FieldStorage would break.
 
 This is because cgi.FieldStorage actually calls readline() with no
 argument as well as an argument in different places in the code.

cgi.FieldStorage doesn't call readline() without an argument. 
cgi.parse_multipart does, but this function is not used by cgi.FieldStorage.  I 
don't know if this changes anything.

- C

___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com

Re: [Web-SIG] Prototype of wsgi.input.readline().

2008-01-30 Thread Chris McDonough

Graham Dumpleton wrote:
 

 If the size argument is made mandatory, then it would parallel how
 read() function is defined, but this in itself would mean
 cgi.FieldStorage would break.

 This is because cgi.FieldStorage actually calls readline() with no
 argument as well as an argument in different places in the code.
 cgi.FieldStorage doesn't call readline() without an argument.
 cgi.parse_multipart does, but this function is not used by cgi.FieldStorage.
 I don't know if this changes anything.
 
 Not really, I should have said 'cgi' module as a whole rather than
 specifically cgi.FieldStorage. Given that people might be using
 cgi.parse_multipart in standard CGI, there would probably still be an
 expectation that it worked for WSGI. We can't really say that you can
 use cgi.FieldStorage but not cgi.parse_multipart. People will just
 expect all the normal tools people would use for this to work.

Personally, I think parse_multipart should go away.  It's not suitable for 
anything but toy usage.

If people use it, and they expose their site to the world, arbitrary anonymous
visitors can cause their Python process's size to grow arbitrarily.  I don't
think any existing well-known framework uses it, for this very reason.

If it can't go away, and there's a problem due to the non-parity between
parse_multipart's use and FieldStorage's use, I suspect the right answer is to
change cgi.parse_multipart to pass in a size value for readline too.  I
probably should have done that when I made the patch. :-(

- C

Re: [Web-SIG] Prototype of wsgi.input.readline().

2008-01-30 Thread Chris McDonough

Graham Dumpleton wrote:
 On 31/01/2008, Chris McDonough [EMAIL PROTECTED] wrote:
 Graham Dumpleton wrote:
 If the size argument is made mandatory, then it would parallel how
 read() function is defined, but this in itself would mean
 cgi.FieldStorage would break.

 This is because cgi.FieldStorage actually calls readline() with no
 argument as well as an argument in different places in the code.
 cgi.FieldStorage doesn't call readline() without an argument.
 cgi.parse_multipart does, but this function is not used by
 cgi.FieldStorage.  I don't know if this changes anything.
 Not really, I should have said 'cgi' module as a whole rather than
 specifically cgi.FieldStorage. Given that people might be using
 cgi.parse_multipart in standard CGI, there would probably still be an
 expectation that it worked for WSGI. We can't really say that you can
 use cgi.FieldStorage but not cgi.parse_multipart. People will just
 expect all the normal tools people would use for this to work.
 Personally, I think parse_multipart should go away.  It's not suitable for
 anything but toy usage.
 
 Not necessarily. Someone may see it as a trade off. The code itself says:
 
 This is easy to use but not
 much good if you are expecting megabytes to be uploaded -- in that case,
 use the FieldStorage class instead which is much more flexible.
 
 So comment implies it is easier to use and so some may think it is
 simpler for what they are doing if they are only dealing with small
 requests.
 
 Of course, it would probably be prudent if you know your requests are
 always going to be small to use LimitRequestBody in Apache, or a
 specific check on content length if handled in Python code, to block
 someone sending over sized requests intentionally to try and break
 things. Provided you did this, may be quite reasonable to use it in
 specific circumstances.

Indeed.  But then again, I doubt the casual user would be able to make this
judgment and take the necessary precautions.  This kind of user is likely the
same class of user for whom cgi.FieldStorage is too hard (which it really
isn't).
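
For what it's worth, the "specific check on content length" mentioned above could look something like the following.  This is only a sketch under my own assumptions: limit_request_body and max_bytes are hypothetical names, and the check trusts the declared CONTENT_LENGTH, so it does nothing for requests that arrive without one.

```python
def limit_request_body(app, max_bytes):
    """Wrap a WSGI app and refuse bodies whose declared size exceeds max_bytes."""

    def wrapper(environ, start_response):
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            # Treat a malformed header the same as an oversized request.
            length = max_bytes + 1
        if length > max_bytes:
            body = b"Request body too large\n"
            start_response(
                "413 Request Entity Too Large",
                [("Content-Type", "text/plain"),
                 ("Content-Length", str(len(body)))],
            )
            return [body]
        return app(environ, start_response)

    return wrapper
```

Someone using cgi.parse_multipart for small forms would wrap the application once, e.g. limit_request_body(my_app, 64 * 1024), where my_app stands for whatever application is being protected.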

- C


Re: [Web-SIG] HEAD requests, WSGI gateways, and middleware

2008-01-24 Thread Chris McDonough

I have applications that do detect the difference between a GET and a HEAD
(they do slightly less work if the request is a HEAD request), so I suspect
this is not a totally reasonable thing to add to the spec.  Maybe the
middleware that does what you're describing should be changed instead to
deal with HEAD requests.

In general, I don't think there is (or should be) any guarantee that an
arbitrary middleware stack will work with an arbitrary application.  Although
that would be nice in theory, I suspect it would require a very complex
protocol (more complex than what WSGI requires now).
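
As a rough illustration of handling this in the stack rather than in the spec, a wrapper like the one below could be placed around only those components that compute headers from the body.  It is a sketch, not a proposal: head_as_get is a made-up name, and output sent through the write() callable returned by start_response is not handled at all.

```python
def head_as_get(app):
    """Run a HEAD request through `app` as a GET and drop the body."""

    def wrapper(environ, start_response):
        if environ.get("REQUEST_METHOD") != "HEAD":
            return app(environ, start_response)

        # Copy the environ so the caller's dictionary is left untouched.
        inner_environ = dict(environ, REQUEST_METHOD="GET")
        result = app(inner_environ, start_response)
        try:
            # Drain the iterable so that lazily computed headers
            # (checksums, Content-Length) are produced, then discard it.
            for _chunk in result:
                pass
        finally:
            close = getattr(result, "close", None)
            if close is not None:
                close()
        return []

    return wrapper
```

Anything wrapped this way sees GET and does the full amount of work, which is why I'd want it to be opt-in per stack; applications outside the wrapper still see HEAD.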

- C

Brian Smith wrote:
 My application correctly responds to HEAD requests as-is. However, it doesn't 
 work with middleware that sets headers based on the content of the response 
 body.
 
 For example, a gateway or middleware that sets ETag based on a checksum,
 Content-Encoding, Content-Length and/or Content-MD5 will all result in wrong
 results by default. Right now, my applications assume that any such gateway
 or the first such middleware will change environ["REQUEST_METHOD"] from
 "HEAD" to "GET" before the application is invoked, and discard the response
 body that the application generates.
 
 However, many gateways and middleware do not do this, and PEP 333 doesn't 
 have anything to say about it. As a result, a 100% WSGI 1.0-compliant 
 application is not portable between gateways.
 
 I suggest that a revision of PEP 333 should require the following behavior:
 
 1. WSGI gateways must always set environ["REQUEST_METHOD"] to "GET" for HEAD
 requests. Middleware and applications will not be able to detect the
 difference between GET and HEAD requests.
 
 2. For a HEAD request, A WSGI gateway must not iterate through the response 
 iterable, but it must call the response iterable's close() method, if any. It 
 must not send any output that was written via start_response(...).write() 
 either. Consequently, WSGI applications must work correctly, and must not 
 leak resources, when their output is not iterated; an application should not 
 signal or log an error if the iterable's close() method is invoked without 
 any iteration taking place.
 
 Please add this issue to http://wsgi.org/wsgi/WSGI_2.0.
 
 Regards,
 Brian
 

Re: [Web-SIG] [extension] x-wsgiorg.flush

2007-10-04 Thread Chris McDonough


On Oct 4, 2007, at 11:55 AM, Phillip J. Eby wrote:

 At 05:00 PM 10/4/2007 +0200, Manlio Perillo wrote:
 Your are making a critical decision here.
 You are lowering the level of WSGI to match the level of average WSGI
 middlewares programmers.

 No, we're just getting rid of legacy cruft that's hard to support
 correctly.  There's a big difference.

Getting the start_response dance down and understanding how it plays  
with middleware is *hard*.  Even if we called it something other than  
WSGI 2.0 (which I don't think we should, because it really is an  
evolution), returning the three-tuple is the right thing to do.
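
To show why I find the three-tuple easier to reason about, here is a sketch of an application written in that style plus an adapter back to the existing start_response protocol.  The calling convention is an assumption on my part (nothing of the sort has been finalized), and hello_app2 and to_wsgi1 are names invented for the example.

```python
def hello_app2(environ):
    """A hypothetical 'return a three-tuple' style application."""
    status = "200 OK"
    headers = [("Content-Type", "text/plain")]
    body = [b"Hello World"]
    return status, headers, body


def to_wsgi1(app2):
    """Adapt a three-tuple application to the WSGI 1.0 calling convention."""

    def wsgi1_app(environ, start_response):
        status, headers, body = app2(environ)
        start_response(status, headers)
        return body

    return wsgi1_app
```

Middleware for the three-tuple style can inspect or replace status, headers and body as plain return values, with no need to wrap start_response or worry about when it gets called.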

- C





Re: [Web-SIG] Web Site Process Bus

2007-06-26 Thread Chris McDonough

On Jun 26, 2007, at 1:04 AM, Graham Dumpleton wrote:
 In Apache changing the certificates would need a complete restart of
 everything. Because the  child processes aren't privileged they would
 not be able to trigger the main server to do so. This actually gets to
 one of my reservations about some of the stuff being discussed. That
 is, that the WSGI applications should even have any ability to control
 the underlying web server. In a shared web hosting environment using
 Apache, allowing such control is not practical as you don't want
 arbitrary user doing things to the server. If you are running Apache
 as a dedicated server for a single application that is a different
 matter however. Thus some aspects of what can be done via the bus
 would  have to be controllable dependent on the environment in which
 one is running.

 At least with Apache, even initiating this sort of stuff from inside
 of a WSGI application may not make a great deal of sense even then. It
 would be far easier and preferable in Apache to use a suexec CGI
 script to accept the upload of the SSL certificate and then trigger a
 restart of Apache. So in the end the bus concept may be great for pure
 Python system, but not so sure about a complicated mixed code system
 like Apache, especially where there may be better ways of handling it
 through other features of Apache.

There are also non-webbish processes like postgres, mysql, etc. that  
need to be treated as part of the application.

I handle this currently by running all of the processes related to a
specific project under a process controller (which happens to be
implemented in Python, but that's beside the point; see
http://www.plope.com/software/supervisor2/).  The process controller is
responsible for execing the child processes upon its own startup.  It
is also responsible for restarting children if they die, capturing
their output (if any), and allowing sufficiently privileged users to
start and stop each one independently.  The only promise a subprocess
must make to be managed is that it must be possible to start the
process in the foreground (not under its own daemon manager).

If a process bus is implemented I suspect it should be implemented
at this kind of level.  Actions could be registered for specific
subprocess types to send some input to a pipe file descriptor, send a
signal to the process, etc.  It would also be possible to create some
sort of dependency map between processes in a configuration that
relates the actions of one process to another (restart process A if
process B is restarted, send a signal S to process C if signal T is
sent to process D, etc).
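
A dependency map of that kind could be as small as the following sketch.  It only computes which actions follow from an event; it starts and signals nothing, and ActionMap, when and expand are hypothetical names rather than anything the process controller above provides.

```python
from collections import defaultdict, deque


class ActionMap:
    """Relate an action on one process to follow-up actions on others."""

    def __init__(self):
        self._rules = defaultdict(list)

    def when(self, process, action, then_process, then_action):
        """Register: if `action` happens to `process`, do `then_action` to `then_process`."""
        self._rules[(process, action)].append((then_process, then_action))

    def expand(self, process, action):
        """Return the triggering action plus everything it implies, in order.

        Each (process, action) pair is visited once, so circular rules
        cannot loop forever.
        """
        seen = set()
        ordered = []
        queue = deque([(process, action)])
        while queue:
            item = queue.popleft()
            if item in seen:
                continue
            seen.add(item)
            ordered.append(item)
            queue.extend(self._rules.get(item, []))
        return ordered


rules = ActionMap()
rules.when("B", "restart", "A", "restart")
rules.when("D", "signal T", "C", "signal S")
plan = rules.expand("B", "restart")
```

The controller would then walk the returned list and perform each action against the real subprocess.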

- C


Re: [Web-SIG] Web Site Process Bus

2007-06-26 Thread Chris McDonough

On Jun 26, 2007, at 5:07 PM, Robert Brewer wrote:
 I think I'm mostly confused by the name process bus because it
 seems like the primary use case for something like this is where all
 of the applications share the same process space

 I don't see why it should be limited by that. The primary use case is
 anywhere site components and application components are interacting,
 that could benefit from a shared understanding (and control) of the
 state of the site. To me, that requires a common set of messages, but
 the transport mechanism for those messages should be flexible so that
 it's useful in both multithread and multiprocess architectures.

Thank you.  I see.  This is a little too abstract for me to get my  
brain around, but I'll continue listening and maybe I'll get  
religion. ;-)

- C




Re: [Web-SIG] html dom like javascript ?

2006-08-17 Thread Chris McDonough

You probably want elementtree (http://effbot.org/zone/element-index.htm).
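
A rough equivalent of the JavaScript quoted below, as a sketch: it uses the xml.etree.ElementTree module (the same API as the standalone elementtree package), and hidden_flag plus the checked argument are stand-ins I made up for the dirtyArray lookup.

```python
import xml.etree.ElementTree as ET


def hidden_flag(ctl, checked):
    """Build <input type="hidden" ...> for control number `ctl`."""
    hidden = ET.Element("input")
    hidden.set("type", "hidden")
    hidden.set("name", "active_flag_hidden_" + str(ctl))
    hidden.set("value", "N" if checked else "Y")
    return hidden


form = ET.Element("form", {"name": "listForm"})
form.append(hidden_flag(3, checked=True))

# Serialize the tree to a text string.
markup = ET.tostring(form, encoding="unicode")
```

It won't know which attributes are valid for which HTML element, but it does spare you from gluing strings together.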


 Thanks for the rapid reply.  I am familiar with a number of these and
 have searched the web documentation but for the most part these appear
 to be parsers or things like:

 http://www.acooke.org/andrew/writing/python-xml.html#code

 That are xml centric and not html related.   I'm looking for something
 that is more html specific that contains all the options for any html
 widget, like a form element with all of its options like style, css,
 and so forth.  In other words I don't want to have to write my own xml
 file with all the html tags and options.



 Jean-Paul Calderone wrote:
 On Thu, 17 Aug 2006 10:10:47 -0400, seth [EMAIL PROTECTED] wrote:
 Is there a python library which is analogous to javascript for creating
 html/xhtml documents? e.g.:

 hidden = document.createElement("input")
 hidden.setAttribute("type", "hidden")
 hidden.setAttribute("name", "active_flag_hidden_" + ctl)
 if (dirtyArray[ctl].checked == true) {
     hidden.setAttribute("value", 'N')
 } else {
     hidden.setAttribute("value", 'Y')
 }
 document.forms['listForm'].appendChild(hidden)

 At least fifty.  The DOM API is heavily standardized with hundreds of
 implementations in dozens of languages.

 http://python.org/doc/lib/module-xml.dom.html

 Jean-Paul



Re: [Web-SIG] WSGI in standard library

2006-02-12 Thread Chris McDonough

On Feb 12, 2006, at 6:39 AM, Alan Kennedy wrote:
 So, I still think that only basic servers educational/playpen servers
 should go in the standard library, with an indication that the user
 should pick an openly server from outside the distro if they require to
 do serious server work.

I agree 100%.


 Maybe if there were no production-ready servers in the standard
 library, there would be no need for a Python Security Response Team.

As an example, it's currently possible to perform denial of service
on any framework/server that uses the cgi.FieldStorage module.  See
http://sourceforge.net/tracker/?func=detail&aid=1112549&group_id=5470&atid=105470 .
That module probably doesn't belong in the stdlib in the first
place, but it's in there, and now things depend on it.

In the meantime, this patch *really* should have been applied by now  
but hasn't been.  If anyone has checkin access, or can help me poke  
the appropriate person, it would help... this was reported to the SRT  
at the time.

- C


Re: [Web-SIG] My original template API proposal

2006-02-06 Thread Chris McDonough

Although I've been trying to follow this thread, I'm finding it
difficult to get a handle on what is meant to *call* the template API
(e.g. what typically calls render in Ian's ITemplatePlugin interface at
http://svn.pythonpaste.org/home/ianb/templateapi/interface.py)?  Is the
framework meant to call render?

Sorry for the remedial question ;-)

- C


On Feb 5, 2006, at 5:19 PM, Phillip J. Eby wrote:

 At 02:46 PM 2/5/2006 -0600, Ian Bicking wrote:
 Ian Bicking wrote:
   def render(template_instance, vars, format="html", fragment=False):

 Here I can magically turn this into a WEB templating spec:

 def render(template_instance, vars, format="html", fragment=False,
            wsgi_environ=None, set_header_callback=None)

 wsgi_environ is the environ dictionary if this is being called in a WSGI
 context.  set_header_callback can be called like
 set_header_callback(header_name, header_value) to write such a header to
 the response.  Frameworks may or may not allow for setting headers.  If
 they don't allow for it, they shouldn't provide that callback (thus
 headers will not be mysteriously thrown away -- instead they will be
 rejected immediately).  [Should set_header_callback('Status', '404 Not
 Found') be used, or a separate callback, or...?]

 This follows what all server pages templates I know of do.  That is,
 they do not have special syntax related to any metadata (i.e., headers)
 or even any special syntax related to web requests.  Instead the web
 request is represented through some set of variables available in the
 template.

 Yes, but different template systems offer different APIs based on it; the
 idea of using WSGI here was to make it possible for them to offer their
 *own*, native APIs under this spec, not to force the use of the host
 framework's API.

 The only thing that's missing from your proposal is streaming control or
 large file support.  I'll agree that it's an edge use case, but it seems to
 me just as easy to just offer a plain WSGI interface and not have to
 document a bunch of differences and limitations.  OTOH, if this is what it
 takes to get consensus, so be it.

 The additional advantage to using plain ol' WSGI as the calling interface,
 however, is that it also lets you embed *anything* as a template, including
 whole applications if they provide a template engine whose syntax is
 actually the application's configuration.

 Anyway, the only differences I'm aware of between what you're proposing and
 what I'm proposing are:

 1. Syntax sugar (each proposal sweetens different use cases)
 2. Feature restrictions (yours takes away streaming)
 3. What's optional (you consider WSGI optional, I want strings to be optional)

 It would be better, I think, to address further discussion to addressing
 the actual points of difference.

 Regarding #2, I'm willing to compromise to get consensus.  Regarding #3,
 I'd be willing to compromise by making *both* optional, with clearly
 defined variations of the spec so that plugins and frameworks that support
 each are clearly distinguishable.  This would also mean that we'd both be
 able to get the syntaxes we want under #1.


Re: [Web-SIG] Standardized template API

2006-02-01 Thread Chris McDonough

One specific concern about returning the published object for
publisher-based frameworks is that often the published object has
references to other objects that might not make sense in the context
of the thread handling the rendering of the template.  For example,
if you're using a thread pool behind a Twisted server, and the thing
doing the rendering is in the main thread, methods hanging off of
the published object might try to make use of thread-local storage,
which would fail.  Zope 3 uses thread-local storage for request
objects, IIRC.
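
A tiny demonstration of the failure mode, using nothing but the threading module.  The names request_state and handle_request are invented for the example and are not how Zope 3 or Twisted spell anything.

```python
import threading

request_state = threading.local()
seen_in_worker = []


def handle_request():
    # The worker thread stores per-request data in thread-local storage.
    request_state.user = "alice"
    seen_in_worker.append(request_state.user)


worker = threading.Thread(target=handle_request)
worker.start()
worker.join()

# Back in the main thread, the attribute set by the worker does not exist:
# each thread gets its own independent view of `request_state`.
visible_in_main = hasattr(request_state, "user")
```

So a method on the published object that reads request_state.user would raise AttributeError if the renderer called it from the main thread.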

This might be a nonissue, because I'm a little fuzzy on which  
component(s) actually do(es) the rendering of the template in the  
models being proposed.  But the amount of fuzziness I have about  
what's trying to be specified here makes me wonder if there aren't  
better things to go specify.


 As I mentioned in my counter-proposal, there should probably be a key like
 'wti.source' to contain either the object to be published (for
 publisher-oriented frameworks) or a dictionary of variables (for
 controller-oriented frameworks).  I originally called it published
 object, but that's biased towards publisher frameworks so perhaps a more
 neutral name like 'source' or 'data' would be more appropriate.
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com

Re: [Web-SIG] transaction progress with cgi.FieldStorage

2005-12-30 Thread Chris McDonough

 An aside on cgi.FieldStorage itself. It reads data using readline
 instead of reading in blocks of limited size. doing this I think means
 a file with very long lines, 20MB, 100MB, ... could cause excessive
 memory consumption.

This was reported and solved a long time ago (but not yet fixed in  
any Python distro):

https://sourceforge.net/tracker/?func=detail&aid=1112549&group_id=5470&atid=105470


Re: [Web-SIG] WSGI deployment use case

2005-07-26 Thread Chris McDonough

On Tue, 2005-07-26 at 01:18 -0500, Ian Bicking wrote:
 Well, the stack is really just an example, meant to be more realistic 
 than sample1 and sample2.  I actually think it's a very reasonable 
 example, but that's not really the point.  Presuming this stack, how 
 would you configure it?

I typically roll out software to clients using a build mechanism (I
happen to use pymake at http://www.plope.com/software/pymake/ but
anything dependency-based works).

I write generic build scripts for all of the software components.  For
example, I might write makefiles that check out and build python,
openldap, mysql and so on (each into a non-system location).  I leave
a bit of room for customization in their build definitions that I can
override from within a profile.  A profile is a set of customized
software builds for a specific purpose.

I might have, maybe, 3 different profiles for each customer where the
profile usually works out to be tied to machine function (load balancer,
app server, database server).  I maintain these build scripts and the
profiles in CVS for each customer.  I never install anything by hand; I
always change the buildout and rerun it if I need to get something set
up.

This usually works out pretty well because to roll out a new major
version of software, I just rerun the build scripts for a particular
profile and move the data over.  Usually the only thing that needs to
change frequently are a few bits of software that are checked out of
version control, so doing cvs up on those bits typically gets me where
I need to be unless it's a major revision.

So in this case, I'd likely write a build that either built Apache from
source or at least created an httpd-includes file meant to be
referenced from within the system Apache config file with the proper
stuff in it given the profile's purpose.  The build would also download
and install Python, it would get the proper eggs and/or Python
software and the database, and so forth.  All the configuration would be
done via the profile which is in version control.

I don't know if this kind of thing works for everybody, but it has
worked well for me so far.  I do this all the time, and I have a good
library of buildout scripts already so it's less painful for me than it
might be for someone who is starting from scratch.  That said, it is
time-consuming and imperfect... upgrades are the most painful.  New
installs are simple, though.

So, anyway, the short answer is I write a script to do the config for
me so I can repeat it on demand.
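
To give a flavor of what "a script to do the config" means, here is a toy sketch that renders an Apache include file from a profile.  It is not pymake and not my actual build scripts; the profile name, the values and the render_httpd_include helper are all invented for the example.

```python
from string import Template

# Template for the generated include file; the directives are ordinary
# Apache ones, the values come from the profile.
HTTPD_INCLUDE = Template(
    "Listen $port\n"
    "ServerName $server_name\n"
    'DocumentRoot "$document_root"\n'
)

# Hypothetical per-customer profiles, normally kept in version control.
PROFILES = {
    "appserver": {
        "port": 8080,
        "server_name": "app.example.com",
        "document_root": "/opt/example/htdocs",
    },
}


def render_httpd_include(profile_name):
    """Return the text of the include file for the named profile."""
    return HTTPD_INCLUDE.substitute(PROFILES[profile_name])


include_text = render_httpd_include("appserver")
```

Because the output depends only on the profile, rerunning the script after a change gives the same file on every machine built from that profile.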

- C



Re: [Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config

2005-07-25 Thread Chris McDonough

Thanks...

I'm still confused about high level requirements so please try to be
patient with me as I try to get back on track.

These are the requirements as I understand them:

1.  We want to be able to distribute WSGI applications and middleware
(presumably in a format supported by setuptools).

2.  We want to be able to configure a WSGI application in order
to create an application instance.

3.  We want a way to combine configured instances of those
applications into pipelines and start an instance of a pipeline.

Are these requirements the ones being discussed?  If so, which of the
config file formats we've been discussing matches which requirement?

Thanks,

- C

On Sun, 2005-07-24 at 22:24 -0400, Phillip J. Eby wrote:
 At 08:35 PM 7/24/2005 -0400, Chris McDonough wrote:
 Sorry, I think I may have lost track of where we were going wrt the
 deployment spec.  Specifically, I don't know how we got to using eggs
 (which I'd really like to, BTW, they're awesome conceptually!) from
 where we were in the discussion about configuring a WSGI pipeline.  What
 is a feature?  What is an import map? Entry point?  Should I just
 get more familiar with eggs to understand what's being discussed here or
 did I miss a few posts?
 
 I suggest this post as the shortest architectural introduction to the whole 
 egg thang:
 
  http://mail.python.org/pipermail/distutils-sig/2005-June/004652.html
 
 It explains pretty much all of the terminology I'm currently using, except 
 for the new terms invented today...
 
 Entry points are a new concept, invented today by Ian and myself.  Ian 
 proposed having a mapping file (which I dubbed an import map) included in 
 an egg's metadata, and then referring to named entries from a pipeline 
 descriptor, so that you don't have to know or care about the exact name to 
 import.  The application or middleware factory name would be looked up in 
 the egg's import map in order to find the actual factory object.
 
 I took Ian's proposal and did two things:
 
 1) Generalized the idea to a concept of entry points.  An entry point is 
 a name that corresponds to an import specification, and an optional list of 
 extras (see terminology link above) that the entry point may 
 require.  Entry point names exist in a namespace called an entry point 
 group, and I implied that the WSGI deployment spec would define two such 
 groups: wsgi.applications and wsgi.middleware, but a vast number of other 
 possibilities for entry points and groups exist.  In fact, I went ahead and 
 implemented them in setuptools today, and realized I could use them to 
 register setup commands with setuptools, making it extensible by any 
 project that registers entry points in a 'distutils.commands' group.
 
 2) I then proposed that we extend our deployment descriptor (.wsgi file) 
 syntax so that you can do things like:
 
  [foo from SomeProject]
  # configuration here
 
 What this does is tell the WSGI deployment API to look up the foo entry 
 point in either the wsgi.middleware or wsgi.applications entry point group 
 for the named project, according to whether it's the last item in the .wsgi 
 file.  It then invokes the factory as before, with the configuration values 
 as keyword arguments.
 
 This proposal is of course an *extension*; it should still be possible to 
 use regular dotted names as section headings, if you haven't yet drunk the 
 setuptools kool-aid.  But, it makes for interesting possibilities because 
 we could now have a tool that reads a WSGI deployment descriptor and runs 
 easy_install to find and download the right projects.  So, you could 
 potentially just write up a descriptor that lists what you want and the 
 server could install it, although I think I personally would want to run a 
 tool explicitly; maybe I'll eventually add a --wsgi=FILENAME option to 
 EasyInstall that would tell it to find out what to install from a WSGI 
 deployment descriptor.
 
 That would actually be pretty cool, when you realize it means that all you 
 have to do to get an app deployed across a bunch of web servers is to copy 
 the deployment descriptor and tell 'em to install stuff.  You can always 
 create an NFS-mounted cache directory where you put pre-built eggs, and 
 EasyInstall would just fetch and extract them in that case.
 
 Whew.  Almost makes me wish I was back in my web apps shop, where this kind 
 of thing would've been *really* useful to have.
 


Re: [Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config

2005-07-25 Thread Chris McDonough

Actually, let me give this a shot.

We package up an egg called helloworld.egg.  It happens to contain
something that can be used as a WSGI component.  Let's say it's a WSGI
application that always returns 'Hello World'.  And let's say it also
contains middleware that lowercases anything that passes through before
it's returned.

The implementations of these components could be as follows:

class HelloWorld:
    def __init__(self, app, **kw):
        pass # nothing to configure

    def __call__(self, environ, start_response):
        start_response('200 OK', [])
        return ['Hello World']

class Lowercaser:
    def __init__(self, app, **kw):
        self.app = app
        # nothing else to configure

    def __call__(self, environ, start_response):
        for chunk in self.app(environ, start_response):
            yield chunk.lower()

An import map would ship inside of the egg-info dir:

[wsgi.app_factories]
helloworld = helloworld:HelloWorld
lowercaser = helloworld:Lowercaser

So we install the egg and this does nothing except allow it to be used
from within Python.
  
But when we create a deployment descriptor like so in a text editor:

[helloworld from helloworld]

[lowercaser from helloworld]

... and run some starter script that parses that as a pipeline,
creates the two instances, wires them together, and we get a running
pipeline?

Am I on track?
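
To check my own understanding, here is roughly what I imagine the starter script doing.  Everything in it is an assumption: IMPORT_MAP is a plain dictionary standing in for the egg's import map (no setuptools involved), build_pipeline is a made-up name, and I've taken the first section to be the application with each later section wrapping what came before, which may not be the order the spec settles on.

```python
import re


class HelloWorld:
    def __init__(self, app=None, **kw):
        pass  # an application has nothing to wrap and nothing to configure

    def __call__(self, environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"Hello World"]


class Lowercaser:
    def __init__(self, app, **kw):
        self.app = app

    def __call__(self, environ, start_response):
        for chunk in self.app(environ, start_response):
            yield chunk.lower()


# Stand-in for the import map shipped in the egg: (project, name) -> factory.
IMPORT_MAP = {
    ("helloworld", "helloworld"): HelloWorld,
    ("helloworld", "lowercaser"): Lowercaser,
}

SECTION = re.compile(r"^\[(\w+) from (\w+)\]$")


def build_pipeline(descriptor):
    """Instantiate each '[name from project]' section, wrapping the previous one."""
    app = None
    for line in descriptor.splitlines():
        match = SECTION.match(line.strip())
        if match is None:
            continue  # blank lines and configuration values are ignored here
        name, project = match.groups()
        factory = IMPORT_MAP[(project, name)]
        app = factory(app)
    return app


DESCRIPTOR = """
[helloworld from helloworld]

[lowercaser from helloworld]
"""

pipeline = build_pipeline(DESCRIPTOR)
body = b"".join(pipeline({}, lambda status, headers: None))
```

A real version would look the factories up through the egg metadata and pass each section's configuration values as keyword arguments.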

OK, back to Battlestar Galactica ;-)



On Mon, 2005-07-25 at 02:40 -0400, Chris McDonough wrote:
 BTW, a simple example that includes proposed solutions for all of these
 requirements would go a long way towards helping me (and maybe others)
 understand how all the pieces fit together.  Maybe something like:
 
 - Define two simple WSGI components:  a WSGI middleware and a WSGI
   application.
 
 - Describe how to package each as an independent egg.
 
 - Describe how to configure an instance of the application.
 
 - Describe how to configure an instance of the middleware
 
 - Describe how to string them together into a pipeline.
 
 - C
 
 
 On Mon, 2005-07-25 at 02:33 -0400, Chris McDonough wrote:
  Thanks...
  
  I'm still confused about high level requirements so please try to be
  patient with me as I try to get back on track.
  
  These are the requirements as I understand them:
  
  1.  We want to be able to distribute WSGI applications and middleware
  (presumably in a format supported by setuptools).
  
  3.  We want to be able to configure a WSGI application in order
  to create an application instance.
  
  2.  We want a way to combine configured instances of those
  applications into pipelines and start an instance of a pipeline.
  
  Are these requirements the ones being discussed?  If so, which of the
  config file formats we've been discussing matches which requirement?
  
  Thanks,
  
  - C
  
  On Sun, 2005-07-24 at 22:24 -0400, Phillip J. Eby wrote:
   At 08:35 PM 7/24/2005 -0400, Chris McDonough wrote:
   Sorry, I think I may have lost track of where we were going wrt the
   deployment spec.  Specifically, I don't know how we got to using eggs
   (which I'd really like to, BTW, they're awesome conceptually!) from
   where we were in the discussion about configuring a WSGI pipeline.  What
   is a feature?  What is an import map? Entry point?  Should I just
   get more familiar with eggs to understand what's being discussed here or
   did I miss a few posts?
   
   I suggest this post as the shortest architectural introduction to the
   whole egg thang:
   
http://mail.python.org/pipermail/distutils-sig/2005-June/004652.html
   
   It explains pretty much all of the terminology I'm currently using,
   except for the new terms invented today...
   
   Entry points are a new concept, invented today by Ian and myself.  Ian
   proposed having a mapping file (which I dubbed an import map) included
   in an egg's metadata, and then referring to named entries from a
   pipeline descriptor, so that you don't have to know or care about the
   exact name to import.  The application or middleware factory name would
   be looked up in the egg's import map in order to find the actual
   factory object.
   
   I took Ian's proposal and did two things:
   
   1) Generalized the idea to a concept of entry points.  An entry point
   is a name that corresponds to an import specification, and an optional
   list of extras (see terminology link above) that the entry point may
   require.  Entry point names exist in a namespace called an entry point
   group, and I implied that the WSGI deployment spec would define two
   such groups: wsgi.applications and wsgi.middleware, but a vast number
   of other possibilities for entry points and groups exist.  In fact, I
   went ahead and implemented them in setuptools today, and realized I
   could use them to register setup commands with setuptools, making it
   extensible by any project that registers entry points in a
   'distutils.commands' group.
   
   2

Re: [Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config

2005-07-25 Thread Chris McDonough

Great.  Given that, I've created the beginnings of a more formal
specification:

WSGI Deployment Specification
-----------------------------

  I use the term WSGI component in here as shorthand to indicate all
  types of WSGI implementations (application, middleware).

  The primary deployment concern is to create a way to specify the
  configuration of an instance of a WSGI component within a
  declarative configuration file.  A secondary deployment concern is
  to create a way to wire up components together into a specific
  deployable pipeline.

Pipeline Descriptors
--------------------

  Pipeline descriptors are file representations of a particular WSGI
  pipeline.  They include enough information to configure,
  instantiate, and wire together WSGI apps and middleware components
  into one pipeline for use by a WSGI server.  Installation of the
  software which composes those components is handled separately.

  In order to define a pipeline, we use a .ini-format configuration
  file conventionally named 'something.wsgi'.  This file may
  optionally be marked as executable and associated with a simple UNIX
  interpreter via a leading hash-bang line to allow servers which
  employ stdin and stdout streams (ala CGI) to run the pipeline
  directly without any intermediation.  For example, a deployment
  descriptor named 'myapplication.wsgi' might be composed of the
  following text::

    #!/usr/bin/runwsgi

    [mypackage.mymodule.factory1]
    quux = arbitraryvalue
    eekx = arbitraryvalue

    [mypackage.mymodule.factory2]
    foo = arbitraryvalue
    bar = arbitraryvalue

  Section names are Python-dotted-path names (or setuptools entry
  point names described in a later section) which represent
  factories.  Key-value pairs within a given section are used as
  keyword arguments to the factory that can be used as configuration
  for the component being instantiated.

  All sections in the deployment descriptor describe 'middleware'
  except for the last section, which must describe an application.

  Factories which construct middleware must return something which is
  a WSGI callable by implementing the following API::

     def factory(next_app, [**kw]):
         """ next_app is the next application in the WSGI pipeline,
         **kw is optional, and accepts the key-value pairs
         that are used in the section as a dictionary, used
         for configuration """

  Factories which construct applications must return something which is
  a WSGI callable by implementing the following API::

     def factory([**kw]):
         """ **kw is optional, and accepts the key-value pairs
         that are used in the section as a dictionary, used
         for configuration """
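
  As a rough illustration of the two factory shapes above, here is a
  minimal Python 3 sketch.  The names lowercaser_factory and
  hello_factory, and the 'greeting' option, are invented for this example
  and are not part of any spec.

```python
def lowercaser_factory(next_app, **kw):
    """Middleware factory: wraps next_app and lowercases its output."""
    def middleware(environ, start_response):
        for chunk in next_app(environ, start_response):
            yield chunk.lower()
    return middleware


def hello_factory(**kw):
    """Application factory: builds the end-point app from its options."""
    greeting = kw.get('greeting', 'Hello World')   # hypothetical option

    def application(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [greeting.encode('utf-8')]
    return application


# Wire the two together by hand, innermost (application) first.
pipeline = lowercaser_factory(hello_factory(greeting='Hello WSGI'))
```

  The only difference between the two signatures is the leading next_app
  argument, which is what lets a configurator tell middleware from the
  end-point application purely by position in the file.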

  A deployment descriptor can also be parsed from within Python.  An
  importable configurator which resides in 'wsgiref' exposes a
  function, 'parse_deployment', that accepts a single argument, the
  path to the descriptor::

 from wsgiref.runwsgi import parse_deployment
 appchain = parse_deployment('myapplication.wsgi')

  'appchain' will be an object representing the fully configured
  pipeline.  'parse_deployment' is guaranteed to return something
  that implements the WSGI callable API described in PEP 333.
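
  To make the parsing step concrete, here is one way such a function
  might be sketched in Python 3.  This is an assumption-laden
  illustration, not wsgiref API: parse_deployment_text and resolve_dotted
  are hypothetical names, the sketch takes the descriptor text rather
  than a filename, and it handles only dotted-path section names (no
  entry points).

```python
import configparser
import importlib


def resolve_dotted(name):
    """Import 'package.module.attr' and return the attribute."""
    module_name, _, attr = name.rpartition('.')
    return getattr(importlib.import_module(module_name), attr)


def parse_deployment_text(text, resolve=resolve_dotted):
    """Build a WSGI callable from descriptor text (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)          # the '#!' line is skipped as a comment
    sections = parser.sections()      # kept in file order
    if not sections:
        raise ValueError('descriptor defines no components')

    # The last section is the application; everything before it is
    # middleware, wrapped from the inside out.
    app = resolve(sections[-1])(**dict(parser[sections[-1]]))
    for name in reversed(sections[:-1]):
        app = resolve(name)(app, **dict(parser[name]))
    return app
```

  Passing a different 'resolve' callable is how entry point names could
  be supported later without touching the wiring logic.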

Entry Points
------------
  description of setuptools entry points goes here



On Mon, 2005-07-25 at 10:39 -0400, Phillip J. Eby wrote:
 At 03:02 AM 7/25/2005 -0400, Chris McDonough wrote:
 Actually, let me give this a shot.
 
 We package up an egg called helloworld.egg.  It happens to contain
 something that can be used as a WSGI component.  Let's say it's a WSGI
 application that always returns 'Hello World'.  And let's say it also
 contains middleware that lowercases anything that passes through before
 it's returned.
 
 The implementations of these components could be as follows:
 
 class HelloWorld:
     def __init__(self, app, **kw):
         pass # nothing to configure
 
     def __call__(self, environ, start_response):
         start_response('200 OK', [])
         return ['Hello World']
 
 I'm thinking that an application like this wouldn't take an 'app' 
 constructor parameter, and if it takes no configuration parameters it 
 doesn't need **kw, but good so far.
 
 
 class Lowercaser:
     def __init__(self, app, **kw):
         self.app = app
         # nothing else to configure
 
     def __call__(self, environ, start_response):
         for chunk in self.app(environ, start_response):
             yield chunk.lower()
 
 Again, no need for **kw if it doesn't take any configuration, but okay.
 
 
 An import map would ship inside of the egg-info dir:
 
 [wsgi.app_factories]
 helloworld = helloworld:HelloWorld
 lowercaser = helloworld:Lowercaser
 
 I'm thinking it would be more like:
 
  [wsgi.middleware]
  lowercaser = helloworld:Lowercaser
 
  [wsgi.apps]
  helloworld = helloworld:HelloWorld
 
 and you'd specify it in the setup script as something like this:
 
  setup(
  #...
  entry_points = {
  'wsgi.apps

Re: [Web-SIG] WSGI deployment use case

2005-07-25 Thread Chris McDonough

On Mon, 2005-07-25 at 20:29 -0500, Ian Bicking wrote:
  We probably need something like a site map configuration, that can 
  handle tree structure, and can specify pipelines on a per location 
  basis, including the ability to specify pipeline components to be 
  applied above everything under a certain URL pattern.  This is more or 
  less the same as my container API concept, but we are a little closer 
  to being able to think about such a thing.
 
 It could also be something based on general matching rules, with some 
 notion of precedence and how the rule affects SCRIPT_NAME/PATH_INFO.  Or 
 something like that.
How much of this could be solved by using a web server's
directory/alias-mapping facility?

For instance, if you needed a single Apache webserver to support
multiple pipelines based on URL mapping, wouldn't it be possible in many
cases to compose that out of things like rewrite rules and script
aliases (the below assumes running them just as CGI scripts, obviously
it would be different with something using mod_python or what-have-you):

<VirtualHost *:80>
 ServerAdmin [EMAIL PROTECTED]
 ServerName plope.com
 ServerAlias plope.com
 ScriptAlias /viewcvs /home/chrism/viewcvs.wsgi
 ScriptAlias /blog /home/chrism/blog.wsgi
 RewriteEngine On
 RewriteRule ^/[^/]viewcvs*$ /home/chrism/viewcvs.wsgi [PT]
 RewriteRule ^/[^/]blog*$ /home/chrism/blog.wsgi [PT]
</VirtualHost>

Obviously it would mean some repetition in wsgi files if you needed to
repeat parts of a pipeline for each URL mapping.  But it does mean we
wouldn't need to invent more software.
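
For what it's worth, the CGI side of this needs very little code.  The
sketch below (Python 3) shows roughly what a runner might do for one
request, using wsgiref's BaseCGIHandler.  blog_app and run_as_cgi are
made-up names, and the request is simulated with in-memory streams; a
real script would hand over os.environ and the process's binary
stdin/stdout instead.

```python
import io
from wsgiref.handlers import BaseCGIHandler
from wsgiref.util import setup_testing_defaults


def blog_app(environ, start_response):
    """Stand-in for whatever pipeline blog.wsgi would describe."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'blog at ' + environ['SCRIPT_NAME'].encode('latin-1')]


def run_as_cgi(app, environ, stdin, stdout, stderr):
    """Serve a single request CGI-style over the given streams."""
    BaseCGIHandler(stdin, stdout, stderr, environ).run(app)


# Simulate the request the web server would pass along for /blog.
environ = {'SCRIPT_NAME': '/blog', 'PATH_INFO': '/'}
setup_testing_defaults(environ)     # fills in the remaining CGI/WSGI keys
out = io.BytesIO()
run_as_cgi(blog_app, environ, io.BytesIO(), out, io.StringIO())
response = out.getvalue()           # CGI-style "Status:" line, headers, body
```

The URL-to-pipeline mapping stays entirely in the web server's config;
the script only ever sees the SCRIPT_NAME it was mounted at.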


 
  Of course, I still think it's something that can be added *after* having 
  a basic deployment spec.
 
 I feel a very strong need that this be resolved before settling on 
 anything deployment related.  Not necessarily as a standard, but 
 possibly as a set of practices.  Even a realistic and concrete use case 
 might be enough.


I *think* more complicated use cases may revolve around attempting to
use middleware as services that dynamize the pipeline instead of as
oblivious things.  I don't think there's anything really wrong with
that but I also don't think it can ever be specified with as much
clarity as what we've already got because IMHO it's a programming task.

I'm repeating myself, I'm sure, but I'm more apt to put a service
manager piece of middleware in the pipeline (or maybe just implement it
as a library) which would allow my endpoint app to use it to do
sessioning and auth and whatnot.  I realize that is essentially
building a framework (which is reviled lately) but since the endpoint
app needs to collaborate anyway, I don't see a better way to do it
except to rely completely on convention for service lookup (which is
what you seem to be struggling with in the later bits of your post).

- C





Re: [Web-SIG] Standardized configuration

2005-07-23 Thread Chris McDonough

On Fri, 2005-07-22 at 17:26 -0500, Ian Bicking wrote:

To do this, we use a ConfigParser-format config file named
'myapplication.conf' that looks like this::
  
  [application:sample1]
  config = sample1.conf
  factory = wsgiconfig.tests.sample_components.factory1
  
  [application:sample2]
  config = sample2.conf
  factory = wsgiconfig.tests.sample_components.factory2
  
  [pipeline]
  apps = sample1 sample2
 
 I think it's confusing to call both these applications.  I think 
 middleware or filter would be better.  I think people understand 
 filter far better, so I'm inclined to use that.  So...

The reason I called them applications instead of filters is because all
of them implement the WSGI application API (they all implement a
callable that accepts two parameters, environ and start_response).
Some happen to be gateways/filters/middleware/whatever but at least one
is just an application and does no delegation.  In my example above,
sample2 is not a filter, it is the end-point application.  sample1
is a filter, but it's of course also an application too.

Would you maybe rather make it more explicit that some apps are also
gateways, e.g.:

[application:bleeb]
config = bleeb.conf
factory = bleeb.factory

[filter:blaz]
config = blaz.conf
factory = blaz.factory

?  I don't know that there's any way we could make use of the
distinction between the two types in the configurator other than
disallowing people to place an application before a filter in a
pipeline through validation.  Is there something else you had in mind?
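
The validation I mean would be about this small.  A sketch in Python 3,
where validate_pipeline and the 'kinds' mapping (component name to
section prefix) are hypothetical:

```python
def validate_pipeline(names, kinds):
    """Check that filters come first and one application ends the chain.

    names: component names in pipeline order.
    kinds: maps each name to 'filter' or 'application' (assumed to be
           taken from the section prefix in the config file).
    """
    if not names:
        raise ValueError('no components defined in pipeline')
    for name in names[:-1]:
        if kinds[name] != 'filter':
            raise ValueError('%s is not a filter but is followed by other '
                             'components' % name)
    if kinds[names[-1]] != 'application':
        raise ValueError('pipeline must end with an application, not %s'
                         % names[-1])
    return names
```

That catches misordered pipelines at configuration time, but it is the
only extra thing the distinction buys.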

 [application:sample2]
 # What is this relative to?  I hate both absolute paths and
 # paths relative to pwd equally...
 config = sample1.conf
 factory = wsgiconfig...

This was from a doctest I wrote so I could rely on relative paths,
sorry.  You're right.  Um... we could probably use the
environment as defaults for ConfigParser interpolation and set whatever
we need before the configurator is run:

$ export APP_ROOT=/home/chrism/myapplication
$ ./wsgi-configurator.py myapplication.conf

And in myapplication.conf:

[application:sample1]
config = %(APP_ROOT)s/sample1.conf
factory = myapp.sample1.factory

That would probably be the least-effort and most flexible thing to do
and doesn't mandate any particular directory structure.  Of course, we
could provide a convention for a recommended directory structure, but
this gives us an out from being painted into that in specific cases.
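
A small Python 3 sketch of the interpolation idea, using the standard
configparser module.  load_with_defaults is a hypothetical helper, and
the defaults dict is hard-coded here where a real run would build it
from os.environ:

```python
import configparser

CONF_TEXT = """
[application:sample1]
config = %(APP_ROOT)s/sample1.conf
factory = myapp.sample1.factory
"""


def load_with_defaults(text, defaults):
    """Parse config text, making `defaults` available to %(NAME)s lookups."""
    parser = configparser.ConfigParser(defaults=defaults)
    parser.read_string(text)
    return parser


# In a real run the defaults would come from the environment, for example
# {'APP_ROOT': os.environ['APP_ROOT']}.
parser = load_with_defaults(CONF_TEXT,
                            {'APP_ROOT': '/home/chrism/myapplication'})
config_path = parser.get('application:sample1', 'config')
```

Passing only the variables you expect, rather than the whole
environment, is probably wise: defaults show up as options in every
section, and a stray '%' in an unrelated variable would trip the
interpolation if it were ever referenced.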

 [pipeline]
 # The app is unique and special...?
 app = sample2
 filters = sample1
 
 
 
 Well, that's just a first refactoring; I'm having other inclinations...

I'm not sure whether this is just a stylistic thing or if there's a
reason you want to treat the endpoint app specially.  By definition, in
my implementation, the endpoint app is just the last app mentioned in
the pipeline.

  Potential points of contention
  
   - The WSGI configurator assumes that you are willing to write WSGI
 component factories which accept a filename as a config file.  This
 factory returns *another* factory (typically a class) that accepts
 the next application in the pipeline chain and returns a WSGI
 application instance.  This pattern is necessary to support
 argument currying across a declaratively configured pipeline,
 because the WSGI spec doesn't allow for it.  This is more contract
 than currently exists in the WSGI specification but it would be
 trivial to change existing WSGI components to adapt to this
 pattern.  Or we could adopt a pattern/convention that removed one
 of the factories, passing both the next application and the
 config file into a single factory function.  Whatever.  In any
 case, in order to do declarative pipeline configuration, some
 convention will need to be adopted.  The convention I'm advocating
 above seems to already have been adopted for the current crop of middleware
 components (using a factory which accepts the application as the
 first argument).
 
 I hate the proliferation of configuration files this implies.  I 
 consider the filters an implementation detail; if they each have 
 partitioned configuration then they become a highly exposed piece of the 
 architecture.
 
 It's also a lot of management overhead.  Typical middleware takes 0-5 
 configuration parameters.  For instance, paste.profilemiddleware is 
 perfectly usable with no configuration at all, and only has two parameters.

True.  The config file param should be optional.  Apps might use the
environment to configure themselves.
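
Something like the following Python 3 sketch is what I have in mind for
an optional config file.  sample_factory, the SAMPLE_GREETING variable
and the [sample] section are all invented for illustration:

```python
import configparser
import os


def sample_factory(config_file=None, env=None):
    """Stage one: collect settings, then return a factory for stage two."""
    env = os.environ if env is None else env
    settings = {'greeting': env.get('SAMPLE_GREETING', 'hello')}
    if config_file is not None:
        parser = configparser.ConfigParser()
        parser.read(config_file)
        if parser.has_section('sample'):
            settings.update(parser['sample'])   # file overrides environment

    def app_factory(next_app):
        """Stage two: receive the next app (None for the end point)."""
        def app(environ, start_response):
            environ['sample.greeting'] = settings['greeting']
            if next_app is None:
                start_response('200 OK', [('Content-Type', 'text/plain')])
                return [settings['greeting'].encode('utf-8')]
            return next_app(environ, start_response)
        return app
    return app_factory
```

The two-stage shape is unchanged; the first stage simply tolerates being
called with no filename.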

 But this is reasonably easy to resolve -- there's a perfectly good 
 configuration section sitting there, waiting to be used:
 
[filter:profile]
factory = paste.profilemiddleware.ProfileMiddleware
# Show top 50 functions:
limit = 50
 
 This in no way precludes 'config', which is just a special case of this 
 general configuration.  The only real problem is a possible conflict if 
 we

Re: [Web-SIG] Standardized configuration

2005-07-22 Thread Chris McDonough

I've had a stab at creating a simple WSGI deployment implementation.
I use the term WSGI component in here as shorthand to indicate all
types of WSGI implementations (server, application, gateway).

The primary deployment concern is to create a way to specify the
configuration of an instance of a WSGI component, preferably within a
declarative configuration file.  A secondary deployment concern is to
create a way to wire up components together into a specific
deployable pipeline.  

A strawman implementation solves both issues via the
configurator, which would be presumed to live in wsgiref.  Currently
it lives in a package named wsgiconfig on my laptop.  This module
follows.

""" Configurator for establishing a WSGI pipeline """

from ConfigParser import ConfigParser
import types

def configure(path):
    config = ConfigParser()
    if isinstance(path, types.StringTypes):
        config.readfp(open(path))
    else:
        config.readfp(path)

    appsections = []

    for name in config.sections():
        if name.startswith('application:'):
            appsections.append(name)
        elif name == 'pipeline':
            pass
        else:
            raise ValueError('%s is not a valid section name' % name)

    app_defs = {}

    for appsection in appsections:
        app_name = appsection.split('application:')[1]
        # config.get() raises NoOptionError rather than returning None,
        # so test for each required option explicitly
        if not config.has_option(appsection, 'config'):
            raise ValueError('application section %s requires a config '
                             'option' % appsection)
        if not config.has_option(appsection, 'factory'):
            raise ValueError('application %s requires a factory'
                             ' option' % app_name)
        app_defs[app_name] = {'config': config.get(appsection, 'config'),
                              'factory': config.get(appsection, 'factory')}

    if not config.has_section('pipeline'):
        raise ValueError('must have a pipeline section in config')

    if not config.has_option('pipeline', 'apps'):
        raise ValueError('must have an apps definition in the '
                         'pipeline section')

    pipeline_str = config.get('pipeline', 'apps')
    pipeline_def = pipeline_str.split()

    next = None

    # build from the end of the pipeline backwards, so each component
    # is handed the already-constructed app that follows it
    while pipeline_def:
        app_name = pipeline_def.pop()
        app_def = app_defs.get(app_name)
        if app_def is None:
            raise ValueError('appname %s is defined in pipeline '
                             '%s but no application is defined '
                             'with that name' % (app_name, pipeline_str))
        factory_name = app_def['factory']
        factory = import_by_name(factory_name)
        config_file = app_def['config']
        app_factory = factory(config_file)
        app = app_factory(next)
        next = app

    if not next:
        raise ValueError('no apps defined in pipeline')
    return next

def import_by_name(name):
    if '.' not in name:
        raise ValueError('unloadable name: ' + repr(name))
    components = name.split('.')
    start = components[0]
    g = globals()
    package = __import__(start, g, g)
    modulenames = [start]
    for component in components[1:]:
        modulenames.append(component)
        try:
            package = getattr(package, component)
        except AttributeError:
            n = '.'.join(modulenames)
            package = __import__(n, g, g, component)
    return package

  We configure a pipeline based on a config file, which
  creates and chains two sample WSGI applications together.

  To do this, we use a ConfigParser-format config file named
  'myapplication.conf' that looks like this::

[application:sample1]
config = sample1.conf
factory = wsgiconfig.tests.sample_components.factory1

[application:sample2]
config = sample2.conf
factory = wsgiconfig.tests.sample_components.factory2

[pipeline]
apps = sample1 sample2

  The configurator exposes a function, configure, that accepts a single
  argument, the path to the config file (or an open file object).

 from wsgiconfig.configurator import configure
 appchain = configure('myapplication.conf')

  The sample_components module referred to in the
  'myapplication.conf' file application definitions might look like
  this::

  class sample1:
      """ middleware """
      def __init__(self, app):
          self.app = app
      def __call__(self, environ, start_response):
          environ['sample1'] = True
          return self.app(environ, start_response)

  class sample2:
      """ end-point app """
      def __init__(self, app):
          self.app = app

      def __call__(self, environ, start_response):
          environ['sample2'] = True
          return ['return value

Re: [Web-SIG] Standardized configuration

2005-07-19 Thread Chris McDonough

On Mon, 2005-07-18 at 22:49 -0500, Ian Bicking wrote:
 In addition to the examples I gave in response to Graham, I wrote a 
 document on this a while ago: 
 http://pythonpaste.org/docs/url-parsing-with-wsgi.html
 
 The hard part about this is configuration; it's easy to configure a 
 non-branching chain of middleware.  Once it branches the configuration 
 becomes hard (like programming-hard; which isn't *hard*, but it quickly 
 stops feeling like configuration).

Yep.  I think I'm getting it.  For example, I see that Paste's URLParser
seems to *construct* applications if they don't already exist based on
the URL.  And I assume that these applications could themselves be
middleware.  I don't think that is configurable declaratively if you
want to decide which app to use based on arbitrary request parameters.

But if we already had the config for each app instance that URLParser
wanted to consult laying around as files on disk, wouldn't it be just as
easy to construct these app objects eagerly at startup time?  Then your
URLParser could choose an already-configured app based on some sort of
configuration file in the URLParser component itself.  The apps
themselves may be pipelines, too, I realize that, but that is still
configurable without coding.

Maybe there'd be some concern about needing to stop the process in order
to add new applications.  That's a use case I hadn't really considered.
I suspect this could be done with a signal handler, though, which could
tell the URLParser to reload its config file instead of potentially
locating and creating a new application within every request.

This would make URLParser a kind of decision middleware, but it would
choose from a static set of existing applications (or pipelines) for the
lifetime of the process as opposed to constructing them lazily.
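
Roughly, in Python 3, and with every name here being hypothetical
(StaticDispatcher is not Paste's URLParser, just a sketch of the eager
variant I'm describing):

```python
import signal


class StaticDispatcher:
    """Choose among apps that were all constructed at startup."""

    def __init__(self, load_apps):
        self.load_apps = load_apps      # callable returning {prefix: app}
        self.apps = load_apps()         # built eagerly, once

    def reload(self, signum=None, frame=None):
        """Rebuild the table; usable directly as a signal handler."""
        self.apps = self.load_apps()

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '')
        # Longest matching prefix wins.
        for prefix in sorted(self.apps, key=len, reverse=True):
            if path.startswith(prefix):
                return self.apps[prefix](environ, start_response)
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'not found']


def install_reload_handler(dispatcher):
    """Reload on SIGHUP where the platform has it (not on Windows)."""
    if hasattr(signal, 'SIGHUP'):
        signal.signal(signal.SIGHUP, dispatcher.reload)
```

Here load_apps stands in for "read the dispatcher's own config file and
construct every app it names"; no request ever triggers construction.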

  OTOH, I'm not sure that I want my framework to find an app for me.
  I'd like to be able to define pipelines that include my app, but I'd
  typically just want to statically declare it as the end point of a
  pipeline composed of service middleware.  I should look at Paste a
  little more to see if it has the same philosophy or if I'm
  misunderstanding you.
 
 Mostly I wanted to avoid lots of magical incantations for the simple 
 case.  If you are used to Webware, well it has a very straight-forward 
 way of finding your application -- you give it a directory name.  If 
 Quixote or CherryPy, you give it a root object.  Maybe Zope would take a 
 ZEO connection string, and so on.

I think I understand now.

In general, I think I'd rather create instance locations of WSGI
applications (which would essentially consist of a config file on disk
plus any state info required by the app), configure and construct Python
objects out of those instances eagerly at startup time and, if decisions
need to be made about which app to use, just choose between
already-constructed apps in decision middleware that has its own
declarative configuration.

This is mostly because I want the configuration info to live within the
application/middleware instance and have some other starter import
those configurations from application/middleware instance locations on
the filesystem.  The starter would construct required instances as
Python objects, and chain them together arbitrarily based on some other
pipeline configuration file that lives with the starter.  The first
part of that (construct required instances) is described in a post I
made to this list yesterday.

This is probably because I'd like there to be one well-understood way to
declaratively configure pipelines as opposed to each piece of middleware
potentially needing to manage app construction and having its own
configuration to do so.

I don't know if this is reasonable for simpler requirements.  This is
more of a formal deployment spec idea and of course is likely flawed
in some subtle way I don't understand yet.

  I'm pretty sure you're not advocating it, but in case you are, I'm not
  sure it adds as much value as it removes to be able to have a dynamic
  middleware chain whereby new middleware elements can be added on the
  fly to a pipeline after a request has begun.  That is *very* late
  binding to me and it's impossible to configure declaratively.
 
 I'm comfortable with a little of both.  I don't even know *how* I'd stop 
 dynamic middleware.  For instance, one of the methods I added to Wareweb 
 recently allows any servlet to forward to any WSGI application; but from 
 the outside the servlet looks like a normal WSGI application just like 
 before.

It's obviously fine if applications themselves want to do this.  I'm not
sure that it would be possible to create a deployment spec that
canonized *how* to do it because as you mentioned it's not really a
configuration task, it's a programming task.

  I agree!  I'm a bit confused because one of the canonical examples of
  how WSGI middleware is useful seems to be the example of implementing a
  framework-agnostic

Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Chris McDonough

On Sun, 2005-07-17 at 03:16 -0500, Ian Bicking wrote:
 This is what Paste does in configuration, like:
 
 middleware.extend([
  SessionMiddleware, IdentificationMiddleware,
  AuthenticationMiddleware, ChallengeMiddleware])
 
 This kind of middleware takes a single argument, which is the 
 application it will wrap.  In practice, this means all the other 
 parameters go into lazily-read configuration.

I'm finding it hard to imagine a reason to have another kind of
middleware.

Well, actually that's not true.  In noodling about this, I did think it
would be kind of neat in a twisted way to have decision middleware
like:

class DecisionMiddleware:
    def __init__(self, apps):
        self.apps = apps

    def __call__(self, environ, start_response):
        app = self.choose(environ)
        for chunk in app(environ, start_response):
            yield chunk

    def choose(self, environ):
        # some_decision_function is a placeholder for whatever
        # policy picks an app (e.g. URL matching on PATH_INFO)
        return some_decision_function(self.apps, environ)

I can imagine using this pattern as a decision point for a WSGI pipeline
serving multiple application end-points (perhaps based on URL matching
of the PATH_INFO in environ).
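
One possible decision function, sketched in Python 3.  The name
choose_by_first_segment and the dict-of-apps shape are assumptions; the
only real API used is wsgiref.util.shift_path_info, which moves one path
segment from PATH_INFO to SCRIPT_NAME:

```python
from wsgiref.util import shift_path_info


def choose_by_first_segment(apps, environ):
    """Pick an app by the first PATH_INFO segment, or return None.

    apps maps a segment such as 'blog' to a WSGI application.  On a
    match the segment is shifted onto SCRIPT_NAME so the chosen app
    sees itself as mounted at that URL.
    """
    segment = environ.get('PATH_INFO', '').lstrip('/').split('/', 1)[0]
    app = apps.get(segment)
    if app is not None:
        shift_path_info(environ)
    return app
```

The caller still has to decide what to do when nothing matches (a 404
response, or a default end-point application).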

But by and large, most middleware components seem to be just wrappers
for the next application in the chain.  There seem to be two types of
middleware that take a single application object as a parameter to their
constructor.  There is decorator middleware where you want to add
something to the environment for an application to find later and
action middleware that does some rewriting of the body or the response
headers before the response is sent back to the client.  Some of this
kind of middleware does both.
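
To make the two kinds concrete, a minimal Python 3 sketch; the class
names and the environ key are invented for the example:

```python
class AddKey:
    """'Decorator' middleware: annotate environ, then delegate."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        environ['example.seen_by_addkey'] = True   # hypothetical key
        return self.app(environ, start_response)


class Uppercase:
    """'Action' middleware: rewrite the body on its way back out."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        for chunk in self.app(environ, start_response):
            yield chunk.upper()


def report_app(environ, start_response):
    """End-point app that reports whether the decorator ran."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    seen = environ.get('example.seen_by_addkey', False)
    return [b'decorated' if seen else b'plain']


pipeline = Uppercase(AddKey(report_app))
```

Both take only the wrapped application in their constructor, which is
what makes them easy to chain from a flat list.  An action that changed
the body length would also have to fix up Content-Length; uppercasing
ASCII bytes sidesteps that.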

 You can also define a framework (a plugin to Paste), which in addition 
 to finding an app can also add middleware; basically embodying all the 
 middleware that is typical for a framework.

This appears to be what I'm trying to do too, which is why I'm intrigued
by Paste.

OTOH, I'm not sure that I want my framework to find an app for me.
I'd like to be able to define pipelines that include my app, but I'd
typically just want to statically declare it as the end point of a
pipeline composed of service middleware.  I should look at Paste a
little more to see if it has the same philosophy or if I'm
misunderstanding you.

 Paste is really a deployment configuration.  Well, that as well as stuff 
 to deploy.  And two frameworks.  And whatever else I feel a need or 
 desire to throw in there.

Yeah.  FWIW, as someone who has recently taken a brief look at Paste, I
think it would be helpful (at least for newbies) to partition out the
bits of Paste which are meant to be deployment configuration from the
bits that are meant to be deployed.  Zope 2 fell into the same trap
early on, and never recovered.  For example, ZPublisher (nee Bobo) was
always meant to be able to be useful outside of Zope, but in practice it
never happened because nobody could figure out how to disentangle it
from its ever-increasing dependencies on other software only found in a
Zope checkout.  In the end, nobody even remembered what its dependencies
were *supposed* to be.  If you ask ten people, you'd get ten different
answers.

I also think that the rigor of separating out different components helps
to make the software stronger and more easily understood in bite-sized
pieces.  Unfortunately, separating them makes configuration tough, but I
think finding the right way to do that is what we're trying to work out
here.

 Note also that parts of the pipeline are very much late bound.  For 
 instance, the way I implemented Webware (and Wareweb) each servlet is a 
 WSGI application.  So while there's one URLParser application, the 
 application that actually handles the request differs per request.  If 
 you start hanging more complete applications (that might have their own 
 middleware) at different URLs, then this happens more generally.

Well, if you put the decider in middleware itself, all of the
middleware components in each pipeline could still be at least
constructed early.  I'm pretty sure this doesn't really strictly qualify
as early binding but it's not terribly dynamic either.  It also makes
configuration pretty straightforward.  At least I can imagine a
declarative syntax for configuring pipelines this way.

I'm pretty sure you're not advocating it, but in case you are, I'm not
sure it adds as much value as it removes to be able to have a dynamic
middleware chain whereby new middleware elements can be added on the
fly to a pipeline after a request has begun.  That is *very* late
binding to me and it's impossible to configure declaratively.

  But some elements of the pipeline at this level of factoring do need to
  have dependencies on availability and pipeline placement of the other
  elements.  In this example, proper operation of the authentication
  component depends on the availability and pipeline placement of the
  identification component.  Likewise, the identification component may

[Web-SIG] Standardized configuration

2005-07-16 Thread Chris McDonough

I've also been putting a bit of thought into middleware configuration,
although maybe in a different direction.  I'm not too concerned yet
about being able to introspect the configuration of an individual
component.  Maybe that's because I haven't thought about the problem
enough to be concerned about it.  In the meantime, though, I *am*
concerned about being able to configure a middleware pipeline easily
and have it work.

I've been attempting to divine a declarative way to configure a pipeline
of WSGI middleware components.  This is simple enough through code,
except that at least in terms of how I'm attempting to factor my
middleware, some components in the pipeline may have dependencies on
other pipeline components.

For example, it would be useful in some circumstances to create separate
WSGI components for user identification and user authentication.  The
process of identification -- obtaining user credentials from a request
-- and user authentication -- ensuring that the user is who he says he
is by comparing the credentials against a data source -- are really
pretty much distinct operations.  There might also be a challenge
component which forces a login dialog.

In practice, I don't know if this is a truly useful separation of
concerns that needs to be implemented in terms of separate components in
the middleware pipeline (I see that paste.login conflates them); it's
just an example.  But at the very least it would keep each component
simpler if the concerns were factored out into separate pieces.

But in the example I present, the authentication component depends
entirely on the result of the identification component.  It would be
simple enough to glom them together by using a distinct environment key
for the identification component results and have the authentication
component look for that key later in the middleware result chain, but
then it feels like you might as well have written the whole process
within one middleware component because the coupling is pretty strong.
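
A sketch of that glommed-together arrangement in Python 3, to show where
the coupling sits.  The key name 'example.identity', both class names,
and the plain-text password comparison are assumptions made purely for
illustration:

```python
import base64

IDENTITY_KEY = 'example.identity'   # hypothetical shared environ key


class Identifier:
    """Pull credentials out of the request; make no judgement about them."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        header = environ.get('HTTP_AUTHORIZATION', '')
        if header.startswith('Basic '):
            try:
                decoded = base64.b64decode(header[6:]).decode('utf-8')
            except ValueError:          # bad base64 or bad UTF-8
                decoded = ''
            user, sep, password = decoded.partition(':')
            if sep:
                environ[IDENTITY_KEY] = (user, password)
        return self.app(environ, start_response)


class Authenticator:
    """Check the credentials left by Identifier against a data source."""

    def __init__(self, app, passwords):
        self.app = app
        self.passwords = passwords      # e.g. {'chris': 'secret'}

    def __call__(self, environ, start_response):
        identity = environ.get(IDENTITY_KEY)
        if identity is None or self.passwords.get(identity[0]) != identity[1]:
            start_response('401 Unauthorized',
                           [('WWW-Authenticate', 'Basic realm="example"'),
                            ('Content-Type', 'text/plain')])
            return [b'authentication required']
        environ['REMOTE_USER'] = identity[0]
        return self.app(environ, start_response)
```

Authenticator only works if Identifier sits somewhere in front of it in
the pipeline, and nothing in either constructor says so; that ordering
dependency is exactly what a declarative pipeline config would have to
express.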

I have a feeling that adapters fit in here somewhere, but I haven't
really puzzled that out yet.  I'm sure this has been discussed somewhere
in the lifetime of WSGI but I can't find much in this list's archives.

 Lately I've been thinking about the role of Paste and WSGI and
 whatnot. Much of what makes a Paste component Pastey is
 configuration;  otherwise the bits are just independent pieces of
 middleware, WSGI applications, etc.  So, potentially if we can agree
 on configuration, we can start using each other's middleware more
 usefully.

 I think we should avoid questions of configuration file syntax for
 now.  Let's instead simply consider configuration consumers.  A
 standard would consist of:

 * A WSGI environment key (e.g., 'webapp01.config')
 * A standard for what goes in that key (e.g., a dictionary object)
 * A reference implementation of the middleware
 * Maybe a non-WSGI-environment way to access the configuration (like 
 paste.CONFIG, which is a global object that dispatches to per-request 
 configuration objects) -- in practice this is really really useful, as 
 you don't have to pass the configuration object around.

 There's some other things we have to consider, as configuration syntaxes 
 do affect the configuration objects significantly.  So, the standard for 
 what goes in the key has to take into consideration some possible 
 configuration syntaxes.

 The obvious starting place is a dictionary-like object.  I would suggest 
 that the keys should be valid Python identifiers.  Not all syntaxes 
 require this, but some do.  This restriction simply means that 
 configuration consumers should try to consume Python identifiers.

 There's also a question about name conflicts (two consumers that are 
 looking for the same key), and whether nested configuration should be 
 preferred, and in what style.

 Note that the standard we decide on here doesn't have to be the only way 
 the object can be accessed.  For instance, you could make your 
 configuration available through 'myframework.config', and create a 
 compliant wrapper that lives in 'webapp01.config', perhaps even doing 
 different kinds of mapping to fix convention differences.

 There's also a question about what types of objects we can expect in the 
 configuration.  Some input styles (e.g., INI and command line) only 
 produce strings.  I think consumers should treat strings (or maybe a 
 special string subclass) specially, performing conversions as necessary 
 (e.g., 'yes' -> True).

 Thoughts?
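The proposal quoted above could be sketched roughly as follows. Only the 'webapp01.config' key comes from the quoted text; the class name, the asbool helper, and the set of accepted boolean spellings are assumptions made for illustration, not part of any agreed standard.

```python
_TRUE = {'yes', 'true', 'on', '1'}
_FALSE = {'no', 'false', 'off', '0'}


def asbool(value):
    """Convert an INI-style string to a bool; pass other values through bool()."""
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered in _TRUE:
            return True
        if lowered in _FALSE:
            return False
        raise ValueError('not a boolean string: %r' % value)
    return bool(value)


class ConfigMiddleware:
    """Expose a configuration dictionary under 'webapp01.config'."""

    def __init__(self, app, config):
        # Keys are restricted to Python identifiers, as suggested above.
        for key in config:
            if not key.isidentifier():
                raise ValueError('config key is not an identifier: %r' % key)
        self.app = app
        self.config = config

    def __call__(self, environ, start_response):
        # Copy so one request cannot mutate the configuration seen by the next.
        environ['webapp01.config'] = dict(self.config)
        return self.app(environ, start_response)


def config_app(environ, start_response):
    """A consumer: reads one setting and converts the string itself."""
    config = environ['webapp01.config']
    body = b'debug on' if asbool(config.get('debug', 'no')) else b'debug off'
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [body]


wrapped = ConfigMiddleware(config_app, {'debug': 'yes'})
```

Here the consumer performs the string conversion, so the same application behaves identically whether the value came from an INI file or from Python code.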




Re: [Web-SIG] Standardized configuration

2005-07-16 Thread Chris McDonough

On Sat, 2005-07-16 at 23:29 -0500, Ian Bicking wrote:
 There's nothing in WSGI to facilitate introspection.  Sometimes that 
 seems annoying, though I suspect lots of headaches are removed because 
 of it, and I haven't found it to be a stopper yet.  The issue I'm 
 interested in is just how to deliver configuration to middleware.

Whew, I hoped you'd respond. ;-)

It appears that I haven't gotten as far as to want introspection into
the implementation or configuration of a middleware component.  Instead,
I want the ability to declaratively construct a pipeline out of largely
opaque and potentially interdependent (but loosely coupled) WSGI
middleware components, which is another problem entirely.  It seemed
cogent, so I just somewhat belligerently coopted this thread, sorry!

 Because middleware can't be introspected (generally), this makes things 
 like configuration schemas very hard to implement.  It all needs to be 
 late-bound.

The pipeline itself isn't really late bound.  For instance, if I was to
create a WSGI middleware pipeline something like this:

   server -- session -- identification -- authentication -- 
   -- challenge -- application

... session, identification, authentication, and challenge are
middleware components (you'll need to imagine their implementations).
And within a module that started a server, you might end up doing
something like:

def configure_pipeline(app):
    # The outermost component sees the request first.
    return SessionMiddleware(
        IdentificationMiddleware(
            AuthenticationMiddleware(
                ChallengeMiddleware(app))))

if __name__ == '__main__':
    app = Application()
    pipeline = configure_pipeline(app)
    server = Server(pipeline)
    server.serve()

The pipeline is static.  When a request comes in, the pipeline itself is
already constructed.  I don't really want a way to prevent improper
pipeline construction at startup time (right now anyway), because
failures due to missing dependencies will be fairly obvious.
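As a rough sketch of what a more declarative form of configure_pipeline might look like, the nested calls can be driven by an ordered list of factories. The build_pipeline and tracer names are invented for this example, and the tracer factories are stand-ins for the real session, identification, authentication, and challenge components.

```python
from functools import reduce


def build_pipeline(app, factories):
    """Wrap app so that the first factory in the list ends up outermost."""
    return reduce(lambda inner, factory: factory(inner),
                  reversed(factories), app)


def tracer(name):
    """Return a middleware factory that records its name on the way in."""
    def factory(app):
        def middleware(environ, start_response):
            environ.setdefault('example.trace', []).append(name)
            return app(environ, start_response)
        return middleware
    return factory


def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [' -> '.join(environ['example.trace']).encode('ascii')]


# The list reads in the same order as the pipeline diagram.
pipeline = build_pipeline(application, [
    tracer('session'),
    tracer('identification'),
    tracer('authentication'),
    tracer('challenge'),
])
```

The pipeline is still built once at startup; only the way its order is written down changes.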

But some elements of the pipeline at this level of factoring do need to
have dependencies on availability and pipeline placement of the other
elements.  In this example, proper operation of the authentication
component depends on the availability and pipeline placement of the
identification component.  Likewise, the identification component may
depend on values that need to be retrieved from the session component.

I've just seen Phillip's post where he implies that this kind of
fine-grained component factoring wasn't really the initial purpose of
WSGI middleware.  That's kind of a bummer. ;-)

Factoring middleware components in this way seems to provide clear
demarcation points for reuse and maintenance.  For example, I imagined a
declarative security module that might be factored as a piece of
middleware here:  http://www.plope.com/Members/chrism/decsec_proposal .

Of course, this sort of thing doesn't *need* to be middleware.  But
making it middleware feels very right to me in terms of being able to
deglom nice features inspired by Zope and other frameworks into pieces
that are easy to recombine as necessary.  Implementation as WSGI
middleware seems a nice way to move these kinds of features out of our
respective applications and into more application-agnostic pieces that
are very loosely coupled, but perhaps I'm taking it too far.

  For example, it would be useful in some circumstances to create separate
  WSGI components for user identification and user authorization.  The
  process of identification -- obtaining user credentials from a request
  -- and user authorization  -- ensuring that the user is who he says he
  is by comparing the credentials against a data source -- are really
  pretty much distinct operations.  There might also be a challenge
  component which forces a login dialog.
 
 I've always thought that a 401 response is a good way of indicating 
 that, but not everyone agrees.  (The idea being that the middleware 
 catches the 401 and possibly translates it into a redirect or something.)

Yep.  That'd be a fine signaling mechanism.
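A rough sketch of that signaling mechanism, assuming the challenge component simply redirects to a login page. The '/login' URL is hypothetical, and buffering the whole response body is a simplification that a real component would probably want to avoid.

```python
class ChallengeMiddleware:
    """Turn a 401 from the wrapped application into a redirect."""

    def __init__(self, app, login_url='/login'):
        self.app = app
        self.login_url = login_url  # hypothetical login page location

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold the status back until we know whether to replace it.
            captured['status'] = status
            captured['headers'] = headers
            return lambda data: None  # legacy write() callable, ignored here

        result = self.app(environ, capture)
        try:
            # Consume the body so that lazy applications have definitely
            # called start_response by the time we inspect the status.
            body = list(result)
        finally:
            if hasattr(result, 'close'):
                result.close()

        if captured['status'].startswith('401'):
            start_response('302 Found', [('Location', self.login_url),
                                         ('Content-Length', '0')])
            return [b'']
        start_response(captured['status'], captured['headers'])
        return body
```

The wrapped application only has to answer 401; whether that becomes a redirect, a login form, or a Basic auth prompt is decided by whichever challenge component is placed in front of it.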

  In practice, I don't know if this is a truly useful separation of
  concerns that need to be implemented in terms of separate components in
  the middleware pipeline (I see that paste.login conflates them), it's
  just an example.  
 
 Do you mean identification and authentication (you mention authorization 
 above)? 

Aggh.  Yes, I meant to write authentication, sorry.

  I think authorization is different, and is conflated in 
 paste.login, but I don't have many use cases where it's a useful
 distinction.  I guess there's a number of ways of getting a username and
 password; and to some degree the authenticator object works at that
 level of abstraction.  And there's a couple other ways of authenticating 
 a user as well (public keys, IP address, etc).  I've generally used a 
 user manager object for this kind of abstraction, with subclassing for 
 different kinds of generality (e.g., the basic abstract
