Re: [Web-SIG] Communicating authenticated user information

2006-01-25 Thread Clark C. Evans
Uncle! Uncle!

On Wed, Jan 25, 2006 at 12:17:29PM -0500, Phillip J. Eby wrote:
| If each middleware or application does this:
| 
| remote_user = environ.setdefault('paste.remote_user', [])
| 
| And then uses the contents of that list as the thing to check or modify, 
| then you will get the exact same result as the pass the same environ 
| approach, except that it's actually compatible with PEP 333, as opposed 
| to relying on implementation accidents.
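
A minimal sketch of the pattern quoted above, written for Python 3. Only the ``setdefault`` call and the ``paste.remote_user`` key come from the message; ``auth_app``, ``copying_middleware`` and ``outer`` are hypothetical names used for illustration. It shows that a shallow copy of ``environ`` still shares the list object, so a value appended downstream stays visible upstream.

```python
def auth_app(environ, start_response):
    # Innermost component: record the user in the shared list.
    remote_user = environ.setdefault('paste.remote_user', [])
    remote_user.append('foo')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']


def copying_middleware(app):
    # Hypothetical middleware that forwards a *copy* of environ,
    # which PEP 333 permits.
    def wrapper(environ, start_response):
        return app(dict(environ), start_response)
    return wrapper


def outer(app):
    # Outermost component: create the list before calling downstream,
    # then read whatever inner components put into it.
    seen = []

    def wrapper(environ, start_response):
        remote_user = environ.setdefault('paste.remote_user', [])
        body = app(environ, start_response)
        seen.extend(remote_user)
        return body

    wrapper.seen = seen
    return wrapper


stack = outer(copying_middleware(auth_app))
body = stack({}, lambda status, headers, exc_info=None: None)
```

Note that this relies on the outer component creating the list before any copy is made; if the first ``setdefault`` happened only after the copy, each side would hold a different list.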

Ok, assuming that we want an extension API for this sort of thing; I'd
rather have a bit more general solution. At the very least, it would be
nice to have a unified way to pass the REMOTE_USER up the WSGI stack so
that each WSGI middleware toolkit doesn't have to roll their own (ie,
paste.remote_user and zope.set_user).  But ideally, the solution should
handle more than just REMOTE_USER since I need to track session
identifiers and other environment changes.  Here is a proposal.

  wsgi.notify(key, value)

This optional environment variable is a function used to notify
previous stages of processing about a change in the ``environ``.
Authentication middleware components, for example, would want to
do something like:

  if environ.get('wsgi.notify'):
      environ.get('wsgi.notify')('REMOTE_USER','foo')
  environ['REMOTE_USER'] = 'foo'
 
when setting a common environment variable which may be useful
to previous processing stages.  Prior stages may then 
watch for particularly important changes by replacing this
function, making sure to call prior instances, like:

  class Logger:
      def __init__(self, application):
          self.application = application
          self.user_counts = {}
      def __call__(self, environ, start_response):
          prev_notify = environ.get('wsgi.notify', lambda k,v: None)
          def notify(k,v):
              if 'REMOTE_USER' == k:
                  environ['bing.user'] = v
              prev_notify(k,v)
          def _start_response(status, response_headers, exc_info=None):
              user = environ.get('bing.user','anonymous')
              self.user_counts[user] = self.user_counts.get(user,0) + 1
              return start_response(status, response_headers, exc_info)
          environ['wsgi.notify'] = notify
          return self.application(environ, _start_response)
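
A self-contained sketch of how the two pieces of this proposal might fit together, in Python 3. ``wsgi.notify`` is only the key proposed in this message and is not part of PEP 333; ``CountingLogger``, ``authenticate`` and ``app`` are hypothetical names. The logger keeps the notified user in a local dict rather than in ``environ``, which is an assumption made here so the example does not depend on the inner stage receiving the same dict.

```python
class CountingLogger:
    """Outer stage: watches for REMOTE_USER notifications."""

    def __init__(self, application):
        self.application = application
        self.user_counts = {}

    def __call__(self, environ, start_response):
        prev_notify = environ.get('wsgi.notify', lambda k, v: None)
        seen = {}

        def notify(key, value):
            if key == 'REMOTE_USER':
                seen['user'] = value
            prev_notify(key, value)  # keep earlier stages informed

        def _start_response(status, headers, exc_info=None):
            user = seen.get('user', 'anonymous')
            self.user_counts[user] = self.user_counts.get(user, 0) + 1
            return start_response(status, headers, exc_info)

        environ['wsgi.notify'] = notify
        return self.application(environ, _start_response)


def authenticate(application, user='foo'):
    """Inner stage: sets REMOTE_USER and notifies earlier stages."""
    def wrapper(environ, start_response):
        notify = environ.get('wsgi.notify')
        if notify:
            notify('REMOTE_USER', user)
        environ['REMOTE_USER'] = user
        return application(environ, start_response)
    return wrapper


def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']


logger = CountingLogger(authenticate(app))
logger({}, lambda status, headers, exc_info=None: None)
logger({}, lambda status, headers, exc_info=None: None)
```

After the two requests, ``logger.user_counts`` holds a count of 2 for ``'foo'``.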

| The WSGI middleware components that actually create their own environ
| are few and far between.  This is an uncommon edge case.
| 
| Composability of applications is a critical requirement for WSGI 
| middleware.  It doesn't matter how uncommon it is.  Even if there were 
| *zero* implementations of such middleware right now, that principle 
| would take precedence, meaning you'd have to have a proposal that would 
| preserve composability.  Right now, you haven't described a way to do 
| that without introducing temporal coupling (or worse) among subrequests.

If you assume a single thread of control; ie, all sub-requests are done
sequentially, then extension APIs share all of the same pitfalls as
mandating a single ``environ``.  I've demonstrated how this is possible
in an earlier message.

However, in a *threaded* environment, the approach I proposed is
unworkable if sub-requests are executed in parallel.  In this case,
strange and nasty consequences would exist if multiple sub-applications
were accessing the same ``environ`` dict.  It is for this reason that
I'm throwing in the towel.

I hope something like the proposal above, or my other attempt to
formalize a response-based approach, is closer to your liking.

Best Wishes,

Clark
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] Communicating authenticated user information

2006-01-24 Thread Stephan Richter
On Monday 23 January 2006 22:15, Clark C. Evans wrote:
> On Mon, Jan 23, 2006 at 04:15:06PM -0500, Phillip J. Eby wrote:
> | At 03:36 PM 1/23/2006 -0500, Stephan Richter wrote:
> | > Specify a new environment variable called 'wsgi.user' (or something
> | > similar) that is mutable and can be written several times. Only
> | > the last write (before the output is sent) is important. By default
> | > the variable is set to ``None`` for not set.
>
> Why not ``wsgi.context`` or something like that which defaults to
> an empty dictionary.  Then you can put whatever you want in it;
> ``wsgi.user`` just seems to be a bit too specific.

But if you use a dictionary you need to specify all allowed keys. The server 
needs to know from the standard (WSGI) what it is looking for. The twisted 
guys and us have thought about other possible data for logging and we could 
not come up with any. If you have real use cases for other data, please let 
me know.

> | I'd suggest a callable under 'wsgi.log_username', that takes one
> | argument.
>
> I think this is way too specific; it doesn't address the general
> problem: how do you pass information back up the middleware stack.

You cannot address this issue generally. The point of WSGI is that it is a 
well-defined API that specifies exactly what to expect. Let's take your 
suggestion. Let's say there is a dictionary that can contain anything. Zope 3 
(acting as the application) decides to put a key named user into the 
dictionary. But Twisted (acting as the server) looks for remote-user. Since 
the key is not specified in the specification, we have gained absolutely 
nothing.

> | It should be specified whether it requires ASCII or Unicode.
>
> Why cannot it just accept a Python string?  You can always check
> if it is Unicode or not.

Because encoding might be arbitrary. It has to be clearly specified in the 
specs what to expect.

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training


Re: [Web-SIG] Communicating authenticated user information

2006-01-24 Thread Phillip J. Eby
At 10:15 PM 1/23/2006 -0500, Clark C. Evans wrote:
> On Mon, Jan 23, 2006 at 04:15:06PM -0500, Phillip J. Eby wrote:
> | At 03:36 PM 1/23/2006 -0500, Stephan Richter wrote:
> | > Specify a new environment variable called 'wsgi.user' (or something
> | > similar) that is mutable and can be written several times. Only
> | > the last write (before the output is sent) is important. By default
> | > the variable is set to ``None`` for not set.
>
> Why not ``wsgi.context`` or something like that which defaults to
> an empty dictionary.  Then you can put whatever you want in it;
> ``wsgi.user`` just seems to be a bit too specific.

We want to be specific, as it wouldn't be a very good specific-ation 
otherwise.  :)


> | I'd suggest a callable under 'wsgi.log_username', that takes one
> | argument.
>
> I think this is way too specific; it doesn't address the general
> problem: how do you pass information back up the middleware stack.

There is no general problem which anyone is trying to solve.  The use 
case requested by Jim and Stephan is quite specific.


> | It should be specified whether it requires ASCII or Unicode.
>
> Why cannot it just accept a Python string?  You can always check
> if it is Unicode or not.

I'm pointing out that the use case under consideration isn't specific 
*enough* yet.  Do people's log files support unicode?  Do the 
authentication systems?  This hasn't been made clear, and it should be.



Re: [Web-SIG] Communicating authenticated user information

2006-01-24 Thread Jim Fulton
Phillip J. Eby wrote:
...
> I'm pointing out that the use case under consideration isn't specific
> *enough* yet.  Do people's log files support unicode?  Do the
> authentication systems?  This hasn't been made clear, and it should be.

I agree.  I think we should be guided by the common log file format.
Log data are written to files and are thus not unicode. The user
info is *just* documentation, so it is really up to the app what to
show imo.  Further, because the common log file format is space
delimited, the user info cannot contain spaces.

Jim

-- 
Jim Fulton           mailto:[EMAIL PROTECTED]    Python Powered!
CTO                  (540) 361-1714              http://www.python.org
Zope Corporation     http://www.zope.com         http://www.zope.org


Re: [Web-SIG] Communicating authenticated user information

2006-01-24 Thread Phillip J. Eby
At 10:30 PM 1/23/2006 -0500, Clark C. Evans wrote:
> Suggested Wording:
>
>    A WSGI Middleware component (that is, one that receives a
>    request and forwards it on to another component) must forward
>    on the *exact* same ``environ`` dict that it received.

-1.  This invalidates current WSGI design principles and can't go in any 
WSGI 1.x version, and even for a WSGI 2.x it would need a heck of a lot 
more justification.

Note that WSGI is an HTTP analogue, it is not a web server API.  In the 
context of this discussion, I'm now more convinced than ever that the right 
place to communicate information back to the server is via response 
headers, and that's how this use case should be addressed in WSGI 1.1, as 
it maintains the functional composition of middleware better than an 
environ-supplied extension.  In WSGI the design principle needs to be 
Isolation beats cleanliness.
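
A rough sketch of what a response-header approach could look like, in Python 3. The header name ``X-Authenticated-User`` is the one floated elsewhere in this thread and is not standardized; ``strip_and_log`` and the log tuple layout are hypothetical. The outer component reads the private header, records it, and removes it before the headers travel further up.

```python
PRIVATE_HEADER = 'x-authenticated-user'  # assumed name, not standardized


def strip_and_log(application, log):
    """Remove the private header from the response and record its value."""
    def wrapper(environ, start_response):
        def _start_response(status, headers, exc_info=None):
            public = []
            user = '-'
            for name, value in headers:
                if name.lower() == PRIVATE_HEADER:
                    user = value          # keep for the log only
                else:
                    public.append((name, value))
            log.append((environ.get('PATH_INFO', ''), status, user))
            return start_response(status, public, exc_info)
        return application(environ, _start_response)
    return wrapper


def app(environ, start_response):
    # The application reports the user purely through its response.
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('X-Authenticated-User', 'foo')])
    return [b'ok']


log = []
sent = []
wrapped = strip_and_log(app, log)
wrapped({'PATH_INFO': '/'},
        lambda status, headers, exc_info=None: sent.extend(headers))
```

Nothing here depends on which ``environ`` dict the application received, which is the composability property being argued for.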



Re: [Web-SIG] Communicating authenticated user information

2006-01-24 Thread Clark C. Evans
On Tue, Jan 24, 2006 at 11:33:56AM -0500, Phillip J. Eby wrote:
|  I think this is way too specific; it doesn't address the general
|  problem: how do you pass information back up the middleware stack.

| There is no general problem which anyone is trying to solve.  The use 
| case requested by Jim and Stephan is quite specific.

Yes there is; it is passing information from applications back to 
middleware (or the server), you even talk about it yourself:

  | I'm now more convinced than ever that the right place to 
  | communicate information back to the server is via response
  | headers, and that's how this use case should be addressed in 
  | WSGI 1.1, as it maintains the functional composition of middleware 

If this is the solution, great.  However, I really don't like the
``environ`` options out there with mutable objects.  Can you please
then specify a *general* mechanism for headers that won't be 
sent to the client?  My server needs to know which ones to strip.

On Tue, Jan 24, 2006 at 10:53:53AM -0600, Ian Bicking wrote:
| Jim Fulton wrote:
|  Phillip J. Eby wrote:
| I'm pointing out that the use case under consideration isn't specific 
| *enough* yet.  Do people's log files support unicode?  Do the 
| authentication systems?  This hasn't been made clear, and it should be.
|  
|  I agree.  I think we should be guided by the common log file format.
|  Log data are written to files and are thus not unicode. The user
|  info is *just* documentation, so it is really up to the app what to
|  show imo.  Further, because the common log file format is space
|  delimited, the user info cannot contain spaces.
| 
| It is up to the consumer to handle any unicode, and to maintain the 
| integrity of their log format regardless of input.

I second Ian's opinion.  I have to log Russian user-names and web-pages,
internally I use Unicode strings; and when writing to common log file
format, I simply urlencode the string.  This takes care of spaces and
non-ASCII code points.
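
A small sketch of the encoding step described above, in Python 3, using ``urllib.parse.quote``. The helper names ``log_safe_user`` and ``clf_line`` are hypothetical, and the log line is simplified (no timestamp); this is one way to do it, not necessarily the exact code used at the time.

```python
from urllib.parse import quote, unquote


def log_safe_user(user):
    # Percent-encode the UTF-8 bytes of the name; safe='' also encodes
    # '/', so only unreserved ASCII characters pass through unchanged.
    return quote(user, safe='')


def clf_line(host, user, request, status, size):
    # Hypothetical helper: a Common Log Format style line with the
    # timestamp omitted for brevity.
    return '%s - %s "%s" %s %s' % (host, log_safe_user(user), request,
                                   status, size)


name = 'Иван Петров'
encoded = log_safe_user(name)
line = clf_line('127.0.0.1', name, 'GET / HTTP/1.0', 200, 2)
restored = unquote(encoded)  # the original name can be recovered
```

The encoded value contains no spaces and no non-ASCII characters, so it cannot break the space-delimited format.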

Thus, the WSGI specification should not restrict the character set,
since some other logging middleware might want to use XML(UTF-8) or
write each access to a database that is unicode aware.  The value
should be *any* python string object; let the logging module determine
the type and encoding and handle it as needed.

On Tue, Jan 24, 2006 at 12:35:48PM -0500, Michal Wallace wrote:
| I think you guys are trying to solve this at the wrong level. 
| This problem should be handled by the web server itself.

People are writing features like this as specific middleware
components so that you don't have a bloated web-server.  

| Maybe I just don't understand why this is important. Can 
| someone (Jim) explain why this is a requirement in the 
| first place?

Well, the general problem is how to communicate information from
applications back to the middleware or server.  This is one use
case; there are others, I am sure.

On Tue, Jan 24, 2006 at 07:31:35AM -0500, Stephan Richter wrote:
|  Why not ``wsgi.context`` or something like that which defaults to
|  an empty dictionary.  Then you can put what ever you want in it;
|  ``wsgi.user`` just seems to be a bit too specific.
| 
| But if you use a dictionary you need to specify all allowed keys.

Fine, use REMOTE_USER as this is a CGI standard.

| The twisted guys and us have thought about other possible data for
| logging and we could not come up with any. If you have real use cases
| for other data, please let me know.

Depends on the logging level;

0. Trace messages
1. The database instance name used 
2. A sequence of SQL queries executed
3. Files that were created by the request
etc.

I can think of a lot of things I might want to log (and in fact do log);
that said, I'm not interested in this specific application -- I want
the general case spelled-out.  How do I pass information from the
application back to the middleware reliably?

|  Why cannot it just accept a Python string?  You can always check
|  if it is Unicode or not.
| 
| Because encoding might be arbitrary. It has to be clearly specified in the 
| specs what to expect.

It's very easy for the logging module to check what it has and act
intelligently.   We aren't using C89...

Best,

Clark


Re: [Web-SIG] Communicating authenticated user information

2006-01-24 Thread Phillip J. Eby
At 12:35 PM 1/24/2006 -0500, Michal Wallace wrote:
> Maybe I just don't understand why this is
> important. Can someone (Jim) explain why this
> is a requirement in the first place?

I'd like to know too, although the obvious argument is backward 
compatibility for people accustomed to ZServer as Zope migrates away from it.

I've personally never felt a need to feed this data back to the web server, 
probably because I'm so used to using FastCGI, which has no *way* to feed 
it back to the web server, and I prefer to look at the application's 
logs.  But Zope is a content management system, not an application in the 
sense I mean, so the same use cases don't necessarily apply.



Re: [Web-SIG] Communicating authenticated user information

2006-01-24 Thread Clark C. Evans
On Tue, Jan 24, 2006 at 05:34:19PM -0500, Phillip J. Eby wrote:
| By turning that narrowly-stated issue into a general problem, you're 
| dissolving three dimensions of specificity at once: i.e., you're turning 
| the problem into essentially communicating something about anything to 
| anybody, which no longer carries any useful information for making 
| design tradeoffs, especially since you are not presenting any 
| alternative use cases that present examples with different values along 
| any of the generalized dimensions.  This is unsound design practice, and 
| it simply leads to people jabbering misunderstandings at each other 
| while they think they're communicating.  Let's stick to real-life use 
| cases, please, not theoretical ones.  In the meantime, extension APIs as 
| provided for by the existing PEP present an adequate ad hoc upstream 
| communications facility.

Nice sermon; now can we get back to the issue being discussed without
being argumentative and sanctimonious?

Another use case for passing information up the WSGI stack is where
you have two 'orthogonal' but decoupled modules, each of which has a
role/interface that could be implemented by an equivalent replacement:

  'paste.auth.digest'
     This does authentication handling, sending a 401 back to
     the server if REMOTE_USER is not already filled in.

  'paste.auth.cookie'
     This looks for a cookie and injects REMOTE_USER into the
     environ on the way down; it then looks for a REMOTE_USER
     to save via a cookie on the way up.  It is a simple and
     elegant mechanism.

However, this implementation violates your vision of WSGI, since I am
assuming that the later stacks will pass along the current environment:

On Mon, Jan 23, 2006 at 02:25:35PM -0500, Phillip J. Eby wrote:
| You simply can't use environ values to communicate *up*
| the WSGI stack, since at no level is it guaranteed you
| have the same dictionary.  Response headers and
| callables (or mutables) in the environ are the only way to
| send stuff upstream.  You also have to be careful that any
| upstream communication doesn't bypass something that
| middleware should be allowed to control.
|
| In the case of authentication, it should be sufficient to
| have a callable or mutable in the environ that can be
| called or set more than once per request, i.e. it only
| takes effect once the request is completed.  This allows
| outer middleware to override what inner middleware or the
| application set it to. 

The problem with fixing my implementation with this approach is
that it unnecessarily couples cookie and digest modules.  I don't
think it is necessary nor a good idea to have decoupled modules
dependent on each other via a callable in the ``environ``.
So, I reject this approach, and I suggested that the same ``environ`` 
object should be passed all the way down the WSGI stack.

On Tue, Jan 24, 2006 at 11:41:04AM -0500, Phillip J. Eby wrote:
| At 10:30 PM 1/23/2006 -0500, Clark C. Evans wrote:
| Suggested Wording:
| 
|A WSGI Middleware component (that is, one that receives a
|request and forwards it on to another component) must forward
|on the *exact* same ``environ`` dict that it received.
| 
| -1.  This invalidates current WSGI design principles and can't
| go in any WSGI 1.x version, and even for a WSGI 2.x it would 
| need a heck of a lot more justification.

Having the *same* ``environ`` passed all the way up the stack works --
nicely.  I've not yet seen a rationale why WSGI should not have this
limitation; Ian presented 2 use cases in paste where a different environ
is passed down the stack, however, both of his cases can be fixed (as I
demonstrated) to be compliant with the suggested wording above.

On Tue, Jan 24, 2006 at 11:41:04AM -0500, Phillip J. Eby wrote:
| Note that WSGI is an HTTP analogue, it is not a web server
| API.  In the context of this discussion, I'm now more
| convinced than ever that the right place to communicate
| information back to the server is via response headers,
| and that's how this use case should be addressed in WSGI
| 1.1, as it maintains the functional composition of
| middleware better than an environ-supplied extension.  In
| WSGI the design principle needs to be Isolation beats
| cleanliness.

Well, regardless of what you intended of WSGI, it is a web server API;
and a particularly good low-level one.  The current usage I have of
using the ``environ`` to pass information *up* does provide a great deal
of isolation, and the solutions so far don't have the same advantages.

On Tue, Jan 24, 2006 at 05:34:19PM -0500, Phillip J. Eby wrote:
| On Tue, Jan 24, 2006 at 10:53:53AM -0600, Ian Bicking wrote:
| | It is up to the consumer to handle any unicode, and to maintain the
| | integrity of their log format regardless of input.
| 
| I second Ian's opinion.
| 
| 

Re: [Web-SIG] Communicating authenticated user information

2006-01-24 Thread Phillip J. Eby
At 09:42 PM 1/24/2006 -0500, Clark C. Evans wrote:
> Nice sermon; now can we get back to the issue being discussed without
> being argumentative and sanctimonious?

I didn't notice anyone being either of those.  As for the sermon, however, 
I'm glad you enjoyed it.  :)


> Another use case for passing information up the WSGI stack is where
> you have two 'orthogonal' but decoupled modules, each of which has a
> role/interface that could be implemented by an equivalent replacement:
>
>    'paste.auth.digest'
>       This does authentication handling, sending a 401 back to
>       the server if REMOTE_USER is not already filled in.
>
>    'paste.auth.cookie'
>       This looks for a cookie and injects REMOTE_USER into the
>       environ on the way down; it then looks for a REMOTE_USER
>       to save via a cookie on the way up.  It is a simple and
>       elegant mechanism.

I don't see why an extension API placed in the environ, such as 
paste.auth.set_user doesn't satisfy this use case.


> The problem with fixing my implementation with this approach is
> that it unnecessarily couples cookie and digest modules.

You lost me.  How does it do that in any way that the 'REMOTE_USER' 
variable does not?


> So, I reject this approach, and I suggested that the same ``environ``
> object should be passed all the way down the WSGI stack.

And as I've already said, this simply isn't possible in WSGI 1.x, as it's 
not backward compatible.  That needs to be a 2.x revision, if it happens at 
all.


> Having the *same* ``environ`` passed all the way up the stack works --
> nicely.

So do extension APIs.


> I've not yet seen a rationale why WSGI should not have this
> limitation;

Because WSGI is designed for functional composability.  Requiring environ 
passthrough breaks that by creating a global coupling.  If anything, in a 
2.x WSGI version I would lean towards getting rid of extension APIs and 
replacing them with some kind of additional response facility, as it's 
still too easy to create global coupling or to bypass middleware via 
extension APIs.


> Ian presented 2 use cases in paste where a different environ
> is passed down the stack, however, both of his cases can be fixed (as I
> demonstrated) to be compliant with the suggested wording above.

So we can make it harder for people to write middleware, in order to make 
it easier for people to introduce global coupling?  That doesn't sound like 
a useful tradeoff -- certainly not one that overcomes the cost of changing 
the spec.


> Well, regardless of what you intended of WSGI, it is a web server API;
> and a particularly good low-level one.  The current usage I have of
> using the ``environ`` to pass information *up* does provide a great deal
> of isolation, and the solutions so far don't have the same advantages.

Not so.  It's just as easy to create a 'paste.remote_user' environ key that 
contains a 1-element list with a value in it, if you insist on having 
global coupling.  That works today with the existing spec and likely always 
will, is trivial to implement, and requires no fixing of existing 
middleware that isn't broken.



Re: [Web-SIG] Communicating authenticated user information

2006-01-24 Thread Clark C. Evans
On Wed, Jan 25, 2006 at 12:41:01AM -0500, Michal Wallace wrote:
| Unfortunately, if you require it to be the exact same 
| *object* then you're making the requirement that 
| everything in the stack happens in the same process, 
| on the same machine. 

Correct.  Phillip's extension APIs approach has the same short-coming;
it does seem that using response headers is the only sane way to go
about solving this problem.

| That means you can't distribute the magic over xml-rpc or SOAP
| or some other protocol, and you might want to do that if you're
| using a load balancing feature or want part of the system
| to run as a different user.

In other words, each WSGI component in a stack should, ideally, not be
dependent upon mutable objects and should only use values that can be
passed by value. It's additional work; but I'll buy that one -- I just
don't buy the idea that extension APIs are superior to just requiring the
``environ`` be constant through a given request.

This seems to be where Phillip is headed:

On Tue, Jan 24, 2006 at 11:37:30PM -0500, Phillip J. Eby wrote:
| WSGI is designed for functional composability.  Requiring
| environ passthrough breaks that by creating a global coupling.
| If anything, in a 2.x WSGI version I would lean towards getting
| rid of extension APIs and replacing them with some kind of
| additional response facility, as it's still too easy to create
| global coupling or to bypass middleware via extension APIs.


On Tue, Jan 24, 2006 at 11:37:30PM -0500, Phillip J. Eby wrote:
| I don't see why an extension API placed in the environ, such as 
| paste.auth.set_user doesn't satisfy this use case.

I must not have explained the modules clearly enough; sorry for the
repetition, but let me take another stab at it.  

I have several authentication modules, one for HTML form authentication,
basic, digest, and quite a few others.  The function of these modules is
to ensure that environ['REMOTE_USER'] exists.  If a remote user is
already provided, they are a no-op.  Otherwise, they do what is
necessary (a 401, 302, returning an HTML form, etc.) in order to
get a remote user and fill in the environ.

Then I have a class of restoration modules, one which uses a signed key,
and another one that does path re-writing.  These modules look to see if
they have enough information to fill in a REMOTE_USER, if not, they are
a no-op on the way in.  On the way out, however, they *look* at the 
``environ`` to see if REMOTE_USER was set -- if it was set they do 
what ever they need to *save* this information.

Hence, the interfaces between these modules is simply using the
well-understood CGI variable ``REMOTE_USER``.  They can be used
independently of each other, and in creative combinations.
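
A simplified sketch of the two kinds of module described above, in Python 3. All names here (``require_user``, ``remember_user``, the ``HTTP_X_DEMO_USER`` key, the ``session-N`` tokens) are hypothetical stand-ins; real credential checks and cookie parsing are omitted. The only interface between the two stages is ``REMOTE_USER`` in a shared ``environ``, which is exactly the assumption under dispute in this thread.

```python
def require_user(application):
    """Authentication stage: ensure REMOTE_USER exists, else answer 401."""
    def wrapper(environ, start_response):
        if environ.get('REMOTE_USER'):
            return application(environ, start_response)  # already set: no-op
        # Placeholder check; a real module would verify credentials.
        user = environ.get('HTTP_X_DEMO_USER')
        if not user:
            start_response('401 Unauthorized',
                           [('Content-Type', 'text/plain')])
            return [b'authentication required']
        environ['REMOTE_USER'] = user
        return application(environ, start_response)
    return wrapper


def remember_user(application, store):
    """Restoration stage: restore on the way in, save on the way out."""
    def wrapper(environ, start_response):
        token = environ.get('HTTP_COOKIE', '')
        if token in store:
            environ['REMOTE_USER'] = store[token]

        def _start_response(status, headers, exc_info=None):
            # Relies on downstream stages mutating this same environ dict.
            user = environ.get('REMOTE_USER')
            if user and token not in store:
                new_token = 'session-%d' % (len(store) + 1)
                store[new_token] = user
                headers = headers + [('Set-Cookie', new_token)]
            return start_response(status, headers, exc_info)

        return application(environ, _start_response)
    return wrapper


def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [('hello ' + environ['REMOTE_USER']).encode('utf-8')]


store = {}
stack = remember_user(require_user(app), store)
first_headers = []
first = stack({'HTTP_X_DEMO_USER': 'foo'},
              lambda s, h, e=None: first_headers.extend(h))
second = stack({'HTTP_COOKIE': 'session-1'}, lambda s, h, e=None: None)
```

The first request authenticates and gets a token saved; the second is restored from the token and the authentication stage does nothing. If any stage between the two passed on a copy of ``environ``, the save step would silently stop working.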

| You lost me.  How does it do that in any way that the 'REMOTE_USER' 
| variable does not?

Let's talk about both sorts of modules independently.  First, the CGI
variable 'REMOTE_USER' is already well documented; and the goal of the
authentication modules is simple -- fill in that environment variable.
Your approach requires that an additional activity/burden is imposed on
these sorts of modules.

I agree that the cookie module isn't quite as straight-forward, but
it isn't that bad.  Since the authentication modules are already
filling in the 'REMOTE_USER' to meet the expectations of standard
software components, it makes sense to obtain that information
directly from the environment.

In summary, I think extension APIs are more brittle and are a 
poor substitute for just using a shared environ both up and
down the WSGI stack.

| So, I reject this approach, and I suggested that the same ``environ``
| object should be passed all the way down the WSGI stack.
| 
| And as I've already said, this simply isn't possible in WSGI 1.x, as 
| it's not backward compatible.  That needs to be a 2.x revision, if it 
| happens at all.

The WSGI middleware components that actually create their own environ
are few and far between.  This is an uncommon edge case.

| Ian presented 2 use cases in paste where a different environ
| is passed down the stack, however, both of his cases can be fixed (as I
| demonstrated) to be compliant with the suggested wording above.
| 
| So we can make it harder for people to write middleware, in order to 
| make it easier for people to introduce global coupling?  That doesn't 
| sound like a useful tradeoff -- certainly not one that overcomes the 
| cost of changing the spec.


The change needed is trivial and minor compared to most other things you
have to get correct while writing WSGI middleware; and given the
relative immaturity of WSGI at this point (especially in edge cases like
this), I doubt it is the problem that you make it out to be.

In summary; I think that a response-headers approach as proposed by
Phillip is the best (but higher overhead) approach.  However, I disagree
that some sort of extension API is preferable to just keeping the
``environ`` constant 

Re: [Web-SIG] Communicating authenticated user information

2006-01-23 Thread Stephan Richter
On Sunday 22 January 2006 11:34, Phillip J. Eby wrote:
> > Is Zope the only WSGI application that performs authentication
> > itself?
>
> I think Zope is the only WSGI application that cares about communicating
> this information back to the web server's logs.  :)  Or at least, the only
> one whose author has said so.  :)

Well, I originally worked with Itamar and James on the Twisted integration 
into Zope 3, when we noticed this problem.

> Perhaps an X-Authenticated-User: foo header could be added in a future
> spec version?  (And as an optional feature in the current PEP.)  This seems
> a simpler way to incorporate the feature than adding an extension API to
> environ.

We originally considered and even implemented the suggestion you made, but 
considered it a security problem and dismissed it. And a convention is not 
really a viable solution either, since it defeats the point of a non-specific 
API, like WSGI.

We thought about the problem quite a bit and decided that the user is really 
the only thing that the log really has to know from the application. So a 
simple callback that expects a simple string would be just fine.

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training


Re: [Web-SIG] Communicating authenticated user information

2006-01-23 Thread Clark C. Evans
I'm using paste.auth.* modules, and they fill-in environ['REMOTE_USER']
with the authenticated user.  I then use this information in later
processing stages and it works nicely for me and is quite simple.

On Sun, Jan 22, 2006 at 03:24:52PM -0600, Ian Bicking wrote:
| So if the WSGI environ that the middleware sees initially is the same 
| environ that the authenticator writes to, then the middleware will
| see that change on the way out and include it.

For this case, I would imagine that a good transaction logger 
would come *before* the authentication middleware, stuffing away 
the ``environ`` and then hook into the ``start_response()`` callback 
to actually log the transaction (including the ``REMOTE_USER``) when
the response is created.  I don't see how a header would help here.
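
A minimal sketch of such a logger, in Python 3, with hypothetical names throughout (``transaction_logger``, ``fake_auth``, ``copy_environ``). It runs the same stack twice: once where every stage shares one ``environ``, and once with a stage in between that forwards a copy, which is the case Ian raises in the quote that follows.

```python
def transaction_logger(application, records):
    """Keep a reference to environ and log when the response starts."""
    def wrapper(environ, start_response):
        def _start_response(status, headers, exc_info=None):
            # Only sees REMOTE_USER if downstream mutated this same dict.
            records.append((status, environ.get('REMOTE_USER', '-')))
            return start_response(status, headers, exc_info)
        return application(environ, _start_response)
    return wrapper


def fake_auth(application):
    def wrapper(environ, start_response):
        environ['REMOTE_USER'] = 'foo'  # stand-in for real authentication
        return application(environ, start_response)
    return wrapper


def copy_environ(application):
    # Hypothetical middleware that forwards a copy of environ.
    def wrapper(environ, start_response):
        return application(dict(environ), start_response)
    return wrapper


def app(environ, start_response):
    start_response('200 OK', [])
    return [b'ok']


records = []
noop = lambda status, headers, exc_info=None: None
transaction_logger(fake_auth(app), records)({}, noop)
transaction_logger(copy_environ(fake_auth(app)), records)({}, noop)
```

The first record carries the user; the second falls back to ``'-'`` because the authenticator wrote to a copy.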

| Using a header would solve the problem where the environment is 
| completely changed (unlikely), or copied before REMOTE_USER is 
| assigned (fairly likely).

Ok.  If you are completely changing the environment, you should
just copy it and send the copy on, so let us address these two cases
together.  In this situation, you also have to assume that the
authentication middleware happens *after* the request re-write or
you're in the situation described above (the logger can get the
REMOTE_USER).  I can picture two use-cases for this situation:

  Your server is doing an internal redirect to a sub-application
  that needs its own authentication.  In this case, why not just
  do an external redirect?

  Your server is doing N sub-requests, some of which require their
  own authentication, and assembling the results into a single
  response.  In this case, you'll need your own custom logging
  mechanism anyway... and I cannot imagine the complexity of 
  having N sub-branches that might return a 401.

In short, I can't think of any generic use-cases for this second
scenario (where authentication happens *after* a complete re-write
of the environ) that would work with generic request logging;
and I don't see how a header would help.

Perhaps I'm missing something?

Best,

Clark
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] Communicating authenticated user information

2006-01-23 Thread Phillip J. Eby
At 12:42 PM 1/23/2006 -0500, Clark C. Evans wrote:
> In short, I can't think of any generic use-cases for this second
> scenario (where authentication happens *after* a complete re-write
> of the environ) that would work with generic request logging;
> and I don't see how a header would help.
>
> Perhaps I'm missing something?

You simply can't use environ values to communicate *up* the WSGI stack, 
since at no level is it guaranteed you have the same 
dictionary.  Response headers and callables (or mutables) in the environ 
are the only way to send stuff upstream.  You also have to be careful that 
any upstream communication doesn't bypass something that middleware should 
be allowed to control.

In the case of authentication, it should be sufficient to have a callable 
or mutable in the environ that can be called or set more than once per 
request, i.e. it only takes effect once the request is completed.  This 
allows outer middleware to override what inner middleware or the 
application set it to.
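
A minimal sketch of the callable approach described above. The environ
key ``example.set_log_user`` and all class names are hypothetical, not
part of PEP 333; the toy ``server_handle`` function stands in for a
real server.

```python
# Sketch only: "example.set_log_user" is a hypothetical key, not part of PEP 333.
def server_handle(application, environ):
    """Toy 'server': offers a callable and reads the result after the request."""
    recorded = []                       # mutable state owned by the server
    environ["example.set_log_user"] = recorded.append
    body = application(environ, lambda status, headers, exc_info=None: None)
    # Only the last value set during the request takes effect.
    return body, (recorded[-1] if recorded else None)


def app(environ, start_response):
    set_user = environ.get("example.set_log_user")
    if set_user is not None:
        set_user("alice")
    start_response("200 OK", [])
    return [b"ok"]


class CopyingMiddleware:
    """Passes a *copy* of environ inward, which PEP 333 permits."""

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        return self.application(dict(environ), start_response)


class MaskingMiddleware:
    """Outer middleware that overrides whatever the inner layers reported."""

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        set_user = environ.get("example.set_log_user")
        result = self.application(environ, start_response)
        if set_user is not None:
            set_user("masked")          # called again; the last call wins
        return result


_, user_plain = server_handle(CopyingMiddleware(app), {})
_, user_masked = server_handle(MaskingMiddleware(CopyingMiddleware(app)), {})
print(user_plain, user_masked)
```

The callable survives the copy because the copied dict still refers to
the same function object, and the outer layer keeps the final say.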



Re: [Web-SIG] Communicating authenticated user information

2006-01-23 Thread Phillip J. Eby
At 02:52 PM 1/23/2006 -0500, Clark C. Evans wrote:
> On Mon, Jan 23, 2006 at 02:25:35PM -0500, Phillip J. Eby wrote:
> | You simply can't use environ values to communicate *up* the WSGI stack,
> | since at no level is it guaranteed you have the same
> | dictionary.
>
> The same could be said for response headers, no?  You've got a WSGI
> stack of A, B, and C.  Just because C sets a header intended for A,
> doesn't mean that B has to pass it on.

That's a feature, not a bug.  However, the presumption is that middleware 
will in general pass through the same *set* of environ variables or 
response headers as it received, unless it has a reason to modify 
them.  This does not require middleware to pass the same 'environ' or 
'header' *objects*, just that in general they should pass through the 
*contents*.


> | In the case of authentication, it should be sufficient to have a
> | callable or mutable in the environ that can be called or set more than
> | once per request, i.e. it only takes effect once the request is
> | completed.  This allows outer middleware to override what inner
> | middleware or the application set it to.
>
> This is exactly what environ['REMOTE_USER'] is, a mutable value in
> the environ that can be set more than once,

Strings aren't mutable.


> and only the current
> value matters when create_response hits the request log middleware.

The current value in *which* environ?  The application doesn't necessarily 
have the same environ object as the server, so modifying it will make no 
difference to anything.


> | Response headers and callables (or mutables) in the environ
> | are the only way to send stuff upstream.  You also have to be careful
> | that any upstream communication doesn't bypass something that middleware
> | should be allowed to control.
>
> Of course you have to be careful and work out a protocol that all
> intermediate middleware components agree upon.  However, beyond that
> I fail to understand the distinctions you're making or why they
> are important.  Perhaps a tangible example would help to educate me?

Middleware is not required to pass the same environ to a child application 
that it received from its parent server, and environ objects are not 
returned to the caller.  Ergo, modifying 'environ' itself (as opposed to 
modifying an object *in* the environment), cannot guarantee that the server 
will see the change unless middleware specifically conspires to make this 
so.  This is the opposite of the way it should work, which is that it 
should be communicated to the server unless the middleware specifically 
conspires to prevent it (e.g. by stripping the environ entry that allows 
the communication, or by changing the value before returning).
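
The distinction between modifying ``environ`` itself and modifying an
object *in* the environ can be illustrated with a small sketch. The key
``example.user_box`` and the function names are hypothetical.

```python
# Sketch only: "example.user_box" is a hypothetical key.
def inner_app(environ, start_response):
    environ["REMOTE_USER"] = "foo"             # rebinding a key in *this* dict
    environ["example.user_box"].append("foo")  # mutating an object *in* the dict
    start_response("200 OK", [])
    return [b""]


def rewriting_middleware(environ, start_response):
    child_environ = dict(environ)              # shallow copy: new dict, same values
    child_environ["PATH_INFO"] = "/rewritten"
    return inner_app(child_environ, start_response)


outer_environ = {"PATH_INFO": "/", "example.user_box": []}
rewriting_middleware(outer_environ, lambda status, headers, exc_info=None: None)

print("REMOTE_USER" in outer_environ)          # False: the key went into the copy
print(outer_environ["example.user_box"])       # ['foo']: the shared list was mutated
```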



Re: [Web-SIG] Communicating authenticated user information

2006-01-23 Thread Phillip J. Eby
At 03:36 PM 1/23/2006 -0500, Stephan Richter wrote:
> Specify a new environment variable called 'wsgi.user' (or something similar)
> that is a mutable and can be written several times. Only the last write
> (before the output is sent) is important. By default the variable is set to
> ``None`` for not set.

I'd suggest a callable under 'wsgi.log_username', that takes one argument.

It should be specified whether it requires ASCII or Unicode.



Re: [Web-SIG] Communicating authenticated user information

2006-01-23 Thread Stephan Richter
On Monday 23 January 2006 16:15, Phillip J. Eby wrote:
> I'd suggest a callable under 'wsgi.log_username', that takes one argument.

Sounds good to me.

> It should be specified whether it requires ASCII or Unicode.

I don't care; I think ASCII is fine; we can have the application handle the 
encoding.

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training


Re: [Web-SIG] Communicating authenticated user information

2006-01-23 Thread Ian Bicking
Clark C. Evans wrote:
> Thanks Phillip!
>
> This clears it up for me. Although, I disagree with the
> specification in this case; there does not seem to be a
> reason why middleware shouldn't be required to send the
> *same* environ dict along in subsequent calls.

Paste already does this, for the N subrequest method.  This is done at 
least in paste.cascade, where we retry the request several times until 
something responds with a non-404.  Since it is common at least for 
subapplications to rewrite SCRIPT_NAME/PATH_INFO, you can't pass later 
objects the same dictionary as previous objects.  Well, I suppose you 
could update the one-and-only environ from a copy you made before 
sending the request on.  But anyway, it doesn't do that.
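
The retry-until-non-404 pattern can be sketched roughly as follows. This
is a simplified illustration, not the actual paste.cascade code: the
``Cascade`` class here is hypothetical, buffers each response body, and
assumes at least one application that calls ``start_response``.

```python
# Sketch only; this is not paste.cascade's actual implementation.
class Cascade:
    def __init__(self, applications):
        self.applications = applications   # assumed non-empty

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            captured["args"] = (status, headers)

        body = []
        for application in self.applications:
            captured.clear()
            # Each attempt gets a fresh copy, so path rewrites made by one
            # sub-application cannot leak into the next attempt.
            body = list(application(dict(environ), capture))
            if not captured["args"][0].startswith("404"):
                break
        start_response(*captured["args"])
        return body


def not_found(environ, start_response):
    environ["SCRIPT_NAME"] = "/rewritten"  # pollutes only its own copy
    start_response("404 Not Found", [])
    return [b"nope"]


def found(environ, start_response):
    start_response("200 OK", [])
    return [environ.get("SCRIPT_NAME", "").encode("ascii")]


statuses = []
original = {"SCRIPT_NAME": ""}
body = Cascade([not_found, found])(
    original, lambda status, headers, exc_info=None: statuses.append(status)
)
print(statuses, body)
```

Because every sub-application sees a copy, anything it assigns to a plain
environ key (such as ``REMOTE_USER``) never reaches the caller's dict.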

I'd like to do this same thing (N subrequests) sometime in the future 
for server-side HTML Overlays.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Web-SIG] Communicating authenticated user information

2006-01-23 Thread Clark C. Evans
On Mon, Jan 23, 2006 at 04:15:06PM -0500, Phillip J. Eby wrote:
| At 03:36 PM 1/23/2006 -0500, Stephan Richter wrote:
| > Specify a new environment variable called 'wsgi.user' (or something
| > similar) that is a mutable and can be written several times. Only
| > the last write (before the output is sent) is important. By default
| > the variable is set to ``None`` for not set.

Why not ``wsgi.context`` or something like that, which defaults to
an empty dictionary?  Then you can put whatever you want in it;
``wsgi.user`` just seems to be a bit too specific.

| I'd suggest a callable under 'wsgi.log_username', that takes one 
| argument.

I think this is way too specific; it doesn't address the general
problem: how do you pass information back up the middleware stack.

| It should be specified whether it requires ASCII or Unicode.

Why can't it just accept a Python string?  You can always check
if it is Unicode or not.

Best,

Clark


Re: [Web-SIG] Communicating authenticated user information

2006-01-23 Thread Clark C. Evans
I'm not convinced that we shouldn't just require WSGI middleware
to forward on the *exact* same ``environ`` as it receives.

On Mon, Jan 23, 2006 at 03:29:32PM -0600, Ian Bicking wrote:
| Paste already does this, for the N subrequest method.  This is done at 
| least in paste.cascade, where we retry the request several times until 
| something responds with a non-404.

Yes; this is exactly the sort of edge case that I think will elude just
about any general solution.  How would Phillip's recent suggestion,
for example, a ``wsgi.log_username`` work in this situation?

Assertion:

  If a WSGI middleware component _isn't_ passing on the actual
  ``environ`` given by its parent, then it is an edge case where
  this problem can't be solved anyway.

| I suppose you could update the one-and-only environ from a copy 
| you made before sending the request on.  But anyway, it doesn't do that.

Yes, you could for this case _copy_ the ``environ`` and then when
one of the cascade applications returns, you can update the 
original ``environ`` with the saved copy. 
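
A rough sketch of that copy-and-restore idea, under the same simplifying
assumptions as any toy cascade (buffered bodies, at least one
application, each one calling ``start_response``). The class name
``SameEnvironCascade`` is hypothetical.

```python
# Sketch only: a cascade that always forwards the *same* environ dict.
class SameEnvironCascade:
    def __init__(self, applications):
        self.applications = applications   # assumed non-empty

    def __call__(self, environ, start_response):
        saved = dict(environ)              # snapshot of the incoming values
        captured = {}

        def capture(status, headers, exc_info=None):
            captured["args"] = (status, headers)

        body = []
        for application in self.applications:
            captured.clear()
            body = list(application(environ, capture))  # same dict goes inward
            if not captured["args"][0].startswith("404"):
                break
            # Failed attempt: roll the one-and-only environ back in place.
            environ.clear()
            environ.update(saved)
        start_response(*captured["args"])
        return body


def miss(environ, start_response):
    environ["PATH_INFO"] = "/mangled"
    start_response("404 Not Found", [])
    return [b""]


def hit(environ, start_response):
    environ["REMOTE_USER"] = "foo"
    start_response("200 OK", [])
    return [environ["PATH_INFO"].encode("ascii")]


environ = {"PATH_INFO": "/page"}
body = SameEnvironCascade([miss, hit])(
    environ, lambda status, headers, exc_info=None: None
)
print(body, environ)
```

Changes made by the application that finally answers stay visible to
the caller, while changes from failed attempts are undone.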

Suggested Wording:

   A WSGI Middleware component (that is, one that receives a 
   request and forwards it on to another component) must forward 
   on the *exact* same ``environ`` dict that it received.

| I'd like to do this same thing (N subrequests) sometime in the future 
| for server-side HTML Overlays.

The above restriction won't hurt these use-cases (which you must
be careful about anyway), and it addresses the current issue:
how does one pass information back up the call chain.

Best,

Clark


Re: [Web-SIG] Communicating authenticated user information

2006-01-22 Thread Phillip J. Eby
At 11:22 AM 1/22/2006 -0500, Jim Fulton wrote:
> Typically, web servers provide access logs that include a label
> for the authenticated user.
>
> Often, WSGI applications (or middleware) provide their own user
> authentication facilities.  Well, Zope does. :)
>
> There doesn't seem to be a standard way for WSGI applications or
> middleware to communicate the information necessary for a server
> to log the authenticated user back to the server.
>
> Am I missing something?  How do other people handle this?
>
> Is Zope the only WSGI application that performs authentication
> itself?

I think Zope is the only WSGI application that cares about communicating 
this information back to the web server's logs.  :)  Or at least, the only 
one whose author has said so.  :)

Perhaps an X-Authenticated-User: foo header could be added in a future 
spec version?  (And as an optional feature in the current PEP.)  This seems 
a simpler way to incorporate the feature than adding an extension API to 
environ.



Re: [Web-SIG] Communicating authenticated user information

2006-01-22 Thread Jim Fulton
Phillip J. Eby wrote:
> At 11:22 AM 1/22/2006 -0500, Jim Fulton wrote:
>
>> Typically, web servers provide access logs that include a label
>> for the authenticated user.
>>
>> Often, WSGI applications (or middleware) provide their own user
>> authentication facilities.  Well, Zope does. :)
>>
>> There doesn't seem to be a standard way for WSGI applications or
>> middleware to communicate the information necessary for a server
>> to log the authenticated user back to the server.
>>
>> Am I missing something?  How do other people handle this?
>>
>> Is Zope the only WSGI application that performs authentication
>> itself?
>
> I think Zope is the only WSGI application that cares about communicating
> this information back to the web server's logs.  :)

I hope that's not true.  Certainly, if anyone else is doing authentication
in their applications or middleware, they *should* care about getting
information into the access logs.

> Or at least, the
> only one whose author has said so.  :)

Please, someone else speak up. :)


> Perhaps an X-Authenticated-User: foo header could be added in a future
> spec version?  (And as an optional feature in the current PEP.)

Perhaps. Note that it should be clear that this is solely for use
in the access log.  There should be no assumption that this is
a principal id or a login name.  It is really just a label for the
log.  To make this clearer, I'd use something like:
X-Access-User-Label: foo.

> This
> seems a simpler way to incorporate the feature than adding an extension
> API to environ.

Why is that?  Isn't the env meant for communication between the WSGI
layers?  I'm not sure I'd want to send this information back to the browser.

Jim

-- 
Jim Fulton           mailto:[EMAIL PROTECTED]       Python Powered!
CTO                  (540) 361-1714                 http://www.python.org
Zope Corporation     http://www.zope.com            http://www.zope.org


Re: [Web-SIG] Communicating authenticated user information

2006-01-22 Thread Alan Kennedy
[Jim Fulton]
 Is Zope the only WSGI application that performs authentication
 itself?

[Phillip J. Eby]
 I think Zope is the only WSGI application that cares about
  communicating this information back to the web server's logs.  :)

[Jim Fulton]
  I hope that's not true.  Certainly, if anyone else is doing
  authentication in their applications or middleware, they
  *should* care about getting information into the access logs.

Well, Apache records auth info in logs as well, and it seems like a 
perfectly reasonable thing for a server to do.

http://httpd.apache.org/docs/2.0/logs.html#accesslog

[Phillip J. Eby]
  Perhaps an X-Authenticated-User: foo header could be added
  in a future spec version?  (And as an optional feature in the
  current PEP.)

[Jim Fulton]
  Perhaps. Note that it should be clear that this is solely for use
  in the access log.  There should be no assumption that this is
  a principal id or a login name.  It is really just a label for the
  log.  To make this clearer, I'd use something like:
  X-Access-User-Label: foo.

Sending X-headers seems hacky, and results in unnecessary information 
being transmitted back to the user (possibly revealing sensitive 
information, or opening security holes?)

I think that the communication mechanism for auth information is 
possibly best served by a simple convention between auth middleware 
authors. Perhaps servers that are aware that auth middleware is in use 
can put a callable into the WSGI environment, which auth middleware 
calls when it has auth'ed the user?

[Phillip J. Eby]
  This seems a simpler way to incorporate the feature than adding
  an extension API to environ.

[Jim Fulton]
  Why is that?  Isn't the env meant for communication between
  the WSGI layers?  I'm not sure I'd want to send this information
  back to the browser.

I think an API could be very simple, and optional for servers that know 
they won't be logging auth information.

I agree about not sending this information back to the user: it's 
unnecessary and potentially dangerous.

Regards,

Alan Kennedy.


Re: [Web-SIG] Communicating authenticated user information

2006-01-22 Thread Jim Fulton
Phillip J. Eby wrote:
> At 05:45 PM 1/22/2006 +, Alan Kennedy wrote:
>
>> I agree about not sending this information back to the user: it's
>> unnecessary and potentially dangerous.
>
> Yep, it would be really dangerous to let me know who I just logged in to an
> application as.  I might find out who I really am! ;)

The point is that there's really no reason to send this to the client.
It is certainly conceivable that some app could consider this
information sensitive.

Jim

-- 
Jim Fulton           mailto:[EMAIL PROTECTED]       Python Powered!
CTO                  (540) 361-1714                 http://www.python.org
Zope Corporation     http://www.zope.com            http://www.zope.org


Re: [Web-SIG] Communicating authenticated user information

2006-01-22 Thread Alan Kennedy
[Alan Kennedy]
 I agree about not sending this information back to the user: it's
 unnecessary and potentially dangerous.

[Phillip J. Eby]
 Yep, it would be really dangerous to let me know who I just logged in to 
 an application as.  I might find out who I really am! ;)

Very droll ;-)

What if other information, such as meta-information about the auth 
directory or database in which the credentials were looked up, was also 
communicated through X-headers, e.g. server connection details, etc.

Happy for that to go back to the user too?

If X-headers are to be used in WSGI, I think there should be something 
in the spec about whether or not they should be transmitted to the user.

Alan.


Re: [Web-SIG] Communicating authenticated user information

2006-01-22 Thread Jim Fulton
Ian Bicking wrote:
> Jim Fulton wrote:
>
>> Typically, web servers provide access logs that include a label
>> for the authenticated user.
>>
>> Often, WSGI applications (or middleware) provide their own user
>> authentication facilities.  Well, Zope does. :)
>>
>> There doesn't seem to be a standard way for WSGI applications or
>> middleware to communicate the information necessary for a server
>> to log the authenticated user back to the server.
>>
>> Am I missing something?  How do other people handle this?
>>
>> Is Zope the only WSGI application that performs authentication
>> itself?
>
> I do the authentication in my apps,

Cool.

> but I am sloppy and do not record it
> ;)  Well, that's not completely true.  In the rough access logger in
> Paste (http://pythonpaste.org/paste/translogger.py.html?f=8l=80#8) I
> include environ['REMOTE_USER'] if it is present.   So if the WSGI environ
> that the middleware sees initially is the same environ that the
> authenticator writes to, then the middleware will see that change on
> the way out and include it.  Using a header would solve the problem
> where the environment is completely changed (unlikely), or copied before
> REMOTE_USER is assigned (fairly likely).
>
> I can imagine a convention of X-WSGI-Authenticated, where X-WSGI-* gets
> stripped by the server,

Works for me.

> and any middleware that is interested can watch
> for these headers.  Another option is a callback, but potentially
> multiple middlewares will be interested (multiple logs isn't hard to
> imagine), and that complicates the callback.

I think just scribbling a value into the env or headers is fine.
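
The X-WSGI-* convention floated above might look something like this
sketch. The class name ``StripInternalHeaders`` is hypothetical, and
nothing here is specified by PEP 333; it only shows an outermost layer
that notes such headers and keeps them from reaching the client.

```python
# Sketch only: the X-WSGI-* prefix is a proposed convention, not a standard.
class StripInternalHeaders:
    """Outermost layer: records X-WSGI-* headers and keeps them off the wire."""

    PREFIX = "x-wsgi-"

    def __init__(self, application):
        self.application = application
        self.seen = []

    def __call__(self, environ, start_response):
        def filtering_start_response(status, headers, exc_info=None):
            public = []
            for name, value in headers:
                if name.lower().startswith(self.PREFIX):
                    self.seen.append((name, value))  # e.g. feed an access log
                else:
                    public.append((name, value))
            return start_response(status, public, exc_info)

        return self.application(environ, filtering_start_response)


def app(environ, start_response):
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("X-WSGI-Authenticated", "foo"),
    ])
    return [b"hi"]


sent = []
wrapped = StripInternalHeaders(app)
wrapped({}, lambda status, headers, exc_info=None: sent.append(headers))
print(sent, wrapped.seen)
```

Any middleware between the application and this layer can watch for the
same headers in its own ``start_response`` wrapper, which is how several
interested loggers could coexist without a shared callback.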

Jim

-- 
Jim Fulton           mailto:[EMAIL PROTECTED]       Python Powered!
CTO                  (540) 361-1714                 http://www.python.org
Zope Corporation     http://www.zope.com            http://www.zope.org