[modwsgi] Re: Converting from modpython's PythonAccessHandler to modwsgi

Deron Meranda Mon, 15 Mar 2010 10:15:20 -0700

On Mar 12, 10:56 pm, Graham Dumpleton <[email protected]>
wrote:
> On 13 March 2010 07:06, Deron Meranda <[email protected]> wrote:
> > In particular I use the PythonAccessHandler hook to do all kinds
> > of authentication/authorization stuff.
>
> Hmmm, one shouldn't be using the access handler for authentication and
> authorization. That is what the authentication and authorization hooks
> are meant to be for. :-)


Yes.  However the authn/authz hooks only get invoked when one
also uses the Apache Require directive.  And that semantically
seems too mismatched for my environment, which has much more
complex access control rules.  I don't use any of the standard
HTTP authentication methods.

I'm aware that this means I'm stepping outside of the Apache
authn/authz model, but just slightly so.

The Access handler on the other hand is always called, and it
still happens early enough (before the content handler).  And
for the few content handlers that need it, I can still synthesize
(or simulate) normal access control by pushing values into
req.user for example.

For me, I just found it much more convenient to put everything
security-related (authentication and access control) into the
access handler; and it will either let requests pass on up to
the content handlers, or return 403's (or occasionally 503's).
And for the subset of content handlers that are written in
Python, it can also pass additional information up to them;
mostly in the form of Python objects.
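
A minimal sketch of an access handler of this kind, for illustration
only. The lookup_user helper, the X-Example-User header and the
EXAMPLE_ROLE key are hypothetical names, not taken from the real
setup, and the import fallback exists only so the sketch can be run
outside Apache:

```python
try:
    from mod_python import apache
    OK, FORBIDDEN = apache.OK, apache.HTTP_FORBIDDEN
except ImportError:
    # Fallback so the sketch runs outside Apache; values mirror Apache's.
    OK, FORBIDDEN = 0, 403


def lookup_user(req):
    """Hypothetical site-specific check; returns a user name or None."""
    return req.headers_in.get("X-Example-User")


def accesshandler(req):
    """Decide whether the request may continue to the content handler."""
    user = lookup_user(req)
    if user is None:
        # The request never reaches a content handler.
        return FORBIDDEN
    # Simulate normal access control for handlers that expect req.user.
    req.user = user
    # Hypothetical key; string values placed here are visible later on.
    req.subprocess_env["EXAMPLE_ROLE"] = "member"
    return OK
```

Such a function would be named in a PythonAccessHandler directive; the
policy itself lives entirely in lookup_user and whatever replaces it.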

Oh, I should mention that I've also used stacked handlers
in a few cases (listing multiple python handlers in the
Python*Handler directives).  I would assume that I definitely
can't do that, for the same apache phase anyway, once I
start mixing mod_python and mod_wsgi.

[Technically I guess there is the Apache handler chain,
and the mod_python handler stack (each with slightly
different semantics in terms of handling DEFER, etc.).
So would it be possible for a mod_python handler (stack) and
a mod_wsgi handler to be on the same Apache handler chain?
And if so in which order would they be executed?  Not
that I think I need to do that, but I'm curious.]


I also in a few cases have used mod_python's multiple
interpreter feature; where I wanted an extra level of
separation to help prevent leakage of information between
python environments.  That though is not strictly necessary,
so I could easily enough transition to using a single
python interpreter if needed.

Even if mod_python and mod_wsgi can somewhat coexist (with the
caveat that mod_python gets started first so that mod_wsgi
doesn't get to do some initialization) --- are there any
issues if you use multiple python interpreters?  Though I
take it that even the Python folks (Guido) aren't too keen
on keeping the multiple interpreter support any more.


> In mod_wsgi, so long as using mod_wsgi 3.X, you can use thread local
> storage to preserve data between phases, albeit that the WSGI
> application must be running in embedded mode and in the same
> interpreter. The first phase in a request would always need to make
> sure it cleared out any data hanging over from a prior request.

Makes sense.  Though I currently don't use any threading
model, because as you've mentioned it has been arguably broken
in mod_python (at least as of the time in history when I wrote
most of my framework); and also threading causes some grief in
some other dependencies of mine.  So I've been happy enough with
the multi-process model.  And the few cases where I had to have
thread support (e.g., Xapian), I've managed to use a process
proxy approach to get those things outside of the Apache
processes -- which isn't necessarily a bad thing, because then
it lets me use SELinux to compartmentalize things even more.

But yes, I had assumed that the embedded mode would be needed
if python objects were to be shared directly across phases.


> You can access the request object directly and put data in
> req.subprocess_env and req.notes if you use apswigpy (SWIG bindings
> for Apache). Stuffing values in req.subprocess_env means they will
> show up in WSGI environ dictionary of subsequent phase and will also
> make it across to WSGI application itself if run in daemon mode.
>
> Rather than use apswigpy, you could always write a custom Python
> extension module to allow you to stuff values in same.

Great, I already do use req.subprocess_env for a little bit now;
though most of my complex objects just get stuffed into the req
object directly as additional non-standard members.  I assume I
don't need to touch the apswigpy to continue to use the
subprocess_env, do I?

If the subprocess_env does end up in the WSGI environment, then
that seems like the safest and most future-proof approach to
transitioning my content handlers to mod_wsgi.  Though I can't
pass "live" objects so I'll have to serialize everything.  That
should be straightforward enough, but perhaps not quite as efficient.
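
One possible way to do that serialization, sketched under the
assumption that the data can be reduced to plain dicts, lists and
strings. The EXAMPLE_AUTH_CONTEXT name and both helper functions are
hypothetical. Base64 is used only to keep the value to plain ASCII:

```python
import base64
import json

ENV_KEY = "EXAMPLE_AUTH_CONTEXT"  # hypothetical variable name


def encode_context(context):
    """Serialize a plain dict into an ASCII string for subprocess_env."""
    raw = json.dumps(context, sort_keys=True).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_context(environ):
    """Recover the dict inside the WSGI application; {} if absent."""
    value = environ.get(ENV_KEY)
    if not value:
        return {}
    return json.loads(base64.b64decode(value).decode("utf-8"))
```

The access handler would do something like
req.subprocess_env[ENV_KEY] = encode_context({...}), and the WSGI
application would call decode_context(environ). Only data survives the
trip, so anything "live" (database connections, open transactions) has
to be re-created on the other side.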

This will though mean that my new content handlers (in wsgi) won't
be using the exact same python objects as my access handler; so
some extra things may not work.  For example my access handler
does a bit of extra safety checking, such as making sure that
there aren't any database transactions that are accidentally
left in-progress and span more than one HTTP request.  The
access handler also chooses which database account(s) to use for
each HTTP request (thus also allowing the database to enforce
its own internal access control rules too).  Basically my
access handler makes most of the security decisions, and the
content handler just focuses on the content.

That kind of logic though could be moved up into the content
handler phase, with some additional framework. Though it
also means that the remainder of my access control logic and
my content logic will necessarily have to each operate within
separate database transaction scopes since they'll no longer
be able to share the same python interpreter.


> By mixing mod_python and mod_wsgi you do lose some configuration
> control over mod_wsgi because mod_python will hijack the Python
> interpreter initialisation and so mod_wsgi can't do any setup which has
> to be done before that initialisation.

Are the specifics of this interaction documented anywhere?


Also, this may be more of a question for the mod_python list,
but since the development of mod_python is pretty much in a
stable leave-it-alone mode; do you foresee anything coming
where mod_wsgi and mod_python diverge so much that they will not
be coexistent in the future (or that mod_python could be
dropped entirely?)  Or what about Python 3.x, as I'm still
running with Python 2.x?

Thanks again,
Deron Meranda
