get_session(), req.session, req.form and MODPYTHON-38

2006-03-13 Thread Graham Dumpleton

That get_session() was being added at the C code level was also one of
the things that worried me about this change. Thus in part why my
suggestion for an alternate approach didn't touch the request object
at all.

While we are talking about this issue of additions to the request 
object,

the documentation says:

  The request object is a Python mapping to the Apache request_rec 
structure.
  When a handler is invoked, it is always passed a single argument - 
the request object.


  You can dynamically assign attributes to it as a way to communicate 
between handlers.


The bit I want to highlight is the last line about handlers dynamically
assigning attributes to the request object.

Being able to do this is quite useful and is already used within
mod_python.publisher. Specifically, it assigns the instance of the
FieldStorage object to req.form. This can then be accessed by the
published function.

As I describe in:

  https://issues.apache.org/jira/browse/MODPYTHON-38

I want to formalise a couple of current conventions regarding attributes
being assigned to the request object by handlers. I have mentioned 
req.form.

The other I want to formalise is req.session.

Thus I want a documented convention that if a handler is going to use
util.FieldStorage, that it should before doing so, first check whether
an existing instance resides as req.form and use that instead.

Similarly, if a handler is going to create a Session object, that it
look for an existing instance as req.session and again use that instead.

Both mod_python.psp and mod_python.publisher would be modified to follow
the convention, which would then avoid the problems which sometimes
come up when people try and use the two together. Ie., both trying to
parse form arguments, or both trying to create session objects and
locking each other out.

Is there support for doing this? If so I'll up it on my priority list.

Note, this isn't addressing some of what the get_session() changes
were intended to address, specifically issues of internal redirects,
but I think it is a good start to at least address this, with any
final solution building around this convention.

Graham

On 13/03/2006, at 2:39 PM, Gregory (Grisha) Trubetskoy wrote:



I'm -1 on get_session() too. The request object is supposed to be a 
representation of Apache's request, and get_session() just does not 
belong there.


Grisha

On Sun, 12 Mar 2006, Jim Gallacher wrote:

I handn't really intended to start working on an implementation. I 
just don't like seeing all those issues in JIRA marked as unassigned. 
:) Since I created it I figured I should take some responsibility for 
it. Plus, it's a gentle reminder when I list my assigned issues - 
resolve it one way or another.


I still think we need some sort of solution to the problem of people 
trying to create 2 session instances in the same request, but I agree 
that the original concept of req.get_session() was not quite right.


Jim

Graham Dumpleton wrote:

I would rather we not go ahead with adding req.get_session() at
this time. At least not how it was envisaged to be done previously.
I'll come back with a bit of analysis after I review where we were
up to previously.
Graham
On 12/03/2006, at 8:47 AM, Jim Gallacher (JIRA) wrote:

 [ http://issues.apache.org/jira/browse/MODPYTHON-59?page=all ]
Jim Gallacher reassigned MODPYTHON-59:
--
Assign To: Jim Gallacher

Add get_session() method to request object
--
 Key: MODPYTHON-59
 URL: http://issues.apache.org/jira/browse/MODPYTHON-59
 Project: mod_python
Type: New Feature
  Components: session
Versions: 3.1.4, 3.1.3, 3.2.7
 Environment: All
Reporter: Jim Gallacher
Assignee: Jim Gallacher
 Fix For: 3.3
 Attachments: Session.py.diff.txt
Users will get session instances by calling req.get_session(). If 
a session already exists it will be returned, otherwise a new 
session instance will be created. Session configuration will be 
handled using apache directives rather than within their code.
Using this scheme means only one session instance will be created 
per request, which will eliminate the deadlock problems many 
people experience. Also, using this scheme makes it possible for 
sessions to be properly handled within psp pages and across 
req.internal_redirect() calls.

Code will be commited to svn shortly.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the 
administrators:

   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira






Re: get_session(), req.session, req.form and MODPYTHON-38

2006-03-13 Thread Gregory (Grisha) Trubetskoy


On Mon, 13 Mar 2006, Graham Dumpleton wrote:


Thus I want a documented convention that if a handler is going to use
util.FieldStorage, that it should before doing so, first check whether
an existing instance resides as req.form and use that instead.


I'm not sure if this is a good example - req.form is something specific to 
the publisher. Rather than perhaps documenting it as you suggest, 
util.FieldStorage can take it upon itself to create a req.form so that 
subsequent attempts to instantiate it just return req.form. (This is just 
an example, I'm not 100% sure that I having FS do this makes sense - seems 
like a good idea).



Similarly, if a handler is going to create a Session object, that it
look for an existing instance as req.session and again use that instead.


OR, the Session module would know to look for a req.session, in which case 
the handlers wouldn't need to worry about it.


(One thing to watch out for would be that mutliple concurrent sessions in 
the same request is a possibility)


Grisha


Re: get_session(), req.session, req.form and MODPYTHON-38

2006-03-13 Thread Graham Dumpleton
Grisha wrote ..
 
 On Mon, 13 Mar 2006, Graham Dumpleton wrote:
 
  Thus I want a documented convention that if a handler is going to use
  util.FieldStorage, that it should before doing so, first check whether
  an existing instance resides as req.form and use that instead.
 
 I'm not sure if this is a good example - req.form is something specific
 to 
 the publisher. Rather than perhaps documenting it as you suggest, 
 util.FieldStorage can take it upon itself to create a req.form so that
 subsequent attempts to instantiate it just return req.form. (This is just
 an example, I'm not 100% sure that I having FS do this makes sense - seems
 like a good idea).
 
  Similarly, if a handler is going to create a Session object, that it
  look for an existing instance as req.session and again use that instead.
 
 OR, the Session module would know to look for a req.session, in which case
 the handlers wouldn't need to worry about it.
 
 (One thing to watch out for would be that mutliple concurrent sessions
 in the same request is a possibility)

Hmmm, having a look at the code, at some point the check for req.session
got added and I didn't realise or forgot that it had been done.

# does this code use session?
session = None
if session in code.co_names:
if hasattr(req, 'session'):
session = req.session
else:
session = Session.Session(req)

It didn't get added for form though, which means that accessing form
arguments from within a PSP page will mean only those in the query
string of the URL will be available as the content of the request has
already been consumed.

Looks like a audit of both:

  https://issues.apache.org/jira/browse/MODPYTHON-38
  https://issues.apache.org/jira/browse/MODPYTHON-59

need to be done to work out what has and hasn't been done related
to this so we know where we are up to.

Looks a bit like when the req.get_session() changes got rolled back that
it got introduced at that point:

  
http://svn.apache.org/viewcvs.cgi//httpd/mod_python/trunk/lib/python/mod_python/psp.py?rev=226320view=diffr1=226320r2=226319p1=/httpd/mod_python/trunk/lib/python/mod_python/psp.pyp2=/httpd/mod_python/trunk/lib/python/mod_python/psp.py

Before the req.get_session() change it didn't exist:

  
http://svn.apache.org/viewcvs.cgi//httpd/mod_python/trunk/lib/python/mod_python/psp.py?rev=191745view=diffr1=191745r2=191744p1=/httpd/mod_python/trunk/lib/python/mod_python/psp.pyp2=/httpd/mod_python/trunk/lib/python/mod_python/psp.py

I can't remember if this was a conscious decision to check for req.session
based on suggestions or otherwise.

Graham


Re: get_session(), req.session, req.form and MODPYTHON-38

2006-03-13 Thread Jim Gallacher

Graham Dumpleton wrote:

Grisha wrote ..


On Mon, 13 Mar 2006, Graham Dumpleton wrote:



Thus I want a documented convention that if a handler is going to use
util.FieldStorage, that it should before doing so, first check whether
an existing instance resides as req.form and use that instead.


I'm not sure if this is a good example - req.form is something specific
to 
the publisher. Rather than perhaps documenting it as you suggest, 
util.FieldStorage can take it upon itself to create a req.form so that

subsequent attempts to instantiate it just return req.form. (This is just
an example, I'm not 100% sure that I having FS do this makes sense - seems
like a good idea).



Similarly, if a handler is going to create a Session object, that it
look for an existing instance as req.session and again use that instead.


OR, the Session module would know to look for a req.session, in which case
the handlers wouldn't need to worry about it.

(One thing to watch out for would be that mutliple concurrent sessions
in the same request is a possibility)


The problem is that you are still depending on a naming convention which 
requires 2 things from users: they read the docs and they adopt the 
convention. Both are losing propostions, IMHO. Heck, *I* don't read the 
docs. :)


The idea of something like req.get_session() is to give users an obvious 
way to grab a session object without the deadlock concerns. How many 
times have we seen this kind of problem-code on the mailing list?


def index(req):
sess = Session.Session(req)
do_stuff(req)
...

def do_stuff(req):
sess = Session.Session(req)
... do other stuff.

Having the session constructor check for the existence of req.session is 
of no use here. We need a way to make sure only *one* session instance 
is created per request. (Bonus marks for making it work with internal 
redirect).




Hmmm, having a look at the code, at some point the check for req.session
got added and I didn't realise or forgot that it had been done.

# does this code use session?
session = None
if session in code.co_names:
if hasattr(req, 'session'):
session = req.session
else:
session = Session.Session(req)

It didn't get added for form though, which means that accessing form
arguments from within a PSP page will mean only those in the query
string of the URL will be available as the content of the request has
already been consumed.


It didn't get added for form handling because I was fixing the session 
code. ;)  To me the session problem is the more serious of the two. Mess 
up your form code and you have to rewrite your code. This is 
inconvenient. Mess up your session code and you deadlock your server. 
This is very bad.



Looks like a audit of both:

  https://issues.apache.org/jira/browse/MODPYTHON-38
  https://issues.apache.org/jira/browse/MODPYTHON-59

need to be done to work out what has and hasn't been done related
to this so we know where we are up to.

Looks a bit like when the req.get_session() changes got rolled back that
it got introduced at that point:

  
http://svn.apache.org/viewcvs.cgi//httpd/mod_python/trunk/lib/python/mod_python/psp.py?rev=226320view=diffr1=226320r2=226319p1=/httpd/mod_python/trunk/lib/python/mod_python/psp.pyp2=/httpd/mod_python/trunk/lib/python/mod_python/psp.py

Before the req.get_session() change it didn't exist:

  
http://svn.apache.org/viewcvs.cgi//httpd/mod_python/trunk/lib/python/mod_python/psp.py?rev=191745view=diffr1=191745r2=191744p1=/httpd/mod_python/trunk/lib/python/mod_python/psp.pyp2=/httpd/mod_python/trunk/lib/python/mod_python/psp.py

I can't remember if this was a conscious decision to check for req.session
based on suggestions or otherwise.

Graham


Although the psp.py code in question was put in when I was messing 
around with get_session, it is not an artifact of get_session being 
incompletely rolled back. Ignoring the whole get_session stuff for a 
moment, the diff would look like this:


Index: psp.py
===
--- psp.py  (revision 164956)
+++ psp.py  (revision 226320)
@@ -190,7 +190,10 @@
 # does this code use session?
 session = None
 if session in code.co_names:
-session = Session.Session(req)
+if hasattr(req, 'session'):
+session = req.session
+else:
+session = Session.Session(req)


This new code helps to avoid the deadlock problem if the user had the 
foresight to stuff their session object into req. The biggest problem 
with this change is not it's validity, but the fact that it was not 
documented.


Anyway, the get_session code in requestobject.c was half-baked and 
should not have been checked into trunk. Furthermore, when it was 
decided to defer this issue to 3.3, I should have expunged it completely 
rather than leaving it as a stub. I guess at the time I 

Re: get_session(), req.session, req.form and MODPYTHON-38

2006-03-13 Thread Graham Dumpleton
Jim Gallacher wrote ..
 The idea of something like req.get_session() is to give users an obvious
 way to grab a session object without the deadlock concerns. How many 
 times have we seen this kind of problem-code on the mailing list?
 
 def index(req):
  sess = Session.Session(req)
  do_stuff(req)
  ...
 
 def do_stuff(req):
  sess = Session.Session(req)
  ... do other stuff.
 
 Having the session constructor check for the existence of req.session is
 of no use here. We need a way to make sure only *one* session instance
 is created per request. (Bonus marks for making it work with internal 
 redirect).

FWIW, I use the following in a class based authenhandler.

thread = threading.currentThread()

if self.__cache.has_key(thread):
req.session = self.__cache[thread]
else:
req.session = Session.Session(req)

self.__cache[thread] = req.session
def cleanup(data): del self.__cache[thread]
req.register_cleanup(cleanup)

In short, store it in a cache outside of the request object keys by
the thread ID.

Works for both internal redirects and also for fast redirects as used by
DirectoryIndex matching algorithm, although for the latter there are
other issues as I have documented in MODPYTHON-146.

My thinking keeps changing a bit, but overall I have been leaning
towards not having a get_session() like function explicitly provided and
instead documenting how to correctly write a authenhandler which can
handle form based login with sessions. Ie., use Apache phases properly
rather than pushing authentication into content handler phase as most
do. Unfortunately, I keep finding things in mod_python that need to
be improved or fixed to avoiding having to fudge things, thus haven't
presented my code for others to look at yet. :-(

Graham


Re: get_session(), req.session, req.form and MODPYTHON-38

2006-03-13 Thread Jim Gallacher

Graham Dumpleton wrote:

Jim Gallacher wrote ..


The idea of something like req.get_session() is to give users an obvious
way to grab a session object without the deadlock concerns. How many 
times have we seen this kind of problem-code on the mailing list?


def index(req):
sess = Session.Session(req)
do_stuff(req)
...

def do_stuff(req):
sess = Session.Session(req)
... do other stuff.

Having the session constructor check for the existence of req.session is
of no use here. We need a way to make sure only *one* session instance
is created per request. (Bonus marks for making it work with internal 
redirect).



FWIW, I use the following in a class based authenhandler.

thread = threading.currentThread()

if self.__cache.has_key(thread):

req.session = self.__cache[thread]
else:
req.session = Session.Session(req)

self.__cache[thread] = req.session

def cleanup(data): del self.__cache[thread]
req.register_cleanup(cleanup)

In short, store it in a cache outside of the request object keys by
the thread ID.

Works for both internal redirects and also for fast redirects as used by
DirectoryIndex matching algorithm, although for the latter there are
other issues as I have documented in MODPYTHON-146.

My thinking keeps changing a bit, but overall I have been leaning
towards not having a get_session() like function explicitly provided and
instead documenting how to correctly write a authenhandler which can
handle form based login with sessions. Ie., use Apache phases properly
rather than pushing authentication into content handler phase as most
do. 


Which is all good, but you are assuming that people are only using 
sessions for authentication purposes. Consider a shopping cart 
implemented as  session: the user may not be authenticated until *after* 
they have filled their cart and are ready to checkout. Perhaps the cache 
code would be better off in Session.py?


Jim





Re: get_session(), req.session, req.form and MODPYTHON-38

2006-03-13 Thread Graham Dumpleton
Jim Gallacher wrote ..
 Which is all good, but you are assuming that people are only using 
 sessions for authentication purposes. Consider a shopping cart 
 implemented as  session: the user may not be authenticated until *after*
 they have filled their cart and are ready to checkout. Perhaps the cache
 code would be better off in Session.py?

In the case where a session is required across both public and private
areas of a web site where login is required for the private area, the
example I am working on handles that.

I am still working out what presents as being the more sensible way, but
Apache configuration would be something like:

  AuthType Session::Private
  Require valid-user
  PythonAuthenHandler _sessionmanager

  Files login.html
  AuthType Session::Public
  /Files

or:

  AuthType Session
  AuthName Private Area
  Require valid-user
  PythonAuthenHandler _sessionmanager

  Files login.html
  AuthName Public Area
  /Files

The benefit of using an authenhandler in this case is that it doesn't
particularly matter what is being used for the content handler phase as
longs as access to resources can be controlled based on filename matched
by Apache for the URL, or possibly by the URL itself. Thus, you could be
using publisher or PSP, or even a mix. Whatever is used, it just uses the
req.session object created for it by the authenhandler.

Sure the code might not handle every single possible use case, but its
purpose was an example, not something to be embodied into any package.
Thus, it could by all means be customised. The important thing is
illustrates how to do all the hard stuff that people don't generally get
correct.

Graham


Re: get_session(), req.session, req.form and MODPYTHON-38

2006-03-13 Thread Gregory (Grisha) Trubetskoy


On Mon, 13 Mar 2006, Jim Gallacher wrote:

The idea of something like req.get_session() is to give users an obvious way 
to grab a session object without the deadlock concerns. How many times have 
we seen this kind of problem-code on the mailing list?


def index(req):
   sess = Session.Session(req)
   do_stuff(req)
   ...

def do_stuff(req):
   sess = Session.Session(req)
   ... do other stuff.

Having the session constructor check for the existence of req.session is of 
no use here. We need a way to make sure only *one* session instance is 
created per request. (Bonus marks for making it work with internal redirect).


[sorry, i only read the beginning of the message, so i might be not fully 
understanding]


Session.Session is not a constructor, just a function. But also, if it 
were, I think this can be solved with the new object's __new__() ?


Grisha


Re: get_session(), req.session, req.form and MODPYTHON-38

2006-03-13 Thread Jim Gallacher

Gregory (Grisha) Trubetskoy wrote:


On Mon, 13 Mar 2006, Jim Gallacher wrote:

The idea of something like req.get_session() is to give users an 
obvious way to grab a session object without the deadlock concerns. 
How many times have we seen this kind of problem-code on the mailing 
list?


def index(req):
   sess = Session.Session(req)
   do_stuff(req)
   ...

def do_stuff(req):
   sess = Session.Session(req)
   ... do other stuff.

Having the session constructor check for the existence of req.session 
is of no use here. We need a way to make sure only *one* session 
instance is created per request. (Bonus marks for making it work with 
internal redirect).



[sorry, i only read the beginning of the message, so i might be not 
fully understanding]


Session.Session is not a constructor, just a function. But also, if it 
were, I think this can be solved with the new object's __new__() ?


You're right, I misspoke. It is a function, but it does return a new 
session instance so there is a constructor in there somewhere. ;)


Jim