get_session(), req.session, req.form and MODPYTHON-38
That get_session() was being added at the C code level was also one of the things that worried me about this change. Thus in part why my suggestion for an alternate approach didn't touch the request object at all. While we are talking about this issue of additions to the request object, the documentation says: The request object is a Python mapping to the Apache request_rec structure. When a handler is invoked, it is always passed a single argument - the request object. You can dynamically assign attributes to it as a way to communicate between handlers. The bit I want to highlight is the last line about handlers dynamically assigning attributes to the request object. Being able to do this is quite useful and is already used within mod_python.publisher. Specifically, it assigns the instance of the FieldStorage object to req.form. This can then be accessed by the published function. As I describe in: https://issues.apache.org/jira/browse/MODPYTHON-38 I want to formalise a couple of current conventions regarding attributes being assigned to the request object by handlers. I have mentioned req.form. The other I want to formalise is req.session. Thus I want a documented convention that if a handler is going to use util.FieldStorage, that it should before doing so, first check whether an existing instance resides as req.form and use that instead. Similarly, if a handler is going to create a Session object, that it look for an existing instance as req.session and again use that instead. Both mod_python.psp and mod_python.publisher would be modified to follow the convention, which would then avoid the problems which sometimes come up when people try and use the two together. Ie., both trying to parse form arguments, or both trying to create session objects and locking each other out. Is there support for doing this? If so I'll up it on my priority list. Note, this isn't addressing some of what the get_session() changes were intended to address, specifically issues of internal redirects, but I think it is a good start to at least address this, with any final solution building around this convention. Graham On 13/03/2006, at 2:39 PM, Gregory (Grisha) Trubetskoy wrote: I'm -1 on get_session() too. The request object is supposed to be a representation of Apache's request, and get_session() just does not belong there. Grisha On Sun, 12 Mar 2006, Jim Gallacher wrote: I handn't really intended to start working on an implementation. I just don't like seeing all those issues in JIRA marked as unassigned. :) Since I created it I figured I should take some responsibility for it. Plus, it's a gentle reminder when I list my assigned issues - resolve it one way or another. I still think we need some sort of solution to the problem of people trying to create 2 session instances in the same request, but I agree that the original concept of req.get_session() was not quite right. Jim Graham Dumpleton wrote: I would rather we not go ahead with adding req.get_session() at this time. At least not how it was envisaged to be done previously. I'll come back with a bit of analysis after I review where we were up to previously. Graham On 12/03/2006, at 8:47 AM, Jim Gallacher (JIRA) wrote: [ http://issues.apache.org/jira/browse/MODPYTHON-59?page=all ] Jim Gallacher reassigned MODPYTHON-59: -- Assign To: Jim Gallacher Add get_session() method to request object -- Key: MODPYTHON-59 URL: http://issues.apache.org/jira/browse/MODPYTHON-59 Project: mod_python Type: New Feature Components: session Versions: 3.1.4, 3.1.3, 3.2.7 Environment: All Reporter: Jim Gallacher Assignee: Jim Gallacher Fix For: 3.3 Attachments: Session.py.diff.txt Users will get session instances by calling req.get_session(). If a session already exists it will be returned, otherwise a new session instance will be created. Session configuration will be handled using apache directives rather than within their code. Using this scheme means only one session instance will be created per request, which will eliminate the deadlock problems many people experience. Also, using this scheme makes it possible for sessions to be properly handled within psp pages and across req.internal_redirect() calls. Code will be commited to svn shortly. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: get_session(), req.session, req.form and MODPYTHON-38
On Mon, 13 Mar 2006, Graham Dumpleton wrote: Thus I want a documented convention that if a handler is going to use util.FieldStorage, that it should before doing so, first check whether an existing instance resides as req.form and use that instead. I'm not sure if this is a good example - req.form is something specific to the publisher. Rather than perhaps documenting it as you suggest, util.FieldStorage can take it upon itself to create a req.form so that subsequent attempts to instantiate it just return req.form. (This is just an example, I'm not 100% sure that I having FS do this makes sense - seems like a good idea). Similarly, if a handler is going to create a Session object, that it look for an existing instance as req.session and again use that instead. OR, the Session module would know to look for a req.session, in which case the handlers wouldn't need to worry about it. (One thing to watch out for would be that mutliple concurrent sessions in the same request is a possibility) Grisha
Re: get_session(), req.session, req.form and MODPYTHON-38
Grisha wrote .. On Mon, 13 Mar 2006, Graham Dumpleton wrote: Thus I want a documented convention that if a handler is going to use util.FieldStorage, that it should before doing so, first check whether an existing instance resides as req.form and use that instead. I'm not sure if this is a good example - req.form is something specific to the publisher. Rather than perhaps documenting it as you suggest, util.FieldStorage can take it upon itself to create a req.form so that subsequent attempts to instantiate it just return req.form. (This is just an example, I'm not 100% sure that I having FS do this makes sense - seems like a good idea). Similarly, if a handler is going to create a Session object, that it look for an existing instance as req.session and again use that instead. OR, the Session module would know to look for a req.session, in which case the handlers wouldn't need to worry about it. (One thing to watch out for would be that mutliple concurrent sessions in the same request is a possibility) Hmmm, having a look at the code, at some point the check for req.session got added and I didn't realise or forgot that it had been done. # does this code use session? session = None if session in code.co_names: if hasattr(req, 'session'): session = req.session else: session = Session.Session(req) It didn't get added for form though, which means that accessing form arguments from within a PSP page will mean only those in the query string of the URL will be available as the content of the request has already been consumed. Looks like a audit of both: https://issues.apache.org/jira/browse/MODPYTHON-38 https://issues.apache.org/jira/browse/MODPYTHON-59 need to be done to work out what has and hasn't been done related to this so we know where we are up to. Looks a bit like when the req.get_session() changes got rolled back that it got introduced at that point: http://svn.apache.org/viewcvs.cgi//httpd/mod_python/trunk/lib/python/mod_python/psp.py?rev=226320view=diffr1=226320r2=226319p1=/httpd/mod_python/trunk/lib/python/mod_python/psp.pyp2=/httpd/mod_python/trunk/lib/python/mod_python/psp.py Before the req.get_session() change it didn't exist: http://svn.apache.org/viewcvs.cgi//httpd/mod_python/trunk/lib/python/mod_python/psp.py?rev=191745view=diffr1=191745r2=191744p1=/httpd/mod_python/trunk/lib/python/mod_python/psp.pyp2=/httpd/mod_python/trunk/lib/python/mod_python/psp.py I can't remember if this was a conscious decision to check for req.session based on suggestions or otherwise. Graham
Re: get_session(), req.session, req.form and MODPYTHON-38
Graham Dumpleton wrote: Grisha wrote .. On Mon, 13 Mar 2006, Graham Dumpleton wrote: Thus I want a documented convention that if a handler is going to use util.FieldStorage, that it should before doing so, first check whether an existing instance resides as req.form and use that instead. I'm not sure if this is a good example - req.form is something specific to the publisher. Rather than perhaps documenting it as you suggest, util.FieldStorage can take it upon itself to create a req.form so that subsequent attempts to instantiate it just return req.form. (This is just an example, I'm not 100% sure that I having FS do this makes sense - seems like a good idea). Similarly, if a handler is going to create a Session object, that it look for an existing instance as req.session and again use that instead. OR, the Session module would know to look for a req.session, in which case the handlers wouldn't need to worry about it. (One thing to watch out for would be that mutliple concurrent sessions in the same request is a possibility) The problem is that you are still depending on a naming convention which requires 2 things from users: they read the docs and they adopt the convention. Both are losing propostions, IMHO. Heck, *I* don't read the docs. :) The idea of something like req.get_session() is to give users an obvious way to grab a session object without the deadlock concerns. How many times have we seen this kind of problem-code on the mailing list? def index(req): sess = Session.Session(req) do_stuff(req) ... def do_stuff(req): sess = Session.Session(req) ... do other stuff. Having the session constructor check for the existence of req.session is of no use here. We need a way to make sure only *one* session instance is created per request. (Bonus marks for making it work with internal redirect). Hmmm, having a look at the code, at some point the check for req.session got added and I didn't realise or forgot that it had been done. # does this code use session? session = None if session in code.co_names: if hasattr(req, 'session'): session = req.session else: session = Session.Session(req) It didn't get added for form though, which means that accessing form arguments from within a PSP page will mean only those in the query string of the URL will be available as the content of the request has already been consumed. It didn't get added for form handling because I was fixing the session code. ;) To me the session problem is the more serious of the two. Mess up your form code and you have to rewrite your code. This is inconvenient. Mess up your session code and you deadlock your server. This is very bad. Looks like a audit of both: https://issues.apache.org/jira/browse/MODPYTHON-38 https://issues.apache.org/jira/browse/MODPYTHON-59 need to be done to work out what has and hasn't been done related to this so we know where we are up to. Looks a bit like when the req.get_session() changes got rolled back that it got introduced at that point: http://svn.apache.org/viewcvs.cgi//httpd/mod_python/trunk/lib/python/mod_python/psp.py?rev=226320view=diffr1=226320r2=226319p1=/httpd/mod_python/trunk/lib/python/mod_python/psp.pyp2=/httpd/mod_python/trunk/lib/python/mod_python/psp.py Before the req.get_session() change it didn't exist: http://svn.apache.org/viewcvs.cgi//httpd/mod_python/trunk/lib/python/mod_python/psp.py?rev=191745view=diffr1=191745r2=191744p1=/httpd/mod_python/trunk/lib/python/mod_python/psp.pyp2=/httpd/mod_python/trunk/lib/python/mod_python/psp.py I can't remember if this was a conscious decision to check for req.session based on suggestions or otherwise. Graham Although the psp.py code in question was put in when I was messing around with get_session, it is not an artifact of get_session being incompletely rolled back. Ignoring the whole get_session stuff for a moment, the diff would look like this: Index: psp.py === --- psp.py (revision 164956) +++ psp.py (revision 226320) @@ -190,7 +190,10 @@ # does this code use session? session = None if session in code.co_names: -session = Session.Session(req) +if hasattr(req, 'session'): +session = req.session +else: +session = Session.Session(req) This new code helps to avoid the deadlock problem if the user had the foresight to stuff their session object into req. The biggest problem with this change is not it's validity, but the fact that it was not documented. Anyway, the get_session code in requestobject.c was half-baked and should not have been checked into trunk. Furthermore, when it was decided to defer this issue to 3.3, I should have expunged it completely rather than leaving it as a stub. I guess at the time I
Re: get_session(), req.session, req.form and MODPYTHON-38
Jim Gallacher wrote .. The idea of something like req.get_session() is to give users an obvious way to grab a session object without the deadlock concerns. How many times have we seen this kind of problem-code on the mailing list? def index(req): sess = Session.Session(req) do_stuff(req) ... def do_stuff(req): sess = Session.Session(req) ... do other stuff. Having the session constructor check for the existence of req.session is of no use here. We need a way to make sure only *one* session instance is created per request. (Bonus marks for making it work with internal redirect). FWIW, I use the following in a class based authenhandler. thread = threading.currentThread() if self.__cache.has_key(thread): req.session = self.__cache[thread] else: req.session = Session.Session(req) self.__cache[thread] = req.session def cleanup(data): del self.__cache[thread] req.register_cleanup(cleanup) In short, store it in a cache outside of the request object keys by the thread ID. Works for both internal redirects and also for fast redirects as used by DirectoryIndex matching algorithm, although for the latter there are other issues as I have documented in MODPYTHON-146. My thinking keeps changing a bit, but overall I have been leaning towards not having a get_session() like function explicitly provided and instead documenting how to correctly write a authenhandler which can handle form based login with sessions. Ie., use Apache phases properly rather than pushing authentication into content handler phase as most do. Unfortunately, I keep finding things in mod_python that need to be improved or fixed to avoiding having to fudge things, thus haven't presented my code for others to look at yet. :-( Graham
Re: get_session(), req.session, req.form and MODPYTHON-38
Graham Dumpleton wrote: Jim Gallacher wrote .. The idea of something like req.get_session() is to give users an obvious way to grab a session object without the deadlock concerns. How many times have we seen this kind of problem-code on the mailing list? def index(req): sess = Session.Session(req) do_stuff(req) ... def do_stuff(req): sess = Session.Session(req) ... do other stuff. Having the session constructor check for the existence of req.session is of no use here. We need a way to make sure only *one* session instance is created per request. (Bonus marks for making it work with internal redirect). FWIW, I use the following in a class based authenhandler. thread = threading.currentThread() if self.__cache.has_key(thread): req.session = self.__cache[thread] else: req.session = Session.Session(req) self.__cache[thread] = req.session def cleanup(data): del self.__cache[thread] req.register_cleanup(cleanup) In short, store it in a cache outside of the request object keys by the thread ID. Works for both internal redirects and also for fast redirects as used by DirectoryIndex matching algorithm, although for the latter there are other issues as I have documented in MODPYTHON-146. My thinking keeps changing a bit, but overall I have been leaning towards not having a get_session() like function explicitly provided and instead documenting how to correctly write a authenhandler which can handle form based login with sessions. Ie., use Apache phases properly rather than pushing authentication into content handler phase as most do. Which is all good, but you are assuming that people are only using sessions for authentication purposes. Consider a shopping cart implemented as session: the user may not be authenticated until *after* they have filled their cart and are ready to checkout. Perhaps the cache code would be better off in Session.py? Jim
Re: get_session(), req.session, req.form and MODPYTHON-38
Jim Gallacher wrote .. Which is all good, but you are assuming that people are only using sessions for authentication purposes. Consider a shopping cart implemented as session: the user may not be authenticated until *after* they have filled their cart and are ready to checkout. Perhaps the cache code would be better off in Session.py? In the case where a session is required across both public and private areas of a web site where login is required for the private area, the example I am working on handles that. I am still working out what presents as being the more sensible way, but Apache configuration would be something like: AuthType Session::Private Require valid-user PythonAuthenHandler _sessionmanager Files login.html AuthType Session::Public /Files or: AuthType Session AuthName Private Area Require valid-user PythonAuthenHandler _sessionmanager Files login.html AuthName Public Area /Files The benefit of using an authenhandler in this case is that it doesn't particularly matter what is being used for the content handler phase as longs as access to resources can be controlled based on filename matched by Apache for the URL, or possibly by the URL itself. Thus, you could be using publisher or PSP, or even a mix. Whatever is used, it just uses the req.session object created for it by the authenhandler. Sure the code might not handle every single possible use case, but its purpose was an example, not something to be embodied into any package. Thus, it could by all means be customised. The important thing is illustrates how to do all the hard stuff that people don't generally get correct. Graham
Re: get_session(), req.session, req.form and MODPYTHON-38
On Mon, 13 Mar 2006, Jim Gallacher wrote: The idea of something like req.get_session() is to give users an obvious way to grab a session object without the deadlock concerns. How many times have we seen this kind of problem-code on the mailing list? def index(req): sess = Session.Session(req) do_stuff(req) ... def do_stuff(req): sess = Session.Session(req) ... do other stuff. Having the session constructor check for the existence of req.session is of no use here. We need a way to make sure only *one* session instance is created per request. (Bonus marks for making it work with internal redirect). [sorry, i only read the beginning of the message, so i might be not fully understanding] Session.Session is not a constructor, just a function. But also, if it were, I think this can be solved with the new object's __new__() ? Grisha
Re: get_session(), req.session, req.form and MODPYTHON-38
Gregory (Grisha) Trubetskoy wrote: On Mon, 13 Mar 2006, Jim Gallacher wrote: The idea of something like req.get_session() is to give users an obvious way to grab a session object without the deadlock concerns. How many times have we seen this kind of problem-code on the mailing list? def index(req): sess = Session.Session(req) do_stuff(req) ... def do_stuff(req): sess = Session.Session(req) ... do other stuff. Having the session constructor check for the existence of req.session is of no use here. We need a way to make sure only *one* session instance is created per request. (Bonus marks for making it work with internal redirect). [sorry, i only read the beginning of the message, so i might be not fully understanding] Session.Session is not a constructor, just a function. But also, if it were, I think this can be solved with the new object's __new__() ? You're right, I misspoke. It is a function, but it does return a new session instance so there is a constructor in there somewhere. ;) Jim