[jira] Updated: (MODPYTHON-46) PythonHandlerModule directive is broken.
[ http://issues.apache.org/jira/browse/MODPYTHON-46?page=all ]

Graham Dumpleton updated MODPYTHON-46:
--
    Attachment: silent.diff.txt

This fixes the wrong logic problem in the definition of SILENT/NOTSILENT. It avoids the infinite loop bug described by not having PythonHandlerModule do anything for PythonConnectionHandler.

The PythonConnectionHandler is different because the handler is being added against the handler list of the server config object and not the module config. The bug is somehow associated with the server config object, although I still don't know why.

Since PythonHandlerModule is broken altogether at the moment, I don't see a problem with it being made to work for normal module handlers and ignore the connection handler, at least for now. It could even be possible that the PythonConnectionHandler directive is broken as well, but since no one uses it, no one has noticed.

The most important thing here is that the reverse logic in SILENT/NOTSILENT will be fixed, and saying:

    PythonHandler mptest::xxx

where xxx doesn't exist will yield an error, whereas at present no error results and the mptest.py source code could be served up instead.

PythonHandlerModule directive is broken.

         Key: MODPYTHON-46
         URL: http://issues.apache.org/jira/browse/MODPYTHON-46
     Project: mod_python
        Type: Bug
  Components: core
    Versions: 3.1.4
    Reporter: Graham Dumpleton
 Attachments: silent.diff.txt

Documentation for PythonHandlerModule says:

    PythonHandlerModule can be used as an alternative to Python*Handler directives. The module specified in this handler will be searched for existence of functions matching the default handler function names, and if a function is found, it will be executed.

The suggestion is that it will not complain if a particular handler is not defined, i.e., it only executes the ones it finds and doesn't worry about the rest.
The example even supports this by saying that:

    For example, instead of:

        PythonAutenHandler mymodule
        PythonHandler mymodule
        PythonLogHandler mymodule

    one can simply say

        PythonHandlerModule mymodule

BTW, PythonAutenHandler is spelt wrong in the documentation, not by me.

The mod_python.c code also seems to be coded so that if a handler is not defined in the module, it will not complain:

    python_directive_handler(cmd, mconfig, "PythonPostReadRequestHandler", val, SILENT);
    python_directive_handler(cmd, mconfig, "PythonTransHandler", val, SILENT);
    python_directive_handler(cmd, mconfig, "PythonHeaderParserHandler", val, SILENT);
    python_directive_handler(cmd, mconfig, "PythonAccessHandler", val, SILENT);
    python_directive_handler(cmd, mconfig, "PythonAuthzHandler", val, SILENT);
    python_directive_handler(cmd, mconfig, "PythonTypeHandler", val, SILENT);
    python_directive_handler(cmd, mconfig, "PythonHandler", val, SILENT);
    python_directive_handler(cmd, mconfig, "PythonInitHandler", val, SILENT);
    python_directive_handler(cmd, mconfig, "PythonLogHandler", val, SILENT);
    python_directive_handler(cmd, mconfig, "PythonCleanupHandler", val, SILENT);
    python_directive_handler(cmd, srv_conf, "PythonConnectionHandler", val, SILENT);

I.e., it has the SILENT option and not NOTSILENT, as is the case when a single handler is specified.

The problem is that using PythonHandlerModule gives back a 500 error, and if PythonDebug is on you will see in the browser:

    Mod_python error: PythonHeaderParserHandler mptest

    Traceback (most recent call last):

      File "/usr/lib/python2.3/site-packages/mod_python/apache.py", line 291, in HandlerDispatch
        arg=req, silent=hlist.silent)

      File "/usr/lib/python2.3/site-packages/mod_python/apache.py", line 519, in resolve_object
        raise AttributeError, s

    AttributeError: module '/home/grahamd/public_html/phases/mptest.py' contains no 'headerparserhandler'

The passing of SILENT thus seems to not work.
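To make the intended behaviour concrete, here is a minimal sketch of what a "silent" lookup is presumably meant to do. It is not the code from apache.py: `resolve_handler` and the stand-in `mptest` module are hypothetical names used only for illustration, and the exception syntax is modern Python rather than the Python 2.3 of the traceback.

```python
import types


def resolve_handler(module, name, silent):
    """Return the named handler, or None when it is missing and silent is truthy."""
    if not hasattr(module, name):
        if silent:
            return None  # quietly skip phases the module does not implement
        raise AttributeError(
            "module %r contains no %r" % (module.__name__, name))
    return getattr(module, name)


# Hypothetical stand-in for a user module that only defines the content handler.
mptest = types.ModuleType("mptest")
mptest.handler = lambda req: "OK"

found = resolve_handler(mptest, "handler", silent=True)
skipped = resolve_handler(mptest, "headerparserhandler", silent=True)

try:
    resolve_handler(mptest, "headerparserhandler", silent=False)
    raised = False
except AttributeError:
    raised = True
```

The point of the sketch is that the flag is tested for truthiness, so whatever constant the C side passes for "be silent" has to be non-zero for the missing phase to be skipped rather than reported.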
The definitions of SILENT and NOTSILENT are:

    #define SILENT 0
    #define NOTSILENT 1

This eventually gets set as hlist.silent and gets passed as the silent argument of the resolve_object() method. In the resolve_object() call of apache.py where this is checked, it is checked as:

    # don't throw attribute errors when silent
    if silent and not hasattr(obj, obj_str):
        return None

    # this adds a little clarity if we have an attribute error
    if obj == module and not hasattr(module, obj_str):
        if hasattr(module, "__file__"):
            s = "module '%s' contains no '%s'" % (module.__file__, obj_str)
            raise AttributeError, s

Is the logic the wrong way around here or am I just going nuts?

The result of resolve_object() is used as:

    if object:
        ...
    elif hlist.silent:
        result = DECLINED

This is supposed
[jira] Commented: (MODPYTHON-68) Add a readable/writable req.script_name member.
[ http://issues.apache.org/jira/browse/MODPYTHON-68?page=comments#action_12318143 ]

Graham Dumpleton commented on MODPYTHON-68:
---

I add my own -1 to this patch to add req.script_name. It is just as easy to stick it in a root level handler middleware wrapper. I would still like req.path_info to be writable though, as it otherwise becomes impossible to write middleware wrappers that incorporate existing mod_python.publisher and mod_python.psp code. :-)

The information here is still pertinent in pointing out that the existing ways in which SCRIPT_NAME is determined are wrong in some situations.

Add a readable/writable req.script_name member.
---

         Key: MODPYTHON-68
         URL: http://issues.apache.org/jira/browse/MODPYTHON-68
     Project: mod_python
        Type: New Feature
  Components: core
    Versions: 3.2.0
    Reporter: Graham Dumpleton
 Attachments: apache.py.diff

The term SCRIPT_NAME in web servers is used to identify that part of a URI which identifies the script handling the request. Within the URI, the SCRIPT_NAME component would be followed by the PATH_INFO component, the latter being potentially an empty string.

In mod_python, the value of SCRIPT_NAME could be obtained in a few different ways. These are:

1. Obtain it as req.subprocess_env["SCRIPT_NAME"] after having first called req.add_common_vars().

2. Obtain it as apache.build_cgi_env(req)["SCRIPT_NAME"]. This internally calls req.add_common_vars() but then ignores the SCRIPT_NAME value from req.subprocess_env and instead tries to calculate it as per (3) below, yielding a different result to (1) in some cases.

3. Attempt to derive it from req.uri using code which is based upon something like 'req.uri[:-len(req.path_info)]'. If req.path_info is empty, then the result should be the same as req.uri.

All three methods actually yield incorrect results in certain circumstances, with the fact that it occurs in (1) suggesting an underlying Apache bug.
The problem area is where there are multiple successive occurrences of '/' appearing in the part of the URI which is used to determine the PATH_INFO value. Looking at some examples for each case we get:

    req.uri         = /~grahamd/handler/mptest.py
    req.path_info   = (empty)
    PATH_INFO       = None
    SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
    SCRIPT_NAME (2) = /~grahamd/handler/mptest.py
    SCRIPT_NAME (3) = /~grahamd/handler/mptest.py

    req.uri         = /~grahamd/handler/mptest.py/
    req.path_info   = /
    PATH_INFO       = /
    SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
    SCRIPT_NAME (2) = /~grahamd/handler/mptest.py
    SCRIPT_NAME (3) = /~grahamd/handler/mptest.py

    req.uri         = /~grahamd/handler/mptest.py//
    req.path_info   = /
    PATH_INFO       = /
    SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
    SCRIPT_NAME (2) = /~grahamd/handler/mptest.py/
    SCRIPT_NAME (3) = /~grahamd/handler/mptest.py/

    req.uri         = /~grahamd/handler/mptest.py/a
    req.path_info   = /a
    PATH_INFO       = /a
    SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
    SCRIPT_NAME (2) = /~grahamd/handler/mptest.py
    SCRIPT_NAME (3) = /~grahamd/handler/mptest.py

    req.uri         = /~grahamd/handler/mptest.py/a/b
    req.path_info   = /a/b
    PATH_INFO       = /a/b
    SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
    SCRIPT_NAME (2) = /~grahamd/handler/mptest.py
    SCRIPT_NAME (3) = /~grahamd/handler/mptest.py

    req.uri         = /~grahamd/handler/mptest.py/a//b
    req.path_info   = /a/b
    PATH_INFO       = /a/b
    SCRIPT_NAME (1) = /~grahamd/handler/mptest.py/a
    SCRIPT_NAME (2) = /~grahamd/handler/mptest.py/
    SCRIPT_NAME (3) = /~grahamd/handler/mptest.py/

    req.uri         = /~grahamd/handler/mptest.py/a///b
    req.path_info   = /a/b
    PATH_INFO       = /a/b
    SCRIPT_NAME (1) = /~grahamd/handler/mptest.py/a/
    SCRIPT_NAME (2) = /~grahamd/handler/mptest.py/a
    SCRIPT_NAME (3) = /~grahamd/handler/mptest.py/a

    req.uri         = /~grahamd/handler/mptest.py/a///b//c
    req.path_info   = /a/b/c
    PATH_INFO       = /a/b/c
    SCRIPT_NAME (1) = /~grahamd/handler/mptest.py/a///b
    SCRIPT_NAME (2) = /~grahamd/handler/mptest.py/a/
    SCRIPT_NAME (3) = /~grahamd/handler/mptest.py/a/

All very strange and not what one would expect.
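The behaviour of method (3) can be reproduced outside Apache with a few lines of Python. The sketch below is illustrative only: `script_name_naive` follows the 'req.uri[:-len(req.path_info)]' idea quoted in the issue, while `script_name_collapsed` is a hypothetical alternative (not taken from the attached patch) that collapses runs of '/' before stripping the path info.

```python
import re


def script_name_naive(uri, path_info):
    """Method (3): strip len(path_info) characters from the end of the URI."""
    if not path_info:
        # uri[:-0] would be the empty string, so the empty case needs a guard.
        return uri
    return uri[:-len(path_info)]


def script_name_collapsed(uri, path_info):
    """Hypothetical variant: collapse repeated '/' first, then strip path_info."""
    collapsed = re.sub(r"/{2,}", "/", uri)
    if path_info and collapsed.endswith(path_info):
        return collapsed[:-len(path_info)]
    return collapsed


base = "/~grahamd/handler/mptest.py"
naive = script_name_naive(base + "/a//b", "/a/b")          # keeps a stray '/'
collapsed = script_name_collapsed(base + "/a//b", "/a/b")  # stray '/' removed
```

The naive version goes wrong because the path info has already had its duplicate slashes removed, so it is shorter than the matching tail of the URI. The collapsed variant has its own limitation: it would also rewrite repeated slashes that occur before the script name, which may or may not be acceptable.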
Ignoring the strange results, the first point of creating the tracker item is to propose that a new member be added to the request object referred to as req.script_name. This new member should be both readable and writable.

The argument for adding script_name is similar to that for making path_info writable as described in MODPYTHON-67. That is, it would make the task of writing a middleware stack specifically for mod_python, but in a similar style to WSGI, a slightly simpler task.

In adding script_name, it is perhaps suggested that its initial value be somewhat saner than as shown in the results above. More along the lines of:

    req.uri       = /~grahamd/handler/mptest.py//
    req.path_info = /
    script_name   = /~grahamd/handler/mptest.py

    req.uri       = /~grahamd/handler/mptest.py/a//b
    req.path_info = /a/b
    script_name   = /~grahamd/handler/mptest.py

    req.uri =
Re: [jira] Commented: (MODPYTHON-70) Add configure --with-max-locks option to set MAX_LOCKS.
This raises an issue: under Win32, the preferred way to build mod_python is to run:

    python setup.py.in bdist_wininst --install-script win32_postinstall.py

This leaves no room to specify a MAX_LOCKS definition override, but I guess we could put it in setup.py, since extension modules can have custom macro definitions.

My question is: should we keep on with ./configure ; make ; make install, or try to do everything in setup.py?

Regards,
Nicolas

2005/8/9, Jim Gallacher (JIRA) [EMAIL PROTECTED]:
> [ http://issues.apache.org/jira/browse/MODPYTHON-70?page=comments#action_12318192 ]
>
> Jim Gallacher commented on MODPYTHON-70:
>
> Changes committed. This issue can be closed.
>
> Add configure --with-max-locks option to set MAX_LOCKS.
> ---
>
>          Key: MODPYTHON-70
>          URL: http://issues.apache.org/jira/browse/MODPYTHON-70
>      Project: mod_python
>         Type: New Feature
>   Components: core
>     Versions: 3.2.0
>  Environment: All
>     Reporter: Jim Gallacher
>     Priority: Trivial
>
> MAX_LOCKS in src/include/mod_python.h is currently hard coded (currently 32). Since the number of mutexes on some systems is limited, users may prefer to use a different number when compiling mod_python.
>
> The new configure option would be --with-max-locks=INTEGER, e.g.
>
>     $ ./configure --with-max-locks=4
>
> The default should also be lower than 32. Grisha has suggested 8.
>
> I'll commit the changes shortly.
>
> --
> This message is automatically generated by JIRA.
> -
> If you think it was sent incorrectly contact one of the administrators:
>    http://issues.apache.org/jira/secure/Administrators.jspa
> -
> For more information on JIRA, see:
>    http://www.atlassian.com/software/jira
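As a rough sketch of the setup.py idea raised in this message: the helper below turns a `--with-max-locks=N` style argument into the list of (name, value) pairs that the `define_macros` argument of a distutils/setuptools `Extension` accepts. The helper name, the reuse of the configure-style flag on the setup.py side, and the default of 8 (the value Grisha suggested) are assumptions for illustration, not anything that exists in mod_python's build scripts.

```python
def max_locks_macros(argv, default=8):
    """Build a define_macros list from a --with-max-locks=N style argument."""
    prefix = "--with-max-locks="
    value = default
    for arg in argv:
        if arg.startswith(prefix):
            value = int(arg[len(prefix):])
    if value < 1:
        raise ValueError("MAX_LOCKS must be at least 1")
    # define_macros expects (name, value) tuples, with the value as a string.
    return [("MAX_LOCKS", str(value))]


default_macros = max_locks_macros([])
custom_macros = max_locks_macros(["bdist_wininst", "--with-max-locks=4"])
```

The resulting list would be handed to `Extension(..., define_macros=...)`. For the override to take effect, the header would presumably also need to define MAX_LOCKS only when it is not already defined, which is a further assumption about the C side.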
Re: [jira] Commented: (MODPYTHON-70) Add configure --with-max-locks option to set MAX_LOCKS.
On Tue, 9 Aug 2005, Jim Gallacher wrote:

>> My question is: should we keep on with ./configure ; make ; make install or try to do everything in setup.py?
>
> As long as we can put setup.py in a Makefile. ;) Seriously though, ./configure --help; ./configure; make; make install; is just such second nature to me I never even thought of the alternative.

Yep, and I think setup.py isn't smart enough to deal with the Apache side of mod_python, which is C-centric. But in any event, I think ./configure is the way to go.

Grisha
Re: _apache._global_lock theory
Jim Gallacher writes:
> Daniel Popowich wrote:
>> The recent discussion of max locks and deadlocking issues with _apache._global_(un)?lock() are timely for me: I'm in the middle of writing a caching module for mod_python servlets so a developer can have the output of a servlet cached, keyed on the hash of the URI, for future requests. The goal, of course, is to increase throughput for dynamic pages with long shelf life (e.g., content manager updates database once per day, so the HTML only needs to be generated once per day, not per request).
>>
>> I need locking. My first gut instinct was to go to the fcntl module (lockf), but this is only available for unix. My second gut instinct was: so what? :-) Wanting a true x-platform tool I then thought of mod_python.Session needing locks, poked around the code, and saw how it uses _apache._global_lock(). Further poking around showed me how psp code caching uses the function.
>>
>> About the time I began worrying about deadlock issues I found the thread on this list discussing the same problems. The solution for Session and psp code caching is to explicitly use lock id 0. This works as long as a module does not hold the lock for the whole request, but unlocks immediately after acquiring the needed resource. Fine.
>
> Just to be clear, sessions use the locks above index 0 for session locking. The session id is hashed to determine which index is used. DbmSession uses lock index 0 to lock the dbm file for reading and writing the persistent session data. This is independent of the session lock.

Right, not sessions proper, but by the mod_python.Session module for dbm file reading.

>> So, my question: is this the recommended way for mod_python framework developers to acquire x-platform global locks? Explicitly use lock id 0? If so, is this a secret or should it be documented?
>
> I don't know if it's recommended, but I don't see a problem as long as the lock is held briefly and you make sure you unlock it when you are done.
> I suspect it is undocumented because it was never documented as opposed to some larger conspiracy.

I guess I was too tongue-in-cheek...my question is: Is it not documented on *purpose*? Perhaps it should be documented for internal developers and framework developers?

> I used another cross platform approach in filesession_cleanup() in Session.py. I wanted to make sure only one request at a time was running the cleanup, and used the os.open() call to exclusively open a guard file. (OK, not a guard file, but my brain just went blank. Hopefully you get the idea.)

I'm with ya... :-)

> Here is a code snippet:

Thanks for the code...maybe I'll try both (your code and _apache._global_lock()) and benchmark my caching code with ab.

Thinking out loud here...wouldn't it be good for mod_python to provide a facility for global locking based on some key? By default, the lock is per interpreter, but optionally per server? Given the oddities of python programming within an apache environment, especially a prefork MPM environment, it seems it would be a most valuable service. The Session, psp and 3rd-party locking (e.g. mpservlets) could all share the same code.

Daniel Popowich
---
http://home.comcast.net/~d.popowich/mpservlets/
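The snippet referred to in the quoted text did not survive in this archive, and the following is not a reconstruction of it. It is only a generic sketch of the technique described, exclusive creation of a guard file with os.open(), using hypothetical helper names and a temporary directory chosen for the example.

```python
import os
import tempfile


def try_guard(path):
    """Return a file descriptor if we created the guard file, else None."""
    try:
        # O_CREAT | O_EXCL makes the call fail if the file already exists.
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None  # someone else already holds the guard


def release_guard(fd, path):
    """Close and remove the guard file so the next caller can take it."""
    os.close(fd)
    os.unlink(path)


guard_path = os.path.join(tempfile.mkdtemp(), "cleanup.guard")
first = try_guard(guard_path)
second = try_guard(guard_path)   # refused while the first holder is active
release_guard(first, guard_path)
third = try_guard(guard_path)    # succeeds again after release
release_guard(third, guard_path)
```

One known weakness of this pattern is that a process which dies while holding the guard leaves the file behind, so real code usually needs some stale-guard handling such as an age check.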
Re: _apache._global_lock theory
Daniel Popowich wrote:
> Jim Gallacher writes:
>> Daniel Popowich wrote:
>>> The recent discussion of max locks and deadlocking issues with _apache._global_(un)?lock() are timely for me: I'm in the middle of writing a caching module for mod_python servlets so a developer can have the output of a servlet cached, keyed on the hash of the URI, for future requests. The goal, of course, is to increase throughput for dynamic pages with long shelf life (e.g., content manager updates database once per day, so the HTML only needs to be generated once per day, not per request).
>>>
>>> I need locking. My first gut instinct was to go to the fcntl module (lockf), but this is only available for unix. My second gut instinct was: so what? :-) Wanting a true x-platform tool I then thought of mod_python.Session needing locks, poked around the code, and saw how it uses _apache._global_lock(). Further poking around showed me how psp code caching uses the function.
>>>
>>> About the time I began worrying about deadlock issues I found the thread on this list discussing the same problems. The solution for Session and psp code caching is to explicitly use lock id 0. This works as long as a module does not hold the lock for the whole request, but unlocks immediately after acquiring the needed resource. Fine.
>>
>> Just to be clear, sessions use the locks above index 0 for session locking. The session id is hashed to determine which index is used. DbmSession uses lock index 0 to lock the dbm file for reading and writing the persistent session data. This is independent of the session lock.
>
> Right, not sessions proper, but by the mod_python.Session module for dbm file reading.
>
>>> So, my question: is this the recommended way for mod_python framework developers to acquire x-platform global locks? Explicitly use lock id 0? If so, is this a secret or should it be documented?
>>
>> I don't know if it's recommended, but I don't see a problem as long as the lock is held briefly and you make sure you unlock it when you are done.
>> I suspect it is undocumented because it was never documented as opposed to some larger conspiracy.
>
> I guess I was too tongue-in-cheek...my question is: Is it not documented on *purpose*? Perhaps it should be documented for internal developers and framework developers?

No, actually I understood your cheek. I was just too lazy to put in a smiley after my comment. I shall correct that now. :) And a winkey for good measure. ;)

>> I used another cross platform approach in filesession_cleanup() in Session.py. I wanted to make sure only one request at a time was running the cleanup, and used the os.open() call to exclusively open a guard file. (OK, not a guard file, but my brain just went blank. Hopefully you get the idea.)
>
> I'm with ya... :-)
>
>> Here is a code snippet:
>
> Thanks for the code...maybe I'll try both (your code and _apache._global_lock()) and benchmark my caching code with ab.
>
> Thinking out loud here...wouldn't it be good for mod_python to provide a facility for global locking based on some key? By default, the lock is per interpreter, but optionally per server? Given the oddities of python programming within an apache environment, especially a prefork MPM environment, it seems it would be a most valuable service. The Session, psp and 3rd-party locking (e.g. mpservlets) could all share the same code.

That discussion will have to wait for another time. Time to call it a day.

Regards,
Jim
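For whenever that discussion resumes, one possible shape for the key-to-lock mapping is sketched below. It is a hypothetical illustration, not how mod_python chooses its indices: `lock_index` is an invented name, index 0 is kept reserved as described earlier in the thread, and a CRC is used instead of the built-in hash() because string hashes in current Python versions differ between processes.

```python
import zlib


def lock_index(key, nlocks):
    """Map a string key to a lock index in 1..nlocks-1, leaving 0 reserved."""
    if nlocks < 2:
        raise ValueError("need at least two locks to reserve index 0")
    # crc32 gives the same value in every process, unlike hash() on str.
    checksum = zlib.crc32(key.encode("utf-8"))
    return checksum % (nlocks - 1) + 1


servlet_lock = lock_index("/servlets/report", 8)
```

With a small pool of locks, unrelated keys will regularly share an index, so code that takes two keyed locks at once could still deadlock or block on itself. That is the same hazard the lock id 0 convention works around by holding the lock only briefly.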