[jira] Updated: (MODPYTHON-46) PythonHandlerModule directive is broken.

2005-08-09 Thread Graham Dumpleton (JIRA)
 [ http://issues.apache.org/jira/browse/MODPYTHON-46?page=all ]

Graham Dumpleton updated MODPYTHON-46:
--

Attachment: silent.diff.txt

This fixes the wrong logic problem in the definition of SILENT/NOTSILENT.

It avoids the infinite loop bug described by not having PythonHandlerModule do 
anything for PythonConnectionHandler. PythonConnectionHandler is 
different because its handler is added to the handler list of the 
server config object rather than the module config. The bug is somehow associated 
with the server config object, although I still don't know why.

Since PythonHandlerModule is broken altogether at the moment, I don't see a 
problem with it being made to work for normal module handlers and ignore the 
connection handler, at least for now. It could even be possible that 
PythonConnectionHandler directive is broken as well but since no one uses it, 
no one has noticed.

The most important thing here is that the reversed logic in SILENT/NOTSILENT 
will be fixed, so that saying:

  PythonHandler mptest::xxx

where xxx doesn't exist will yield an error. At present no error 
results and the mptest.py source code could be served up instead.
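
To make the inverted flag concrete, here is a minimal sketch of the check discussed in this issue. It is a simplified stand-in for resolve_object() in apache.py, not the actual mod_python code. The fake module object, the outcome() helper, and the assumption that the fix amounts to swapping the two values are illustrative only.

```python
from types import SimpleNamespace

# Values as currently defined in mod_python.c, per the issue text.
BROKEN_SILENT, BROKEN_NOTSILENT = 0, 1
# Assumed fix: swap them so that SILENT is truthy on the Python side.
FIXED_SILENT, FIXED_NOTSILENT = 1, 0


def resolve_object(module, obj_str, silent):
    """Simplified stand-in for apache.resolve_object()."""
    # Don't raise for a missing attribute when silent is truthy.
    if silent and not hasattr(module, obj_str):
        return None
    if not hasattr(module, obj_str):
        raise AttributeError(
            "module '%s' contains no '%s'" % (module.__file__, obj_str))
    return getattr(module, obj_str)


# Hypothetical module that defines only the content handler.
mptest = SimpleNamespace(__file__="mptest.py", handler=lambda req: "OK")


def outcome(silent):
    """Report what happens when looking up a handler that does not exist."""
    try:
        return resolve_object(mptest, "headerparserhandler", silent)
    except AttributeError as exc:
        return "error: %s" % exc


if __name__ == "__main__":
    print("broken SILENT:", outcome(BROKEN_SILENT))
    print("fixed SILENT: ", outcome(FIXED_SILENT))
```

With the current values, the "silent" lookup raises and the "not silent" lookup quietly returns None, which is the reverse of what the names promise.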

 PythonHandlerModule directive is broken.
 

  Key: MODPYTHON-46
  URL: http://issues.apache.org/jira/browse/MODPYTHON-46
  Project: mod_python
 Type: Bug
   Components: core
 Versions: 3.1.4
 Reporter: Graham Dumpleton
  Attachments: silent.diff.txt

 Documentation for PythonHandlerModule says:
   PythonHandlerModule can be used an alternative to Python*Handler directives.
   The module specified in this handler will be searched for existence of
   functions matching the default handler function names, and if a function
   is found, it will be executed.
 The suggestion is that it will not complain if a particular handler is not
 defined, i.e., it only executes the ones it finds and doesn't worry about
 the rest. The example even supports this by saying that:
   For example, instead of:
     PythonAutenHandler mymodule
     PythonHandler mymodule
     PythonLogHandler mymodule
   one can simply say
     PythonHandlerModule mymodule
 BTW, PythonAutenHandler is spelt wrong in documentation, not by me.
 The mod_python.c code also seems to be coded so that if a handler is not
 defined in the module it will not complain.
   python_directive_handler(cmd, mconfig, "PythonPostReadRequestHandler", val, SILENT);
   python_directive_handler(cmd, mconfig, "PythonTransHandler", val, SILENT);
   python_directive_handler(cmd, mconfig, "PythonHeaderParserHandler", val, SILENT);
   python_directive_handler(cmd, mconfig, "PythonAccessHandler", val, SILENT);
   python_directive_handler(cmd, mconfig, "PythonAuthzHandler", val, SILENT);
   python_directive_handler(cmd, mconfig, "PythonTypeHandler", val, SILENT);
   python_directive_handler(cmd, mconfig, "PythonHandler", val, SILENT);
   python_directive_handler(cmd, mconfig, "PythonInitHandler", val, SILENT);
   python_directive_handler(cmd, mconfig, "PythonLogHandler", val, SILENT);
   python_directive_handler(cmd, mconfig, "PythonCleanupHandler", val, SILENT);
   python_directive_handler(cmd, srv_conf, "PythonConnectionHandler", val, SILENT);
 Ie., it has SILENT option and not NOTSILENT as is case when single 
 handler is
 specified.
 Problem is that using PythonHandlerModule it gives back 500 error and if
 PythonDebug is on you will see in the browser:
   Mod_python error: PythonHeaderParserHandler mptest
   Traceback (most recent call last):
     File "/usr/lib/python2.3/site-packages/mod_python/apache.py", line 291, in HandlerDispatch
       arg=req, silent=hlist.silent)
     File "/usr/lib/python2.3/site-packages/mod_python/apache.py", line 519, in resolve_object
       raise AttributeError, s
   AttributeError: module '/home/grahamd/public_html/phases/mptest.py' contains no 'headerparserhandler'
 The passing of SILENT thus seems to not work.
 The definitions of SILENT and NOTSILENT are:
   #define SILENT 0
   #define NOTSILENT 1
 This eventually gets set as hlist.silent and gets passed as silent argument 
 of
 the resolve_object() method.
 In the resolve_object() call of apache.py where this is checked, it is
 checked as:
     # don't throw attribute errors when silent
     if silent and not hasattr(obj, obj_str):
         return None
     # this adds a little clarity if we have an attribute error
     if obj == module and not hasattr(module, obj_str):
         if hasattr(module, "__file__"):
             s = "module '%s' contains no '%s'" % (module.__file__, obj_str)
             raise AttributeError, s
 Is the logic the wrong way around here or am I just going nuts?
 The result of resolve_object() is used as:
     if object:
         ...
     elif hlist.silent:
         result = DECLINED
 This is supposed 

[jira] Commented: (MODPYTHON-68) Add a readable/writable req.script_name member.

2005-08-09 Thread Graham Dumpleton (JIRA)
[ 
http://issues.apache.org/jira/browse/MODPYTHON-68?page=comments#action_12318143 
] 

Graham Dumpleton commented on MODPYTHON-68:
---

I add my own -1 to this patch to add req.script_name. It is just as easy to 
stick it in a root level handler middleware wrapper. I would still like 
req.path_info to be writable though as it becomes impossible to write 
middleware wrappers that incorporate existing mod_python.publisher and 
mod_python.psp code otherwise. :-)

The information here is still pertinent in pointing out that the existing ways in 
which SCRIPT_NAME is determined are wrong in some situations.

 Add a readable/writable req.script_name member.
 ---

  Key: MODPYTHON-68
  URL: http://issues.apache.org/jira/browse/MODPYTHON-68
  Project: mod_python
 Type: New Feature
   Components: core
 Versions: 3.2.0
 Reporter: Graham Dumpleton
  Attachments: apache.py.diff

 The term SCRIPT_NAME in web servers is used to identify that part of a
 URI which identifies the script handling the request. Within the URI,
 the SCRIPT_NAME component would be followed by the PATH_INFO component,
 the latter being potentially an empty string.
 In mod_python, the value of SCRIPT_NAME could be obtained in a few
 different ways. These are:
 1. Obtain it as req.subprocess_env["SCRIPT_NAME"] after having first
 called req.add_common_vars().
 2. Obtain it as apache.build_cgi_env(req)["SCRIPT_NAME"]. This
 internally calls req.add_common_vars() but then ignores the SCRIPT_NAME
 value from req.subprocess_env and instead tries to calculate it as per
 (3) below, yielding a different result to (1) in some cases.
 3. Attempt to derive it from req.uri using code which is based upon something
 like 'req.uri[:-len(req.path_info)]'. If req.path_info is empty, then the
 result should be the same as req.uri.
 All three methods actually yield incorrect results in certain
 circumstances, with the fact that it occurs in (1) suggesting an
 underlying Apache bug.
 The problem area is where there are multiple successive occurrences of
 '/' appearing in the part of the URI which is used to determine the
 PATH_INFO value.
 Looking at some examples for each case we get:

 req.uri = /~grahamd/handler/mptest.py
 req.path_info = 
 PATH_INFO = None
 SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
 SCRIPT_NAME (2) = /~grahamd/handler/mptest.py
 SCRIPT_NAME (3) = /~grahamd/handler/mptest.py

 req.uri = /~grahamd/handler/mptest.py/
 req.path_info = /
 PATH_INFO = /
 SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
 SCRIPT_NAME (2) = /~grahamd/handler/mptest.py
 SCRIPT_NAME (3) = /~grahamd/handler/mptest.py

 req.uri = /~grahamd/handler/mptest.py//
 req.path_info = /
 PATH_INFO = /
 SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
 SCRIPT_NAME (2) = /~grahamd/handler/mptest.py/
 SCRIPT_NAME (3) = /~grahamd/handler/mptest.py/

 req.uri = /~grahamd/handler/mptest.py/a
 req.path_info = /a
 PATH_INFO = /a
 SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
 SCRIPT_NAME (2) = /~grahamd/handler/mptest.py
 SCRIPT_NAME (3) = /~grahamd/handler/mptest.py

 req.uri = /~grahamd/handler/mptest.py/a/b
 req.path_info = /a/b
 PATH_INFO = /a/b
 SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
 SCRIPT_NAME (2) = /~grahamd/handler/mptest.py
 SCRIPT_NAME (3) = /~grahamd/handler/mptest.py

 req.uri = /~grahamd/handler/mptest.py/a//b
 req.path_info = /a/b
 PATH_INFO = /a/b
 SCRIPT_NAME (1) = /~grahamd/handler/mptest.py/a
 SCRIPT_NAME (2) = /~grahamd/handler/mptest.py/
 SCRIPT_NAME (3) = /~grahamd/handler/mptest.py/

 req.uri = /~grahamd/handler/mptest.py/a///b
 req.path_info = /a/b
 PATH_INFO = /a/b
 SCRIPT_NAME (1) = /~grahamd/handler/mptest.py/a/
 SCRIPT_NAME (2) = /~grahamd/handler/mptest.py/a
 SCRIPT_NAME (3) = /~grahamd/handler/mptest.py/a

 req.uri = /~grahamd/handler/mptest.py/a///b//c
 req.path_info = /a/b/c
 PATH_INFO = /a/b/c
 SCRIPT_NAME (1) = /~grahamd/handler/mptest.py/a///b
 SCRIPT_NAME (2) = /~grahamd/handler/mptest.py/a/
 SCRIPT_NAME (3) = /~grahamd/handler/mptest.py/a/
 All very strange and not what one would expect.
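
The SCRIPT_NAME (3) results above can be reproduced outside Apache with plain string slicing. The sketch below is illustrative only: the function name is hypothetical, and the req.uri and req.path_info values are copied from the examples rather than obtained from a live request. It shows why the approach goes wrong: path_info has had its repeated slashes collapsed, so its length no longer matches the tail of the URI it was derived from.

```python
def script_name_from_uri(uri, path_info):
    """Derive SCRIPT_NAME as in method (3): strip len(path_info) characters."""
    if not path_info:
        # uri[:-0] would be the empty string, so handle this case explicitly.
        return uri
    return uri[:-len(path_info)]


BASE = "/~grahamd/handler/mptest.py"

# (req.uri, req.path_info) pairs taken from the examples above.
CASES = [
    (BASE, ""),
    (BASE + "/a/b", "/a/b"),
    (BASE + "//", "/"),
    (BASE + "/a//b", "/a/b"),
    (BASE + "/a///b//c", "/a/b/c"),
]

if __name__ == "__main__":
    for uri, path_info in CASES:
        print("%-42s -> %s" % (uri, script_name_from_uri(uri, path_info)))
```

Only the first two cases give the script path one would expect; the others leave part of the path info attached.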
 Ignoring the strange results, the first point of creating the tracker
 item is to propose that a new member be added to the request object
 referred to as req.script_name. This new member should be both
 readable and writable.
 The argument for adding script_name is similar to that for making
 path_info writable as described in MODPYTHON-67. That is, it would
 make the task of writing a middleware stack specifically for mod_python
 but in a similar style to WSGI a slightly simpler task.
 In adding script_name, it is perhaps suggested that its initial value be
 somewhat saner than as shown in the results above. More along the lines
 of:
 req.uri = /~grahamd/handler/mptest.py//
 req.path_info = /
 script_name=/~grahamd/handler/mptest.py
 req.uri = /~grahamd/handler/mptest.py/a//b
 req.path_info = /a/b
 script_name=/~grahamd/handler/mptest.py
 req.uri = 

Re: [jira] Commented: (MODPYTHON-70) Add configure --with-max-locks option to set MAX_LOCKS.

2005-08-09 Thread Nicolas Lehuen
This raises an issue: under Win32, the preferred way to build mod_python is to run:

python setup.py.in bdist_wininst --install-script win32_postinstall.py

This leaves no room to specify a MAX_LOCKS definition override, but I
guess we could put it in setup.py, since extension modules can have
custom macro definitions.
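
As a rough sketch of that idea: distutils-style Extension objects accept a define_macros list of (name, value) tuples, so a setup.py could build that list from an override. Reading the override from a MAX_LOCKS environment variable and the helper name are assumptions made for illustration. The default of 8 is the value suggested elsewhere in this thread.

```python
import os


def max_locks_macros(environ=None, default=8):
    """Build a define_macros list carrying a MAX_LOCKS override."""
    if environ is None:
        environ = os.environ
    value = int(environ.get("MAX_LOCKS", default))
    if value < 1:
        raise ValueError("MAX_LOCKS must be a positive integer")
    # Extension(define_macros=...) expects (name, value) tuples of strings.
    return [("MAX_LOCKS", str(value))]


# In a setup.py this would be passed along the lines of (hypothetical names):
#     Extension("example_ext", sources=["example.c"],
#               define_macros=max_locks_macros())

if __name__ == "__main__":
    print(max_locks_macros({"MAX_LOCKS": "4"}))
```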

My question is: should we keep on with ./configure ; make ; make install or try to do everything in setup.py?

Regards,
Nicolas
2005/8/9, Jim Gallacher (JIRA) [EMAIL PROTECTED]:
[ http://issues.apache.org/jira/browse/MODPYTHON-70?page=comments#action_12318192 ]

Jim Gallacher commented on MODPYTHON-70:

Changes committed. This issue can be closed.

 Add configure --with-max-locks option to set MAX_LOCKS.
 ---

  Key: MODPYTHON-70
  URL: http://issues.apache.org/jira/browse/MODPYTHON-70
  Project: mod_python
  Type: New Feature
  Components: core
  Versions: 3.2.0
  Environment: All
  Reporter: Jim Gallacher
  Priority: Trivial

 MAX_LOCKS in src/include/mod_python.h is currently hard coded (currently 32).
 Since the number of mutexes on some systems is limited, users may
 prefer to use a different number when compiling mod_python. The new configure
 option would be --with-max-locks=INTEGER, e.g.

   $ ./configure --with-max-locks=4

 The default should also be lower than 32. Grisha has suggested 8.
 I'll commit the changes shortly.



Re: [jira] Commented: (MODPYTHON-70) Add configure --with-max-locks option to set MAX_LOCKS.

2005-08-09 Thread Gregory (Grisha) Trubetskoy


On Tue, 9 Aug 2005, Jim Gallacher wrote:

  My question is : should we keep on with ./configure ; make ; make install 
  or try to do everything in setup.py ?

 As long as we can put setup.py in a Makefile. ;)

 Seriously though, ./configure --help; ./configure; make; make install; is 
 just such second nature to me I never even thought of the alternative.


Yep, and I think setup.py isn't smart enough to deal with the Apache side 
of mod_python which is C-centric. But in any event, I think ./configure is 
the way to go.


Grisha


Re: _apache._global_lock theory

2005-08-09 Thread Daniel Popowich

Jim Gallacher writes:
 Daniel Popowich wrote:
  The recent discussion of max locks and deadlocking issues with
  _apache._global_(un)?lock() are timely for me:
  
  I'm in the middle of writing a caching module for mod_python servlets
  so a developer can have the output of a servlet cached, keyed on the
  hash of the URI, for future requests.  The goal, of course, is to
  increase throughput for dynamic pages with long shelf life (e.g.,
  content manager updates database once per day, so the HTML only needs
  to be generated once per day, not per request).
  
  I need locking.  My first gut instinct was to go to the fcntl module
  (lockf), but this is only available for unix.  My second gut instinct
  was: so what?  :-)
  
  Wanting a true x-platform tool I then thought of mod_python.Session
  needing locks, poked around the code, and saw how it uses
  _apache._global_lock().  Further poking around showed me how psp code
  caching uses the function.  About the time I began worrying about
  deadlock issues I found the thread on this list discussing the same
  problems.
  
  The solution for Session and psp code caching is to explicitly use lock
  id 0.  This works as long as a module does not hold the lock for the
  whole request, but unlocks immediately after acquiring the needed
  resource.  Fine.
 
 Just to be clear, sessions use the locks above index 0 for session 
 locking. The session id is hashed to determine which index is used. 
 DbmSession uses lock index 0 to lock the dbm file for reading and 
 writing the persistent session data. This is independent of the session 
 lock.
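
A small sketch of the index scheme just described: index 0 is kept for the dbm file, and a session id is hashed onto one of the remaining indexes. mod_python does this inside its C code; the function name and the use of zlib.crc32 here are stand-ins chosen only so the example is deterministic.

```python
import zlib


def session_lock_index(session_id, nlocks):
    """Map a session id onto a lock index in the range 1..nlocks-1."""
    if nlocks < 2:
        raise ValueError("need index 0 plus at least one session lock")
    # Stand-in hash; the real hashing happens in mod_python's C code.
    digest = zlib.crc32(session_id.encode("utf-8"))
    # Index 0 is reserved, so spread ids over the remaining nlocks - 1 slots.
    return digest % (nlocks - 1) + 1


if __name__ == "__main__":
    for sid in ("a3f9", "77b0c2", "deadbeef"):
        print(sid, "->", session_lock_index(sid, 8))
```

With fewer locks, more sessions share an index, which is part of why the MAX_LOCKS value matters.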

Right, not sessions proper, but by the mod_python.Session module for
dbm file reading.

  So, my question: is this the recommended way for mod_python framework
  developers to acquire x-platform global locks?  Explicitly use lock
  id 0?  If so, is this a secret or should it be documented?
 
 I don't know if it's recommended, but I don't see a problem as long as 
 the lock is held briefly and you make sure you unlock it when you are 
 done. I suspect it is undocumented because it was never documented as 
 opposed to some larger conspiracy.
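
The "hold it briefly and always unlock" advice can be sketched as a context manager. This is written for present-day Python rather than the Python 2.3 of this thread. The lock and unlock callables are passed in so the example runs outside Apache; under mod_python they are assumed to be _apache._global_lock and _apache._global_unlock, called as (server, None, 0) the way Session.py appears to call them. The wrapper name is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def brief_global_lock(lock, unlock, server, index=0):
    """Hold a global lock only for the duration of the with-block.

    lock and unlock are callables taking (server, key, index).
    """
    lock(server, None, index)
    try:
        yield
    finally:
        # Always release, even if the protected code raises.
        unlock(server, None, index)


if __name__ == "__main__":
    # Stand-ins so the sketch runs without mod_python.
    events = []

    def fake_lock(server, key, index):
        events.append(("lock", index))

    def fake_unlock(server, key, index):
        events.append(("unlock", index))

    with brief_global_lock(fake_lock, fake_unlock, server=None):
        events.append(("work", None))
    print(events)
```

Keeping the protected block small is what avoids the deadlock concern: lock 0 is shared, so nothing slow should happen while it is held.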

I guess I was too tongue-in-cheek...my question is: Is it not
documented on *purpose*?  Perhaps it should be documented for internal
developers and framework developers?

 I used another cross platform approach in filesession_cleanup() in
 Session.py.  I wanted to make sure only one request at a time was
 running the cleanup, and used the os.open() call to exclusively open
 a guard file. (OK, not a guard file, but my brain just went
 blank. Hopefully you get the idea.)

I'm with ya...  :-)

 Here is a code snippet:

Thanks for the code...maybe I'll try both (your code and
_apache._global_lock()) and benchmark my caching code with ab.
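
The snippet referred to above did not survive in this archive. As a stand-in, and explicitly not Jim's code, here is a minimal sketch of the general technique he describes: using os.open() with O_CREAT | O_EXCL so that only one caller at a time can create the guard file. The function names and the guard file name are hypothetical.

```python
import errno
import os
import tempfile


def try_acquire_guard(path):
    """Return an open file descriptor if we created the guard file, else None."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except OSError as exc:
        if exc.errno == errno.EEXIST:
            return None  # someone else holds the guard
        raise


def release_guard(fd, path):
    """Close and remove the guard file so the next caller can proceed."""
    os.close(fd)
    os.unlink(path)


if __name__ == "__main__":
    guard = os.path.join(tempfile.mkdtemp(), "cleanup.guard")
    fd = try_acquire_guard(guard)
    print("first attempt acquired:", fd is not None)
    print("second attempt acquired:", try_acquire_guard(guard) is not None)
    release_guard(fd, guard)
```

One caveat with this approach is that a process that dies while holding the guard leaves the file behind, so real code needs some way to detect and clear a stale guard.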


Thinking out loud here...wouldn't it be good for mod_python to provide
a facility for global locking based on some key?  By default, the lock
is per interpreter, but optionally per server?  Given the oddities of
python programming within an apache environment, especially a prefork
MPM environment, it seems it would be a most valuable service.  The
Session, psp and 3rd-party locking (e.g. mpservlets) could all share
the same code.
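
To give the idea some shape, here is a very rough sketch of what a key-based locking helper might look like. It is not a mod_python API; the class and method names are invented. It uses threading locks, so it only covers threads inside a single process. A prefork MPM or a per-server lock would need a cross-process primitive underneath, such as the global mutexes or a lock file.

```python
import threading
from contextlib import contextmanager


class KeyedLocks:
    """Hand out one lock per key, creating locks on first use."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the dictionary itself
        self._locks = {}

    def _lock_for(self, key):
        with self._guard:
            if key not in self._locks:
                self._locks[key] = threading.Lock()
            return self._locks[key]

    @contextmanager
    def held(self, key):
        """Hold the lock for key for the duration of the with-block."""
        lock = self._lock_for(key)
        lock.acquire()
        try:
            yield
        finally:
            lock.release()


if __name__ == "__main__":
    locks = KeyedLocks()
    with locks.held("/cache/page1"):
        # Work on a different key is not blocked by this one.
        with locks.held("/cache/page2"):
            print("holding two independent keys")
```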


Daniel Popowich
---
http://home.comcast.net/~d.popowich/mpservlets/





Re: _apache._global_lock theory

2005-08-09 Thread Jim Gallacher

Daniel Popowich wrote:

Jim Gallacher writes:


Daniel Popowich wrote:


The recent discussion of max locks and deadlocking issues with
_apache._global_(un)?lock() are timely for me:

I'm in the middle of writing a caching module for mod_python servlets
so a developer can have the output of a servlet cached, keyed on the
hash of the URI, for future requests.  The goal, of course, is to
increase throughput for dynamic pages with long shelf life (e.g.,
content manager updates database once per day, so the HTML only needs
to be generated once per day, not per request).

I need locking.  My first gut instinct was to go to the fcntl module
(lockf), but this is only available for unix.  My second gut instinct
was: so what?  :-)

Wanting a true x-platform tool I then thought of mod_python.Session
needing locks, poked around the code, and saw how it uses
_apache._global_lock().  Further poking around showed me how psp code
caching uses the function.  About the time I began worrying about
deadlock issues I found the thread on this list discussing the same
problems.

The solution for Session and psp code caching is to explicitly use lock
id 0.  This works as long as a module does not hold the lock for the
whole request, but unlocks immediately after acquiring the needed
resource.  Fine.


Just to be clear, sessions use the locks above index 0 for session 
locking. The session id is hashed to determine which index is used. 
DbmSession uses lock index 0 to lock the dbm file for reading and 
writing the persistent session data. This is independent of the session 
lock.



Right, not sessions proper, but by the mod_python.Session module for
dbm file reading.



So, my question: is this the recommended way for mod_python framework
developers to acquire x-platform global locks?  Explicitly use lock
id 0?  If so, is this a secret or should it be documented?


I don't know if it's recommended, but I don't see a problem as long as 
the lock is held briefly and you make sure you unlock it when you are 
done. I suspect it is undocumented because it was never documented as 
opposed to some larger conspiracy.



I guess I was too tongue-in-cheek...my question is: Is it not
documented on *purpose*?  Perhaps it should be documented for internal
developers and framework developers?


No, actually I understood your cheek. I was just too lazy to put in a 
smiley after my comment. I shall correct that now. :) And a winkey for 
good measure. ;)





I used another cross platform approach in filesession_cleanup() in
Session.py.  I wanted to make sure only one request at a time was
running the cleanup, and used the os.open() call to exclusively open
a guard file. (OK, not a guard file, but my brain just went
blank. Hopefully you get the idea.)



I'm with ya...  :-)



Here is a code snippet:



Thanks for the code...maybe I'll try both (your code and
_apache._global_lock()) and benchmark my caching code with ab.


Thinking out loud here...wouldn't it be good for mod_python to provide
a facility for global locking based on some key?  By default, the lock
is per interpreter, but optionally per server?  Given the oddities of
python programming within an apache environment, especially a prefork
MPM environment, it seems it would be a most valuable service.  The
Session, psp and 3rd-party locking (e.g. mpservlets) could all share
the same code.



That discussion will have to wait for another time. Time to call it a day.

Regards,
Jim