Re: Worrying code in mod_python.publisher module importer.
On 02/02/2006, at 5:54 PM, Nicolas Lehuen wrote:

Having read your work on Vampire (and its module importing mechanism) I'm pretty sure it won't be long.

The new importer is actually a complete rewrite and some things are done quite differently to what was done in Vampire. I have in effect rewritten most of what is currently in the mod_python.apache module, which is why you kept seeing me posting all manner of weird problems in JIRA of late.

You may be happy to know though that only a one line change is required in your current incarnation of mod_python.publisher. Namely:

    module = page_cache[req]

gets changed to:

    module = apache.import_module(req.filename)

That may even tell you something about what I will be proposing. And yes it is backwards compatible.

I can see where you're going. Is it a module cache separate from sys.modules like I wrote in mod_python.publisher?

Yes.

but with a backwards compatible interface through apache.import_module ?

Yes. First argument can be a full pathname, which supports the way that mod_python.publisher works, or it can be a module name as before, in which case a search is made in appropriate directories relevant to context of the request being handled. If still can't be found, will pass it off completely to bog standard Python import which will search sys.path.

Plus module dependencies management like in Vampire ?

Yes. And much more which will hopefully make it easier to use and avoid PythonPath/sys.path pollution, avoid having to have fixed absolute paths in Apache configuration files and avoid problems with mixing in of import statement.

I really want to do some documentation before unleashing it though, with all the reasoning behind why things are done certain ways. To get to that point though, I have first been wanting to log JIRA entries for all the significant issues so reference can be made to them in part as justification.

Thus, I do have a plan, I just need the time. :-)

Graham
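For readers following along, here is a rough sketch of the general idea discussed in this message: a module cache keyed on the file's full pathname and kept separate from sys.modules. It is not the new importer and not the code behind apache.import_module; the names `import_by_path` and `_module_cache` are made up for illustration, and it is written for a current Python 3 rather than the Python 2.x of the thread.

```python
import os
import types

# Hypothetical cache: absolute path -> (mtime, module).
# This is an assumption for illustration, not mod_python's data structure.
_module_cache = {}


def import_by_path(path):
    """Load the Python file at `path` as a module without touching sys.modules.

    The module is re-executed only when the file's modification time changes.
    """
    path = os.path.abspath(path)
    mtime = os.stat(path).st_mtime

    cached = _module_cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]

    # Build a fresh module object and run the file's code inside it.
    name = os.path.splitext(os.path.basename(path))[0]
    module = types.ModuleType(name)
    module.__file__ = path
    with open(path) as source:
        code = compile(source.read(), path, "exec")
    exec(code, module.__dict__)

    _module_cache[path] = (mtime, module)
    return module
```

Because nothing is stored in sys.modules, two files with the same base name in different directories do not collide, which is one of the problems a path-keyed cache is meant to avoid. Dependency tracking and the fallback search described above are left out of this sketch.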
Re: Python 2.2 support
I am hereby happy to tell you that by removing the call to enumerate() in the publisher code, the whole test suite passes on Python 2.2 without any further patch or hack. I've checked in the modification which this time should not pose any problem since it's pretty basic and non intrusive. Regards, Nicolas 2006/2/2, Graham Dumpleton [EMAIL PROTECTED]: On 02/02/2006, at 5:42 PM, Nicolas Lehuen wrote: That's it ! People with Python 2.2 could use PythonImport mod_python.python22 INTERPRETER_NAME in their configuration file to make sure mod_python supports Python 2.2. The only problem is the need to provide an interpreter name, which complicates things a little bit in the case of the test suite. Then again, the only thing which prevents Python 2.2 support right now is the use of enumerate(), so we could just check whether we could do without enumerate() and support Python 2.2 out of the box. The code didn't used to use enumerate() in that one place and it worked then. Just means having to have a counter that is manually incremented. Don't have access to code right now, else would post changed code. :-) Graham
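As an illustration of the "counter that is manually incremented" mentioned above, here is a minimal sketch. It is not the actual change checked in to the publisher code; the function name `numbered` is invented for the example.

```python
def numbered(items):
    """Return (index, item) pairs without calling enumerate()."""
    pairs = []
    index = 0  # manually maintained counter
    for item in items:
        pairs.append((index, item))
        index += 1
    return pairs
```

The loop produces the same pairs as enumerate(), at the cost of two extra lines, and uses nothing that is missing from Python 2.2.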
Re: Python 2.2 support
Jim Gallacher writes:

Daniel J. Popowich wrote:

Regardless, I do not think it is within the scope of mod_python developers to keep users forward-compatible with the underlying python version. Sorry, but IMHO, this is not scalable software engineering.

I'll re-read this paragraph after a good sleep, but right now I'm not sure if you mean we shouldn't be using features only available in newer python versions, or we shouldn't worry about compatibility with older python versions?

AND...

Nicolas Lehuen writes:

Daniel : I think that from python22 import * is no more troublesome than the from __future__ import generators that you have to put in any source code that features generators and needs to run from Python 2.2 and upward. But that's a matter of taste.

I can reply to both in one swell foop:

I mean we cannot overly concern ourselves with earlier versions of python. I say overly because I do believe, very strongly, that we need to say definitively this-version-of-mod_python is dependent on this-version-of-python-and-above. So, we need to be careful when making such declarations, but once agreed upon and published, then we need not worry ourselves about previous versions. Any hacks of the sort I recommended (PythonImport based ones) are out of the mainstream and published in the FAQ or as examples in the source tarball, but not included in the installed library.

Why not in the official library? Because of the scalability of such solutions; actually the lack of scalability. I disagree that 'from python22 import *' (or PythonImport hacks) are no more troublesome than 'from __future__ import XYZ' because it all depends on the point of view of the programmer. Of course, from the perspective of the 2.2 user there is no difference: it's just one import, but that's not our perspective...we're not explicitly 2.2 users...we're developers of a platform based on a language. If we offer 2.2 backwards compatibility we have to KNOW with certainty we've covered our 2.3 tracks.
Personally, I want to develop mod_python, not reimplement 2.3 in 2.2.

If I'm programming in a 2.2 universe and the publishers of the language have said in 2.3 such-and-such a feature will be made available and you can use it now with 'from __future__' that is very useable and scalable. The list of features is small and presumably the interoperabilities well tested. When 2.3 comes out and I adopt its use, my code doesn't change.

If I'm programming in a 2.3 universe and I'm worried about users of my software who use previous versions such as 2.2 I have to know, definitively, every feature of 2.3 I'm using not in 2.2. Use of enumerate is easy to find. But there are WAY more diffs than enumerate and do we know for a fact our test cases exercise everything? I ran the following command inside the 2.3 library documentation directory:

    % grep 'New in version 2\.3' *.html|wc -l
    148

Are we sure we're not using any of the 148? Of course, it gets worse as time goes on. The diffs between 2.4 and 2.2? Whew.

It may be painful, but as developers of mod_python we HAVE to know what version of python it is based on and we can use only that version to code, BUT have to test ALL versions higher. I.e., if we say mod_python is based on python 2.2 and higher then we can only develop in 2.2, but test 2.2, 2.3 and 2.4. If we say it's based on 2.3 then we develop with 2.3 and test 2.3 and 2.4, etc.

Following this logic it makes sense to keep moving forward with python (else 2.2 code begins to break because it's using deprecated code in future versions and/or our testing becomes very painful as we need to certify our old code against newer versions). My gut says any major release of mod_python be based on one major.minor release lower than the currently available python. So, mod_python 3.2 is based on python 2.3; mod_python 3.3 will probably be based on python 2.4 (because 2.5 will be out by then).
Cheers, Daniel Popowich --- http://home.comcast.net/~d.popowich/mpservlets/ PS If it's not obvious I'm gearing up to get way more involved...I've been waiting (patiently) for 3.2 to be released and jump in with new 3.3 development...I guess I'm chomping at the bit...
Re: 3.2.6 or not
My official vote is eventually -1 for 3.2.6, see the previous discussion for why I've changed my mind. However I'm +1 on releasing 3.2.7 with a restrained testing period, not a long one like for 3.2.6.

Regards,
Nicolas

2006/2/2, Jim Gallacher [EMAIL PROTECTED]:

I know you said no discussion Grisha, but can I have 2 ballots? ;)

-1 If Graham thinks his conn handler fix is good, let's do 3.2.7 today.
+1 If Graham is not sure, we release 3.2.6 now as is, and do a 3.2.7 bugfix in the next 4 to 6 weeks after digging into _conn_read issue further.

So, I guess that makes my official vote a +0. Over to you Graham. No pressure though. :)

Jim

(Dang, it makes me feel dirty to waffle on my first official vote that way).

Gregory (Grisha) Trubetskoy wrote:

OK, I know we've had some votes on this before, but I'd like to put this in a separate thread where it's not intermixed with all kinds of other things. This is a vote for the core group.

We can release the 3.2.6 tarball as is or fix the connection handler bugs (there are two of them - the buffer pointer and eagain condition Graham tracked down) and release a 3.2.7 (or 3.2.6.1).

The rationale for disregarding those known issues is that the connection handler is hardly used by anyone. The rationale for NOT disregarding is that we claim this to be a stable release, and given our slow release cycle, I imagine 3.2.6 will be around for a while.

Anyhow - *the core group* (you know who you are), if you think 3.2.6 should be released as is, send in your +1. Let's keep this thread strictly a vote, without it turning into a discussion (we can discuss things in other threads).

My official vote is +0. (To see what this means read http://httpd.apache.org/dev/guidelines.html)

Grisha
Re: svn commit: r374257 - in /httpd/mod_python/trunk: lib/python/mod_python/cache.py test/test.py
I'm getting a unit test failure.

FAIL: test_publisher_cache (__main__.PerRequestTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 1836, in test_publisher_cache
    self.fail(
  File "/usr/lib/python2.3/unittest.py", line 270, in fail
    raise self.failureException, msg
AssertionError: The publisher cache has reloaded a published module even though it wasn't modified !

Although it's not related to the failure I'd avoid the use of time.clock() in the test function as the behaviour is different on Windows and UNIX, which always makes me nervous. I'd prefer a simple time.time().

I'm investigating the failure now.

Jim

[EMAIL PROTECTED] wrote:

Author: nlehuen
Date: Wed Feb 1 21:17:13 2006
New Revision: 374257

URL: http://svn.apache.org/viewcvs?rev=374257&view=rev
Log:
Changed the mod_python.cache.FileCache.check() method so that it stat() then open() the file, rather than open() it and fstat() it.
Added a unit test to check whether the publisher cache is doing his job correctly.

Modified:
    httpd/mod_python/trunk/lib/python/mod_python/cache.py
    httpd/mod_python/trunk/test/test.py

Modified: httpd/mod_python/trunk/lib/python/mod_python/cache.py
URL: http://svn.apache.org/viewcvs/httpd/mod_python/trunk/lib/python/mod_python/cache.py?rev=374257&r1=374256&r2=374257&view=diff
==============================================================================
--- httpd/mod_python/trunk/lib/python/mod_python/cache.py (original)
+++ httpd/mod_python/trunk/lib/python/mod_python/cache.py Wed Feb 1 21:17:13 2006
@@ -23,7 +23,7 @@
 # Loads Python 2.2 compatibility module
 from python22 import *

-from os import fstat
+from os import stat
 from time import time, mktime
 from rfc822 import parsedate
 from calendar import timegm
@@ -254,19 +254,16 @@
         self.mode=mode

     def check(self, key, name, entry):
-        opened = file(key, self.mode)
-
-        timestamp = fstat(opened.fileno())[-2]
+        timestamp = stat(key).st_mtime

         if entry._value is NOT_INITIALIZED:
             entry._timestamp = timestamp
-            return opened
+            return file(key, self.mode)
         else:
             if entry._timestamp != timestamp:
                 entry._timestamp = timestamp
-                return opened
+                return file(key, self.mode)
             else:
-                opened.close()
                 return None

     def build(self, key, name, opened, entry):
@@ -380,7 +377,7 @@
         opened.close()

 class HttpModuleCache(HTTPCache):
-    """ A module cache. Give it a file name, it returns a module
+    """ A module cache. Give it an HTTP URL, it returns a module
         which results from the execution of the Python script it contains.
         This module is not inserted into sys.modules.
     """

Modified: httpd/mod_python/trunk/test/test.py
URL: http://svn.apache.org/viewcvs/httpd/mod_python/trunk/test/test.py?rev=374257&r1=374256&r2=374257&view=diff
==============================================================================
--- httpd/mod_python/trunk/test/test.py (original)
+++ httpd/mod_python/trunk/test/test.py Wed Feb 1 21:17:13 2006
@@ -1803,6 +1803,62 @@
         if (rsp != "test traversable instance ok"):
             self.fail(`rsp`)

+    def test_publisher_cache_conf(self):
+        c = VirtualHost("*",
+                        ServerName("test_publisher"),
+                        DocumentRoot(DOCUMENT_ROOT),
+                        Directory(DOCUMENT_ROOT,
+                                  SetHandler("mod_python"),
+                                  PythonHandler("mod_python.publisher"),
+                                  PythonDebug("On")))
+        return str(c)
+
+    def test_publisher_cache(self):
+        print "\n  * Testing mod_python.publisher cache"
+
+        def write_published():
+            published = file('htdocs/temp.py','wb')
+            published.write('import time\n')
+            published.write('LOAD_TIME = time.clock()\n')
+            published.write('def index(req):\n')
+            published.write('    return "OK %f"%LOAD_TIME\n')
+            published.close()
+
+        write_published()
+        try:
+            rsp = self.vhost_get("test_publisher", path="/temp.py")
+
+            if not rsp.startswith('OK '):
+                self.fail(`rsp`)
+
+            rsp2 = self.vhost_get("test_publisher", path="/temp.py")
+            if rsp != rsp2:
+                self.fail(
+                    "The publisher cache has reloaded a published module"
+                    " even though it wasn't modified !"
+                )
+
+            # We wait three seconds to be sure we won't be annoyed
+            # by any lack of resolution of the stat().st_mtime member.
+            time.sleep(3)
+            write_published()
+
+            rsp2 = self.vhost_get("test_publisher",
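The commit log above describes changing FileCache.check() to stat() the file first and open() it only when a reload is needed. A standalone sketch of that pattern, in current Python 3, might look like the following. It is only an approximation of the idea: `CacheEntry` and the free function `check` are invented names for the example, not the classes in mod_python.cache.

```python
import os


class CacheEntry:
    """Hypothetical stand-in for a cache entry; holds the last seen mtime."""

    def __init__(self):
        self.timestamp = None


def check(path, entry, mode="r"):
    """Return an open file if `path` is new or modified, otherwise None.

    The file is stat()ed first, so an unchanged file is never opened.
    """
    timestamp = os.stat(path).st_mtime
    if entry.timestamp is None or entry.timestamp != timestamp:
        entry.timestamp = timestamp
        return open(path, mode)
    return None
```

Compared with opening first and calling fstat() on the descriptor, this avoids an open/close pair on every cache hit. The trade-off is a small window between stat() and open() in which the file could change.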
Enhancements for better content negotiation
I have a bunch of code I was thinking of contributing to mod_python, but would like some opinions before doing so (because I don't know if this is the best place)...

Basically I wrote some utility functions which can be used to assist with content negotiation; such as parsing the various Accept-* headers *correctly*. I am in fact using these in production systems now because I do quite a bit of content negotiation.

I often see a bunch of example code (or even production code) which does something like,

    if req.headers_in['Accept'].find('image/xyz') >= 0:

That is really broken per RFC 2616 (HTTP 1.1)! It totally ignores the quality factors and other rules (see sections 14.1 - 14.4). For example if the Accept header sent was image/xyz;q=0, then in fact this is the user agent saying to NEVER send image/xyz!

I think this sloppy way of testing is because the syntax and rules for accept headers are actually fairly complex. So this is a perfect opportunity for reusable functions which assist with this (much like the apache.parse_qs function).

Essentially I have functions that:

  1. Parse and sort any Accept-* header according to the RFC's BNF
  2. Negotiate the best content type
  3. Negotiate the best language
  4. Negotiate the best encoding
  5. Negotiate the best charset

It also knows how to handle wildcards, hierarchical language tags, charset name aliases, etc. And you can tell it to ignore the super wildcards like */* too.

As a quick example you can do something like

    acc = req.headers_in['Accept']
    ct = acceptable_content_type( acc, ['text/html','application/xhtml+xml'] )

and it will tell you which of the two formats the browser supports/prefers, if either, according to all the complex rules in the RFC.

I have some preliminary questions:

  * Is this something that seems useful to others?
  * Is mod_python util a preferable place to consider putting these, or maybe this should perhaps go to something larger like WSGI or some other Web-SIG project?

--
Deron Meranda
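To make the q=0 point concrete, here is a small sketch of quality-aware matching for the Accept header only. It is not Deron's code (which is not shown in this message) and it covers only part of RFC 2616 section 14.1: it honours q values and the type/* and */* wildcards, but ignores other media-type parameters. The names `parse_accept` and `best_content_type` are made up for the example.

```python
def parse_accept(header):
    """Split an Accept header into a list of (media_range, q) pairs."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # the default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def _specificity(media_range, content_type):
    """3 for an exact match, 2 for type/*, 1 for */*, 0 for no match."""
    if media_range == content_type:
        return 3
    range_type, _, range_subtype = media_range.partition("/")
    if range_subtype == "*" and range_type == content_type.partition("/")[0]:
        return 2
    if media_range == "*/*":
        return 1
    return 0


def best_content_type(header, offered):
    """Return the offered type the client prefers, or None if none is acceptable."""
    ranges = parse_accept(header)
    best, best_q = None, 0.0
    for content_type in offered:
        match_q, match_spec = 0.0, 0
        for media_range, q in ranges:
            spec = _specificity(media_range, content_type.lower())
            if spec > match_spec:  # the most specific range decides the q
                match_q, match_spec = q, spec
        if match_q > best_q:  # ties keep the earlier offered type
            best, best_q = content_type, match_q
    return best
```

With this sketch, best_content_type("image/xyz;q=0", ["image/xyz"]) returns None, whereas the substring test quoted above would wrongly treat image/xyz as acceptable.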
Version 3.3 and beyond .......
Daniel J. Popowich wrote .. PS If it's not obvious I'm gearing up to get way more involved...I've been waiting (patiently) for 3.2 to be released and jump in with new 3.3 development...I guess I'm chomping at the bit... We probably want to defer until after 3.2.7 (final) is released to have any serious discussion about what should constitute version 3.3, but am still curious to know at this point where your interests in 3.3 lie. Is it simply to help finish up eliminating all these known issues/bugs or do you have other ideas in mind as to the direction of mod_python? Graham
[jira] Created: (MODPYTHON-118) Allow PythonImport to optionally call function in module.
Allow PythonImport to optionally call function in module.
---------------------------------------------------------

Key: MODPYTHON-118
URL: http://issues.apache.org/jira/browse/MODPYTHON-118
Project: mod_python
Type: Wish
Components: core
Versions: 3.3
Reporter: Graham Dumpleton

PythonImport can currently be used to specify that a module be imported into a named interpreter at the time that an Apache child process is initiated. Because all it does is import the module, if any specific action needs to be triggered, it has to be done as a side effect of the module import.

Triggering actions as a side effect of a module import is generally not a good idea as failure of the side effect action will cause the import of the module itself to fail if the code doesn't properly handle this situation. It is generally preferable to import the module and when that has succeeded only then call a specific function contained in the module to initiate the action.

Thus proposed that PythonImport be able to take an optional function to be called upon successful import of the named module. The syntax would be like that for Python*Handler directives.

    PythonImport mymodule::myfunc myinterpreter

This would have the effect of loading module mymodule in the interpreter called myinterpreter and then calling mymodule.myfunc(). No arguments would be supplied to the function when called.

Another benefit of this feature would be that it would allow a single module to be able to contain a number of special initialisation functions that might be triggerable. The user could selectively call those that might be required.

    PythonImport mymodule::enable_caching myinterpreter
    PythonImport mymodule::disable_logging myinterpreter

At the moment to do that, a distinct module would need to be created for each where the only thing in the module is the call of the function.

Note that in using something similar to mod_python option/config values, am talking here about options that must be able to only be enabled/disabled in one spot. The problem with mod_python option/config values in Apache is that different parts of the document tree can set them to different values, which for some things is actually a problem, such as the case with PythonAutoReload.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
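The proposed semantics (import the module first, then call the named function with no arguments only if the import succeeded) can be sketched in plain Python as follows. This only illustrates the behaviour described in the issue; it is not how mod_python processes the directive, and `run_python_import` is a hypothetical name.

```python
import importlib


def run_python_import(spec):
    """Handle a `module` or `module::func` specification.

    The module is imported first; if a function name was given, it is
    looked up and called with no arguments only after the import succeeded.
    """
    module_name, separator, func_name = spec.partition("::")
    module = importlib.import_module(module_name)
    if separator:
        getattr(module, func_name)()
    return module
```

For example, run_python_import("mymodule::enable_caching") would import mymodule and then call mymodule.enable_caching(), so a failure inside enable_caching() no longer prevents the module itself from being imported.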
Re: Version 3.3 and beyond .......
Very interesting. I'll only comment on one issue right now.

Daniel J. Popowich wrote ..

o And...no surprise...I'd like to try to sell mpservlets for inclusion in the main distro. No tears if it's not, but I think it fills a void and I'd like to make a case for its inclusion.

I concur that a servlet based approach is certainly just as viable an approach as any other. I also know that the opinion of some in the past (no names) has been against adding more higher level framework-like layers.

Personally I feel there is a sort of middle ground and I have my own grand plan whereby a more componentised approach to handlers may mean that supporting a servlet based approach is a logical extension to that idea. Although this may still mean that adding in mpservlets as is might not be appropriate, it wouldn't mean that one could not provide an equivalent servlet-like handler component which would achieve the same end. Thus your experience from implementing and using mpservlets could be very valuable in the context of what I have in mind.

Thus am sure I will have discussions about all this again at some point. Anyway, my ideas can't come to fruition until the module importing system is fixed. Thus, one step at a time, the first being to get 3.2.7 over and done with. :-)

Thanks for the detailed feedback.

Graham
Re: svn commit: r374257 - in /httpd/mod_python/trunk: lib/python/mod_python/cache.py test/test.py
Jim Gallacher wrote: Graham Dumpleton wrote: Jim Gallacher wrote .. I'm getting a unit test failure. FAIL: test_publisher_cache (__main__.PerRequestTestCase) -- Traceback (most recent call last): File test.py, line 1836, in test_publisher_cache self.fail( File /usr/lib/python2.3/unittest.py, line 270, in fail raise self.failureException, msg AssertionError: The publisher cache has reloaded a published module even though it wasn't modified ! Although it's not related to the failure I'd avoid the use of time.clock() in the test function as the behaviour is different on Windows and UNIX, which always makes me nervous. I'd prefer a simple time.time(). It isn't time.time(). I wasn't suggesting time.clock() was the problem, just that I'd prefer any function we use in the unit tests has identical behaviour on all platforms, at least to the extent that is possible. It is because you probably have a prefork/worker MPM. The test as written will only reliably work for winnt MPM. Doh! Prefork bites us in the a** yet again. :) On UNIX boxes the subsequent requests could be handled by a different child process. The configuration as to how many servers to start is: IfModule(prefork.c, StartServers(3), MaxSpareServers(1)), IfModule(worker.c, StartServers(2), MaxClients(6), MinSpareThreads(1), MaxSpareThreads(1), ThreadsPerChild(3), MaxRequestsPerChild(0)), Does that make sense, or did I miss something. Yes, that makes sense. Testing it now. I can't seem to get the publisher_cache test to work for mpm-prefork, and I'm thinking it may not be possible to do so for the test as it's currently conceived. I don't see any way that we can guarantee that the same child process will serve each request in this test. Perhaps someone else can take a look before my head explodes. Jim
Re: svn commit: r374257 - in /httpd/mod_python/trunk: lib/python/mod_python/cache.py test/test.py
Graham Dumpleton wrote:

Jim Gallacher wrote ..

It is because you probably have a prefork/worker MPM. The test as written will only reliably work for winnt MPM.

Doh! Prefork bites us in the a** yet again. :)

On UNIX boxes the subsequent requests could be handled by a different child process. The configuration as to how many servers to start is:

    IfModule(prefork.c, StartServers(3), MaxSpareServers(1)),
    IfModule(worker.c, StartServers(2), MaxClients(6), MinSpareThreads(1), MaxSpareThreads(1), ThreadsPerChild(3), MaxRequestsPerChild(0)),

Does that make sense, or did I miss something.

Yes, that makes sense. Testing it now.

I can't seem to get the publisher_cache test to work for mpm-prefork, and I'm thinking it may not be possible to do so for the test as it's currently conceived. I don't see any way that we can guarantee that the same child process will serve each request in this test. Perhaps someone else can take a look before my head explodes.

Bar run httpd in single process mode, ie., -X / -DONE_PROCESS options, am not sure how you could get close to a reliable test for all configurations. I already figured it was too hard and why I suggested that the test simply be removed, or at least be disabled for now until we have a better idea.

I missed that, so I guess I need to pay more attention. I think at this point disabling it makes the most sense. Doing so now.

Jim