Re: MaxRequestsPerChild; which request am I?
On Fri, 4 Apr 2003, Brian Reichert wrote: In messing with Apache 1.x, is there a way, via mod_perl, of a request knowing how many requests have been served by the current child? $request++; That's what I do in some handler, and then I log it along with the PID. Eh? I'm confused. What is '$request' in that example? If you mean it's the request object, then that doesn't do what I expect. No, it's a simple counter. It's just a variable in some module that counts requests. -- Bill Moseley [EMAIL PROTECTED]
Re: MaxRequestsPerChild; which request am I?
On Fri, 4 Apr 2003, Brian Reichert wrote: Dunno if someone has a good answer, or a suggestion of a better forum for this: Apache has a configuration directive: MaxRequestsPerChild http://httpd.apache.org/docs/mod/core.html#maxrequestsperchild In messing with Apache 1.x, is there a way, via mod_perl, of a request knowing how many requests have been served by the current child? $request++; That's what I do in some handler, and then I log it along with the PID. -- Bill Moseley [EMAIL PROTECTED]
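A sketch of the counter Bill describes — a lexical in a module that lives as long as the child process (the package name and log text are illustrative, not from the thread):

```perl
package My::RequestCounter;
use strict;
use warnings;

# This lexical persists between requests within one Apache child,
# so it counts how many requests this child has served.
my $count = 0;

sub bump { return ++$count }    # call once per request

sub handler {
    my $r = shift;
    my $n = bump();
    warn "pid $$ serving request $n\n";    # ends up in the error log
    return 0;    # Apache::Constants::OK
}

1;
```

Under MaxRequestsPerChild you could compare the counter against the configured limit to see how close the child is to retirement.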
Re: Basic Auth logout
On Fri, 7 Mar 2003, Francesc Guasch wrote: this has been asked before, and I've found in the archives there is no way I could have a logout page for Basic Auth in Apache. Is there nothing I can do? This is required only for the development team, so we need to let Mozilla or IE forget the username and password. It all depends on the browser and version. I have been able to log out some versions of IE by having a link to another protected resource with the same auth name but a different username and password (in the link). You are better off maintaining a session on the server. -- Bill Moseley [EMAIL PROTECTED]
Re: Authorization question
On Thu, 27 Feb 2003, Perrin Harkins wrote: Jean-Michel Hiver wrote: Yes, but you're then making the authorization layer inseparable from your application layer, and hence you lose the benefit of using separate handlers. It's pretty hard to truly separate these things. Nobody wants to use basic auth, which means there is a need for forms and handlers. Then you have to keep that information in either cookies or URLs, and there is usually a need to talk to an external database with a site-specific schema. The result is that plug-and-play auth schemes only work (unmodified) for the simplest sites. Anyone using PubCookie? http://www.washington.edu/pubcookie/ -- Bill Moseley [EMAIL PROTECTED]
Is Sys::Signal still needed?
Searching the archives I don't see much discussion of Sys::Signal. Is it still needed to restore sig handlers? Thanks, -- Bill Moseley [EMAIL PROTECTED]
Re: web link broken when access cgi-bin
On Sunday 22 December 2002 03:49, Ged Haywood wrote: Hi there, On Sat, 21 Dec 2002, eric lin wrote: The image file:///home/enduser/mytest.jpg cannot be displayed, because it contains errors I think I understand your question but I am not sure of it. It seems that you have sent a request to Apache, received a response, And sent messages about using Windows to a Linux list, and CGI questions to mod_perl list and seems to ignore the many requests to read some basic CGI tutorials. I'd guess troll if he wasn't so clueless. ;)
Re: web link broken when access cgi-bin
On Sun, 22 Dec 2002, Richard Clarke wrote: And sent messages about using Windows to a Linux list, and CGI questions to mod_perl list and seems to ignore the many requests to read some basic CGI tutorials. I'd guess troll if he wasn't so clueless. ;) Since when did mod_perl becomes Linux only? oops, I meant to write: And sent messages about using Windows to a Linux list -- Bill Moseley [EMAIL PROTECTED]
Re: Fw: OT - Santa uses PERL
At 11:17 AM 12/20/02 +0200, Issac Goldstand wrote: http://www.perl.com/pub/a/2002/12/18/hohoho.html That sounds a lot like Perrin's story. Didn't he save Christmas one year? -- Bill Moseley mailto:[EMAIL PROTECTED]
Can't get nested files to work in Perl section
mod_perl 1.27 / httpd 1.3.27 In the <Perl> httpd.conf below, test.cgi is returned as the default type, text/plain, whereas test2.cgi is run as a CGI script. Do I have this set up incorrectly? In a standard httpd.conf file it's allowable to have Files sections nested within Directory, of course.

<Perl>
#!perl
$User        = 'nobody';
$Group       = 'users';
$ServerRoot  = '/home/moseley/test';
$TypesConfig = '/dev/null';
$Listen      = '*:8000';
$VirtualHost{'*:8000'} = {
    ServerName   => 'foo',
    DocumentRoot => '/home/moseley/test',
    ErrorLog     => 'logs/error_log.8000',
    TransferLog  => 'logs/error_log.8000',
    Files => {
        'test2.cgi' => {
            Options    => '+ExecCGI',
            SetHandler => 'cgi-script',
        },
    },
    Directory => {
        '/home/moseley/test' => {
            Allow => 'from all',
            Files => {
                'test.cgi' => {
                    Options    => '+ExecCGI',
                    SetHandler => 'cgi-script',
                },
            },
        },
    },
};
__END__
</Perl>

-- Bill Moseley mailto:[EMAIL PROTECTED]
[OT] Ideas for limiting form submissions
I've got a mod_perl feedback form that sends mail to a specific address. Spammers have their bots hitting the form now. The tricks I know of are: - generate a random image of numbers and make the user type in the numbers on the form. Painful for the user, and spammers probably have OCR! - require an email and send a confirmation email (like a list subscription) and whitelist some email addresses. But we want to allow anonymous submissions. - limit submissions by IP number to one every X minutes. AOL users may get blocked. - md5 the submission and block duplicates (should do this anyway). BTW -- what would you recommend for caching the md5 strings? Cache::Cache or DBM? I suppose a Cache::Cache file cache would be the easiest. Any other ideas on the easy-to-implement side? -- Bill Moseley [EMAIL PROTECTED]
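The md5-duplicate idea can be sketched independently of the cache backend; the hash below is only a stand-in for whatever Cache::Cache or DBM store you settle on:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Stand-in for Cache::FileCache: a hash keyed by the digest of the
# submission body. A real cache would also give you expiry for free.
my %seen;

# Returns true if we've already accepted an identical submission.
sub is_duplicate {
    my ($body) = @_;
    my $digest = md5_hex($body);
    return $seen{$digest}++ ? 1 : 0;
}
```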
Re: [OT] Ideas for limiting form submissions
At 02:51 PM 12/18/02 -0500, Daniel Koch wrote: Check out Gimpy, which I believe is what Yahoo uses: http://www.captcha.net/captchas/gimpy/ I'm thinking of something along those lines. The problem is this is on Solaris 2.6 w/o root, and I'll bet it would take some time to get The Gimp and GTK and whatever libs installed. So, I'm thinking about creating a directory of, say, 20 images of words. On the initial request the form creates a random key, and makes that a symlink to one of the images selected at random. That will be the img src link. Then md5 the symlink with a secret word to create a hidden field. The submitter will have to type in the word displayed in the image. On submit, md5 all the symlinks with the secret word until a match is found -- match the submitted word text with the real image name, then unlink the symlink and accept the request. Cron can remove old symlinks. If the spammers put in the work to figure out the word by check-summing the images I can use ImageMagick to modify the images -- that could be a nice mod_perl handler. See any glaring holes? -- Bill Moseley mailto:[EMAIL PROTECTED]
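The symlink-plus-secret scheme described above, in code (the secret and the symlink names are made up for illustration):

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

my $secret = 'not-the-real-secret';    # assumption: never leaves the server

# Hidden form field: digest of the symlink name plus the secret, so a
# bot can't forge a token for an image it was never shown.
sub token_for {
    my ($symlink) = @_;
    return md5_hex($symlink . $secret);
}

# On submit, md5 each live symlink until one matches the token.
sub find_symlink {
    my ($token, @symlinks) = @_;
    for my $link (@symlinks) {
        return $link if token_for($link) eq $token;
    }
    return undef;
}
```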
RE: Cookie-free authentication
On Sat, 14 Dec 2002, Ron Savage wrote: Under Apache V 1/Perl 5.6.0 I could not get the Apache::AuthCookieURL option working which munged URLs without requiring cookies. I thought the problem was that Apache::AuthCookie was redirecting to your login script on logout instead of displaying your logout page. -- Bill Moseley [EMAIL PROTECTED]
Re: Yahoo is moving to PHP ??
At 02:50 PM 10/30/02 -0500, Perrin Harkins wrote: Mithun Bhattacharya wrote: No it is not being removed, but this could have been a very big thing for mod_perl. Can someone find out more details as to why PHP was preferred over mod_perl? It can't be just on a whim. Think about what they are using it for. Yahoo is the most extreme example of a performance-driven situation. I also wonder if it's cheaper/easier to hire and train PHP programmers than Perl programmers. -- Bill Moseley mailto:moseley;hank.org
RE: [OTish] Version Control?
At 04:47 PM 10/30/02 -0500, Jesse Erlbaum wrote: Web development projects can map very nicely into CVS. We have a very mature layout for all web projects. In a nutshell, it boils down to this: project/ + apache/ + bin/ That requires binary compatibility, though. I have a similar setup, but the perl and Apache are built separately on the target machine since my machines are linux and the production machine is Solaris. I only work on single servers, so things are a bit easier. I always cvs co to a new directory on the production machine and start up a second set of servers on high ports. That lets me (and the client) test on the final platform before going live. Then it's apache stop mv live old mv new live apache start kind of thing, which is a fast way to update. I'd love to have the Perl modules in cvs, though. Especially mod_perl modules. It makes me nervous upgrading mod_perl on the live machine's perl library. Should make more use of PREFIX, I suppose. Speaking of cvs, here's a thread branch: I have some client admin features that they update via web forms -- some small amount of content, templates, and text-based config settings. I currently log a history of changes, but it doesn't have all the features of cvs. Is anyone using cvs to manage updates made with web-based forms? -- Bill Moseley mailto:moseley;hank.org
Re: [OTish] Version Control?
At 03:21 PM 10/30/02 -0800, [EMAIL PROTECTED] wrote: We check all of our perl modules into CVS and it's a _MAJOR_ life saver. Keeps everyone on the same path, so to speak. I think I confused two different things: perl module source vs. installed modules. Do you check in the source or the installed modules? I keep the source of my perl modules under cvs, but not the perl library, i.e. the files generated from make install, which might include binary components. I use a PREFIX for my own modules, but I tend to install CPAN modules in the main perl library. My own modules get installed in the application directory tree so that there's still a top-level directory for the entire application/site. It does worry me that I'll update a CPAN module (or Apache::*) in the main Perl library and break something some day. (Although on things like updating mod_perl I have copied /usr/local/lib/perl5 before make install.) -- Bill Moseley mailto:moseley;hank.org
mod_perl-based registration programs?
Before I start rewriting... Anyone know of a mod_perl based program for registering people for events? The existing system allows people to sign up and cancel for classes and workshops that are offered at various locations and also for on-line classes. We have a collection of training workshops that are each offered a number of times a year, and are taught by a pool of instructors. Typically a few classes a week. It mails reminders a few days before the classes, sends class lists to the assigned instructor before their class, and normal database stuff for displaying, searching and reporting. Currently, billing is by invoice, but we would like an on-line payment option. Anyone know of something similar? Thanks, Bill Moseley mailto:[EMAIL PROTECTED]
Re: Logging under CGI
At 10:30 PM 06/10/02 -0400, Sam Tregar wrote: On Tue, 11 Jun 2002, Sergey Rusakov wrote: open(ERRORLOG, '>>/var/log/my_log'); print ERRORLOG "some text\n"; close ERRORLOG; This bit of code runs in every apache child. I worry about concurrent access to this log file under heavy apache load. Are there any problems on my way? You are correct to worry. You should use flock() to prevent your log file from becoming corrupted. See perldoc -f flock for more details. Maybe it's a matter of volume. Or size of string written to the log. But I don't flock, and I keep the log file open between requests and only reopen if stat() shows that the file was renamed. So far I've been lucky. -- Bill Moseley mailto:[EMAIL PROTECTED]
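For the record, the flock() approach Sam suggests is only a few lines; a sketch (the file name is a parameter purely for illustration):

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Append-mode open plus an exclusive lock keeps concurrent Apache
# children from interleaving partial lines in the shared log.
sub log_line {
    my ($file, $msg) = @_;
    open my $fh, '>>', $file or die "open $file: $!";
    flock $fh, LOCK_EX       or die "flock $file: $!";
    print {$fh} "$msg\n";
    close $fh;    # closing the handle releases the lock
}
```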
RE: [OT] MVC soup (was: separating C from V in MVC)
At 12:13 PM 06/08/02 +0100, Jeff wrote: The responsibility of the Controller is to take all the supplied user input, translate it into the correct format, and pass it to the Model, and watch what happens. The Model will decide if the instruction can be realised, or if the system should explode. I'd like to ask a more specific question about this. Really two questions: one about abstracting input, and a more mundane one about building links from data set in the model. I've gone full circle on handling user input. I used to try to abstract CGI input data into some type of request object that was then passed on to the models. But then the code to create the request object ended up needing to know too much about the model. For example, say for a database query the controller can see that there's a query parameter and thus knows to pass the request to the code that knows how to query the database. That code passes back a results object which the controller can then look at to decide if it should display the results, a no-results page, and/or the query form again. Now, what happens is that features are added to the query code. Let's say we get a brilliant idea that search results should be shown a page at a time (or did Amazon patent that?). So now we want to pass in the query, starting result, and the page size. What I didn't like about this is that I then had to adjust the so-called controller code that decoded the user input for my request object to include these new features. But really that data was of interest only to the model. So a change in the model forced a change in the controller. So now I have just been passing in an object which has a param() method (which, lately, has been a CGI object instead of an Apache::Request) so the model can have full access to all the user input. It bugs me a bit because it feels like the model now has intimate access to the user input. And for things like cron I just emulate the CGI environment.
So my question is: is that a reasonable approach? My second, reasonably unrelated question is this: I often need to make links back to a page, such as a 'next page' link. I like to build links in the view, keeping the HTML out of the model if possible. But for something like a 'next page' link that might contain a bunch of parameters, it would seem best to build the href in the model, which knows about all those parameters. Anyone have a good way of dealing with this? Thanks, P.S. and thanks for the discussion so far. It's been very interesting. -- Bill Moseley mailto:[EMAIL PROTECTED]
[OT] MVC soup (was: separating C from V in MVC)
I, like many, find these discussions really interesting. I always wish there was some write-up for the mod_perl site when all was said and done. But I guess one of the reasons it's so interesting is that there's more than one correct point of view. My MVC efforts often fall apart in the C and M separation. My M parts end up knowing too much about each other -- typically because of error conditions, e.g. data that's passed to an M that does not validate. And I don't want to validate too much data in the C, as the C ends up doing M's work. Anyone have links to examples of MVC Perl code (mostly controller code) that does a good job of M and C separation, and good ways to propagate errors back to the C? -- Bill Moseley mailto:[EMAIL PROTECTED]
Throttling, once again
Hi, Wasn't there just a thread on throttling a few weeks ago? I had a machine hit hard yesterday by a spider that ignored robots.txt. Load average was over 90 on a dual-CPU Enterprise 3500 running Solaris 2.6. It's a mod_perl server, but has a few CGI scripts that it handles, and the spider was hitting one of the CGI scripts over and over. They were valid requests, but coming in faster than they were going out. Under normal usage the CGI scripts are only accessed a few times a day, so it's not much of a problem to have them served by mod_perl. And under normal peak loads RAM is not a problem. The machine also has bandwidth limitations (a packet shaper is used to share the bandwidth). That combined with the spider didn't help things. Luckily there's 4GB, so even at a load average of 90 it wasn't really swapping much. (Well, not when I caught it, anyway.) This spider was using the same IP for all requests. Anyway, I remember Randal's Stonehenge::Throttle being discussed not too long ago. That seems to address this kind of problem. Is there anything else to look into? Since the front-end is mod_perl, it means I can use a mod_perl throttling solution, too, which is cool. I realize there are some fundamental hardware issues to solve, but if I can just keep the spiders from flooding the machine then the machine is getting by ok. Also, does anyone have suggestions for testing once throttling is in place? I don't want to start cutting off the good customers, but I do want to get an idea how it acts under load. ab to the rescue, I suppose. Thanks much, -- Bill Moseley mailto:[EMAIL PROTECTED]
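The crudest form of what Stonehenge::Throttle does — a fixed window of requests per IP — is easy to sketch (the window and limit below are made-up numbers):

```perl
use strict;
use warnings;

use constant WINDOW => 60;    # seconds per window (illustrative)
use constant LIMIT  => 30;    # requests allowed per window (illustrative)

my %hits;    # ip => [ window_start_time, count ]

# Returns true if this request is under the limit, false to reject it.
sub allow {
    my ($ip, $now) = @_;
    my $slot = $hits{$ip} ||= [ $now, 0 ];
    @$slot = ( $now, 0 ) if $now - $slot->[0] >= WINDOW;
    return ++$slot->[1] <= LIMIT ? 1 : 0;
}
```

In a real handler %hits would need to live somewhere shared (or the limit scaled down per child), since each Apache child gets its own copy of the hash.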
RE: mod_perl Cook Book
At 09:59 AM 04/06/02 +0100, Phil Dobbin wrote: It's definitely the book to buy _before_ the Eagle book. No, buy both at the same time. I think the Eagle gives a really good foundation, and it's very enjoyable reading (regardless of what my wife says!). I still think the Eagle book is one of the best books on my bookshelf. I have a couple of Apache-specific books and I learned a lot more about Apache from the Eagle than those. The cook book has been a great addition. -- Bill Moseley mailto:[EMAIL PROTECTED]
Re: Creating a proxy using mod_perl
At 05:11 PM 3/15/2002 +0300, Igor Sysoev wrote: On Fri, 15 Mar 2002, Marius Kjeldahl wrote: I guess these all suffer from the fact that the parameters have to be specified in httpd.conf, which makes it impossible to pass a url to fetch from in a parameter, right? So mod_rewrite with mod_proxy or mod_accel: RewriteRule /proxy_url=http://(.+)$ http://$1 [L,P] Note that 'proxy?url=' is changed to 'proxy_url='. Any concern about being an open proxy there? I'd want to only proxy the sites I'm working with. I'd rather cache the images locally, just in case you are working with a slow site or if they do something silly like check the referer on requests. Bill Moseley mailto:[EMAIL PROTECTED]
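One way to avoid ending up as an open proxy with a rule like that is to whitelist the upstream hosts first (the host names below are placeholders, not from the thread):

```apache
RewriteEngine On
# Only proxy for hosts we explicitly trust:
RewriteCond %{REQUEST_URI} ^/proxy_url=http://(www\.example\.com|images\.example\.com)/
RewriteRule ^/proxy_url=http://(.+)$ http://$1 [L,P]
```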
Re: [ANNOUNCE] The New mod_perl logo - results now in...
At 04:33 PM 03/15/02 -0500, Georgy Vladimirov wrote: I actually like the logo without the underscore. I don't think an underscore is very collaborative with art. The _ has always irritated me a little. I know that there is history and nostalgia involved here, but dropping the underscore, at least in the logo, is a nice evolution IMHO. I also agree with this, and it is one of the reasons (I think) I voted for that design. It's a graphic design, so I don't see that it needs to follow the Apache module naming convention exactly. Nor perl identifier names, either. Many of the designs offered didn't use the underscore either. And the design that won didn't use one. It's a design -- it doesn't have to be accurate to the name. Besides, if it changes does it mean that the winning design received no votes? ;) -- Bill Moseley mailto:[EMAIL PROTECTED]
[WOT] Google Programming Contest.
Sorry for the Way Off Topic, and sorry if I missed this on the list already: http://www.google.com/programming-contest/ They say C++ or Java. What, no Perl? -- Bill Moseley mailto:[EMAIL PROTECTED]
Re: New mod_perl Logo
At 07:29 PM 01/29/02 -0500, Chris Thompson wrote: Well, I'd like to just throw one idea into the mix. It's something that's bugged me for a long time, no better time than the present. mod_perl is a lousy name. I don't know about lousy, but I do agree. I brought this up on the docs-dev list: http://search.apache.org/archives/docs-dev/0236.html During the week I posted that I had run into PHP programmers at a computer show, more PHP programmers at a pub (2 in the afternoon -- more out-of-work programmers), and ended up talking with a couple of Java programmers one day. The amazing thing was they all had a completely weird idea about what mod_perl is or what it does. And all thought it was slow, old, dead, not-scalable technology. And that was from programmers, not managers. We all know there is a lot of misinformation out there. Marketing is not everything, but it's a lot! What we know of mod_perl is more than just perl+Apache, really. It's a development platform, or development suite. It can be anything our marketing department says it is. ;) In these tough economic times, repackaging might be helpful. Who knows? And for some of us we know that mod_perl is also something that makes up a chunk of our livelihood. So, the promotion of mod_perl is quite important, unless we want to start spending more afternoons with those PHP programmers down at the corner pub. So how would a group like the mod_perl community promote itself in new ways? Well, other professionals often have professional organizations or associations to represent and promote their members. I wonder if there are enough mod_perl programmers to support something like that. Even if there were, what could be done? Run a few print ads in magazines that system admins read? Hire an ad firm for help in developing our brand? mod_perl coffee mugs? (Tired of that old cup of Java?) Free mod_perl clinics? Hard to imagine any of that actually happening, really. So what's a group of programmers to do?
The new web site should help, to some degree, but I'm not sure it will change any manager's mind on the technology they pick to run their applications. Of course, most people here have access to big pipes. So, there's always bulk mail ads. I got mail just today saying that it's an effective way to advertise. In fact I got about ten of those today! -- Bill Moseley mailto:[EMAIL PROTECTED]
Re: META tags added as HTTP headers
At 01:20 AM 01/19/02 +0100, Markus Wichitill wrote: which part of an Apache/mod_perl setup is responsible for extracting META tags from generated HTML and adding them as HTTP headers (even with PerlSendHeaders Off)? That's lwp doing that, not Apache or mod_perl. HEAD http://www.apache.org 200 OK Cache-Control: max-age=86400 Connection: close Date: Sat, 19 Jan 2002 00:27:10 GMT Accept-Ranges: bytes Server: Apache/2.0.28 (Unix) Content-Length: 7810 Content-Type: text/html Expires: Sun, 20 Jan 2002 00:27:10 GMT Client-Date: Sat, 19 Jan 2002 00:27:17 GMT Client-Request-Num: 1 Client-Warning: LWP HTTP/1.1 support is experimental -- Bill Moseley mailto:[EMAIL PROTECTED]
Re: META tags added as HTTP headers
At 04:46 PM 01/18/02 -0800, ___cliff rayman___ wrote: hmmm - you are still using lwp. Right. But lwp-request sends a GET request where HEAD sends, well, a HEAD request. So, even though LWP's default is to parse the head section, there's no content to parse in a HEAD request, and thus the meta headers don't show up. -- Bill Moseley mailto:[EMAIL PROTECTED]
Re: Alarms?
At 06:56 PM 01/10/02 +0300, [EMAIL PROTECTED] wrote: Hello! I'm getting lots of errors in the log: [Thu Jan 10 18:54:33 2002] [notice] child pid 8532 exit signal Alarm clock (14) I hope I remember this correctly: what's happening is you are setting a SIGALRM handler in perl, but perl is not correctly restoring Apache's handler when yours goes out of scope. So when a non-mod_perl request times out there's no handler and the process is killed. Check: http://thingy.kcilink.com/modperlguide/debug/Debugging_Signal_Handlers_SIG_.html http://thingy.kcilink.com/modperlguide/debug/Handling_Server_Timeout_Cases_an.html -- Bill Moseley mailto:[EMAIL PROTECTED]
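The usual defensive pattern is to localize the handler so Perl puts back whatever was there before (Apache's handler, under mod_perl) when the scope exits — a sketch, with a made-up helper name:

```perl
use strict;
use warnings;

# Run $code with a timeout. local() restores the previous SIGALRM
# handler automatically when the eval block is left.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        my $r = $code->();
        alarm 0;
        $r;
    };
    alarm 0;    # in case the eval died before clearing the alarm
    die $@ if $@ && $@ ne "timeout\n";
    return $result;    # undef if we timed out
}
```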
Re: Template-Toolkit performance tuning
At 05:17 PM 12/30/01 -0600, Ryan Thompson wrote: use Template; my %vars; $vars{foo} = 'bar'; # About 30 scalars like this . . my $tt = Template->new({ INTERPOLATE => 1 }); Cache your template object between requests. -- Bill Moseley mailto:[EMAIL PROTECTED]
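Caching the object means Template->new runs once per child instead of once per request, which also keeps TT's compiled-template cache warm. A sketch (the package name is made up):

```perl
package My::View;
use strict;
use warnings;
use Template;

# Created when the module is loaded; reused for every request
# this child serves.
my $tt = Template->new({ INTERPOLATE => 1 });

sub render {
    my ($template, $vars) = @_;
    my $out = '';
    $tt->process($template, $vars, \$out)
        or die $tt->error;
    return $out;
}

1;
```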
Searchable archives (was: [modperl site design challenge] and the winner is...)
At 02:13 PM 12/24/01 +0800, Stas Bekman wrote: FWIW, we are having what seems to be a very productive discussion at the docs-dev mailing list. Unfortunately no mail archiver seems to pick this list up, so only the mbox files are available: http://perl.apache.org/mail/docs-dev/ Is anyone up to making the searchable archives available? We have a bunch of lists that aren't browsable/searchable :( http://perl.apache.org/#maillists Hi Stas, Any reason not to use hypermail? Do you have mbox files for all the lists in question? I could set up searchable archives like this example, if you like: http://search.apache.org/docs-dev/ (this URL is temporary!) Bill Moseley mailto:[EMAIL PROTECTED]
Re: Can I use mod_perl to pass authentication details to apache from an HTML form?
At 08:49 AM 12/24/2001 -, Chris Thompson wrote: I would like to set up a password-protected area within my website where my web design clients can preview changes to their sites before they go live. Although I know how to password protect directories using .htaccess files, I would prefer to bypass the standard grey Authorization pop-up screen and instead let users enter their username / password details through an HTML form (which I think would look more professional). Take a look at Apache::AuthCookie. If possible, the system for authenticating / authorizing the user would also redirect them to the appropriate directory for their site. You can do that, or you can just have them go directly to their area, and have the authentication system intercept the request. This is what Apache::AuthCookie does. You might also look at Apache::AuthCookieURL if there's a chance that your users might not have cookies enabled. Bill Moseley mailto:[EMAIL PROTECTED]
Re: [modperl site design challenge] and the winner is...
I'm throwing in my two cents a bit late, so it's a bit depreciated now (one cent?). But something to think about for the site. I've worked with php a little lately -- not programming, but making minor changes to a site. I've used the php site http://www.php.net/ a few times, and I've found it reasonably functional, but also quite easy for someone new to php. Maybe it seems that way because I know nothing about php and it's geared toward my level. But that's good. How often do the mod_perl pros need to read the mod_perl home page? I'm sure all these elements will be added to the new mod_perl site in some way, but I just wanted to note what I liked about the php site. And I'm not comparing mod_perl to php! What the php site shows in a real obvious way is: 1) what php is (for someone that is brand new) with a link to some basic examples. It demystifies php in a hurry. Makes someone think "Oh, I can do that." 2) currently, it's showing Netcraft's usage stats, so I see that people are using it in growing numbers -- it's not a dead end for a new person to try out. 3) it shows upcoming events. That shows that there's a real support group of real people to work with. Links to discussion list archives would be good there. All that makes it really easy for someone new to feel comfortable. It would be nice to see license info, too, as someone new might want to be clear on that right away. You can also quickly see a list of supported modules. This shows that it's easy to extend, but also allows someone to see that it can do the thing *they* might be interested in. Sure, perl has CPAN, but I think it would be good to show a list of commonly used modules for mod_perl, and what they do, in a simple list. If someone is just learning about mod_perl (or php) the list doesn't need to be that big, as their needs will be reasonably basic. Existing mod_perl (or php?)
programmers might not like all that basic, first-time user stuff right on the home page, and would rather have a more functional site. I don't know about anyone else, but I've got the links I need bookmarked, and if not I go to perl.apache.org and ^F right to where I want to go. BTW -- At first I liked David's idea of using the ASF look. That ties mod_perl to apache well. But, if the site is intended to bring in new users, it might be good to be a bit more flashy. <crazy idea> Maybe as a community (of programmers, not designers) we could hire a professional designer to help develop our brand. Cool web site. Some print ads in the trades. What's a small amount in dues to the Association of Mod_perl Programmers compared to an increase of mod_perl work overall? </crazy idea> Bill Moseley mailto:[EMAIL PROTECTED]
Re: Comparison of different caching schemes
Ok, I'm a bit slow... At 03:05 PM 12/12/01 +1100, Rob Mueller (fastmail) wrote: Just thought people might be interested... Seems like they were! Thanks again. I didn't see anyone comment about this, but I was a bit surprised by MySQL's good performance. I suppose caching is key. I wonder if things would change with 50 or 100 thousand rows in the table. I always assumed something like Cache::FileCache would have less overhead than an RDBMS. It's impressive. Now to the results, here they are:

Package C0 - In process hash:                          Sets/sec = 147116  Gets/sec = 81597  Mixes/sec = 124120
Package C1 - Storable freeze/thaw:                     Sets/sec = 2665    Gets/sec = 6653   Mixes/sec = 3880
Package C2 - Cache::Mmap:                              Sets/sec = 809     Gets/sec = 3235   Mixes/sec = 1261
Package C3 - Cache::FileCache:                         Sets/sec = 393     Gets/sec = 831    Mixes/sec = 401
Package C4 - DBI with freeze/thaw:                     Sets/sec = 651     Gets/sec = 1648   Mixes/sec = 816
Package C5 - DBI (updates with dup) with freeze/thaw:  Sets/sec = 657     Gets/sec = 1994   Mixes/sec = 944
Package C6 - MLDBM::Sync::SDBM_File:                   Sets/sec = 334     Gets/sec = 1279   Mixes/sec = 524
Package C7 - Cache::SharedMemoryCache:                 Sets/sec = 42      Gets/sec = 29     Mixes/sec = 32

Bill Moseley mailto:[EMAIL PROTECTED]
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
At 08:19 AM 12/06/01 -0800, Paul Lindner wrote: Ok, hit me over the head. Why wouldn't you want to use a caching proxy? BTW -- I think where the docs are cached should be configurable. I don't like the idea of the document root writable by the web process. Bill Moseley mailto:[EMAIL PROTECTED]
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
At 10:33 AM 12/06/01 -0800, Paul Lindner wrote: On Thu, Dec 06, 2001 at 10:04:26AM -0800, Bill Moseley wrote: At 08:19 AM 12/06/01 -0800, Paul Lindner wrote: Ok, hit me over the head. Why wouldn't you want to use a caching proxy? Apache::CacheContent gives you more control over the caching process and keeps the expiration headers from leaking to the browser. Ok, I see. Or maybe you want to dynamically control the TTL? Would you still use it with a front-end lightweight server? Even with caching, a mod_perl server is still used to send the cached file (possibly over a 56K modem), right? Bill Moseley mailto:[EMAIL PROTECTED]
Re: Hi
At 05:13 PM 12/04/01 -0500, Robert Landrum wrote: If this guy is going to be sending us shit all night, I suggest we deactivate his account. Now that would be fun! Oh, you mean by unsubscribing him. I was thinking of something more sporting. What's the collective bandwidth of the people on this list? Just kidding. Bill Moseley mailto:[EMAIL PROTECTED]
Re: [OT] log analyzing programs
At 10:09 AM 12/2/2001 +, Matt Sergeant wrote: PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND 17223 operator 1 442 747M 745M cpu14 19.2H 45.24% wusage Ouch. Try analog. PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND 17223 operator 1 02 747M 745M cpu14 27.1H 47.57% wusage Well, at least after another 8 hours of CPU it's not leaking ;) Bill Moseley mailto:[EMAIL PROTECTED]
[OT] log analyzing programs
Any suggestions for favorite ones? wusage seems to require a lot of resources -- maybe that's not unusual? It runs once a week. Here's about six days' worth of requests. Doesn't seem like that many. % wc -l access_log 1185619 access_log PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND 17223 operator 1 442 747M 745M cpu14 19.2H 45.24% wusage Bill Moseley mailto:[EMAIL PROTECTED]
Re: [OT] Re: search.cpan.org
At 12:55 PM 11/27/01 -0800, Nick Tonkin wrote: Because it does a full text search of all the contents of the DB. Perhaps, but it's just overloaded. I'm sure he's working on it, but anyone want to offer Graham free hosting? A few mirrors would be nice, too. (Plus, all my CPAN.pm setups are now failing to work, too.) Bill Moseley mailto:[EMAIL PROTECTED]
Re: [OT] Re: search.cpan.org
At 09:02 PM 11/27/01 +, Mark Maunder wrote: I'm using it on our site and searching fulltext indexes on three fields (including a large text field) in under 3 seconds on over 70,000 records on a p550 with 490 megs of ram. Hi Mark, plug Some day if you are bored, try indexing with swish-e (the development version). http://swish-e.org The big problem with it right now is it doesn't do incremental indexing. One of the developers is trying to get that working within a few weeks. But for most small sets of files it's not an issue since indexing is so fast. My favorite feature is that it can run an external program, such as a Perl mbox or HTML parser, a Perl spider, a DBI program, or whatever, to get the source to index. Use it with Cache::Cache and mod_perl and it's nice and fast from page to page of results. Here's indexing only 24,000 files: ./swish-e -c u -i /usr/doc Indexing Data Source: File-System Indexing /usr/doc 270279 unique words indexed. 4 properties sorted. 23840 files indexed. 177638538 total bytes. Elapsed time: 00:03:50 CPU time: 00:03:16 Indexing done! 
Here's searching: ./swish-e -w install -m 1 # SWISH format: 2.1-dev-24 # Search words: install # Number of hits: 2202 # Search time: 0.006 seconds # Run time: 0.011 seconds A phrase: ./swish-e -w 'public license' -m 1 # SWISH format: 2.1-dev-24 # Search words: public license # Number of hits: 348 # Search time: 0.007 seconds # Run time: 0.012 seconds 998 /usr/doc/packages/ijb/gpl.html gpl.html 26002 A wild card and boolean search: ./swish-e -w 'sa* or java' -m 1 # SWISH format: 2.1-dev-24 # Search words: sa* or java # Number of hits: 7476 # Search time: 0.082 seconds # Run time: 0.087 seconds Or a good number of results: ./swish-e -w 'is or und or run' -m 1 # SWISH format: 2.1-dev-24 # Search words: is or und or run # Number of hits: 14477 # Search time: 0.084 seconds # Run time: 0.089 seconds Or everything: ./swish-e -w 'not dksksks' -m 1 # SWISH format: 2.1-dev-24 # Search words: not dksksks # Number of hits: 23840 # Search time: 0.069 seconds # Run time: 0.074 seconds This is pushing the limit for little old swish, but here's indexing a few more very small xml files (~150 bytes each) 3830016 files indexed. 582898349 total bytes. Elapsed time: 00:48:22 CPU time: 00:44:01 /plug Bill Moseley mailto:[EMAIL PROTECTED]
Re: [modperl-site design challenge]
At 11:14 AM 11/26/01 -0500, John Saylor wrote: * While the design might not be too cool from the designer's point of view, I like it because it is simple, doesn't use HTML tables, is small and fast (/very/ little HTML overhead) and accessible to disabled people. But that *is* cool. I think it's very well designed. To me, usability is the main design goal. Keep up the good work! Does it need to render well in old browsers? (e.g. Netscape 4.08) There's a lot of old browsers out there, but maybe anyone looking at mod_perl would be a bit more up to date... Bill Moseley mailto:[EMAIL PROTECTED]
Re: Apache::Registry HEAD request also return document body
At 11:43 AM 11/23/2001 +, Jean-Michel Hiver wrote: PROBLEM HERE A HEAD request should * NOT * return the body of the document You should check $r->header_only in your handler. http://thingy.kcilink.com/modperlguide/correct_headers/3_1_HEAD.html Bill Moseley mailto:[EMAIL PROTECTED]
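A minimal sketch of that check in a mod_perl 1.x content handler (the package name is invented, and this only runs inside a live Apache with mod_perl, so it is illustrative only):

```perl
# Hypothetical mod_perl 1.x content handler showing the
# $r->header_only check; package name is an assumption.
package My::HeadAware;
use strict;
use Apache::Constants qw(OK);

sub handler {
    my $r = shift;
    $r->content_type('text/html');
    $r->send_http_header;

    # For a HEAD request, stop after the headers -- no body.
    return OK if $r->header_only;

    $r->print("<html><body>the actual document</body></html>\n");
    return OK;
}
1;
```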
Re: Apache::AuthCookie login faliure reason
At 04:09 PM 11/23/2001 +1100, simran wrote: Hi All, I am having some trouble getting Apache::AuthCookie (version 3, which I believe is the latest version) to do what I want. What I want is: * To be able to give the user a reason if login fails - eg: * No such username * Your password was incorrect Has anyone else come across the same requirement/issue, and how have you solved it? Apache::AuthCookieURL does that. IIRC, it sets a cookie with the failure reason that's returned from the authen_cred call. Bill Moseley mailto:[EMAIL PROTECTED]
Re: Apache::Registry HEAD request also return document body
At 02:53 PM 11/23/01 +, Jean-Michel Hiver wrote: My only concern is that I thought that Apache::Registry was designed to act as a CGI emulator, allowing not so badly written CGIs to have mod_perl benefits without having to change them. Right, sorry I completely missed the Registry part! Try HEAD on this script. #!/usr/local/bin/perl -w use CGI; my $q = CGI->new; print $q->header, $q->start_html, join( "<BR>\n", map { "$_ : $ENV{$_}" } keys %ENV ), $q->end_html; If I have to use the $r object (and therefore the Apache module), then it means that the scripts won't be able to run as standalone CGIs... Am I right? Right, maybe that's a good thing ;) (I actually mix mod_perl code in applications that will run under both.) Bill Moseley mailto:[EMAIL PROTECTED]
Re: Apache::Registry HEAD request also return document body
At 03:21 PM 11/23/01 +, Jean-Michel Hiver wrote: Duh... That's a lot of info for a head request :-) Yes, and that's what I get for using HEAD to test! Yesterday's holiday doesn't help today's thinking. How about patching Apache::Registry? Oh, Stas, of course, just posted a better solution. Maybe I'll have better luck repairing my car today. Bill Moseley mailto:[EMAIL PROTECTED]
[OT] Re: Seeking Legal help
At 03:21 PM 11/21/01 -0800, Medi Montaseri wrote: I did some work (about $25000 worth) for a customer and I'm having problems collecting. This has been beaten to death on the list, but... (and I'm not a lawyer, but I drink beer with one). If you think they are going Chapter 11, then you may want to try to bargain down to some amount to get something, so you are not on their list of creditors. When they do file, if that's the case, they have to notify the court of their creditors and then the court is supposed to notify you. You must then file a proof of claim, and get in line with everyone else. If you think they might fail to list you as a creditor when they file, contact the court every few weeks to check if they have already filed, and file your proof of claim. Then at least you might get a penny on the dollar... $25K is a bad number, in that it's too big for small claims court, and it's too little to get much help from lawyers in a lawsuit, I'd guess. Ask them if they want to pay partially in hardware and you might get a good idea of their direction ;). Good luck, Bill Moseley mailto:[EMAIL PROTECTED]
Re: Cookie authentication
At 02:02 PM 11/15/01 -0600, John Michael wrote: This may seem off subject but, if you bear with me, I don't think it is. I am interested in using the cookie-based system referred to in the Programming the Apache API book, but often wonder this: can you count on everyone to use cookies? Sometime back I wrote a module based on Apache::AuthCookie called Apache::AuthCookieURL that uses cookies, or falls back to munged URLs if cookies were not enabled. It's on CPAN. I wrote it for a site where people come in from public libraries. The requirement was that it had to do sessions even if cookies were disabled (as it was common for the public libraries to have cookies disabled). It's been a while since I looked at it. I had added a way to disable the authen requirement for areas of the site (or everywhere), so it could be used just for dealing with sessions. Do be careful about session hijacking. Bill Moseley mailto:[EMAIL PROTECTED]
Re: Cookie authentication
At 05:20 PM 11/15/01 -0600, John Michael wrote: Thanks. I did not know that you could verify that someone has cookies turned on. Can you point me to where I can find out how to do this? Is there a variable that you can check? You set a cookie and do a redirect (if you need the cookie right away). If it comes back with a cookie then they are enabled. Bill Moseley mailto:[EMAIL PROTECTED]
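The probe-and-redirect trick described above could look roughly like this under mod_perl 1.x (the module name, cookie name, and query flag are all invented; this is an untested sketch that needs a running server):

```perl
# Hypothetical cookie-probe handler: set a test cookie, redirect
# back to ourselves, and see whether the cookie comes back.
package My::CookieProbe;
use strict;
use Apache::Constants qw(OK REDIRECT);

sub handler {
    my $r = shift;
    my $cookies = $r->header_in('Cookie') || '';

    if ($cookies =~ /\bprobe=1\b/) {
        # The cookie came back: cookies are enabled.
    }
    elsif (($r->args || '') =~ /\bprobed=1\b/) {
        # We already redirected and got no cookie back:
        # cookies are disabled in this client.
    }
    else {
        # First visit: set the cookie and bounce back here.
        $r->header_out('Set-Cookie' => 'probe=1; path=/');
        $r->header_out('Location'   => $r->uri . '?probed=1');
        return REDIRECT;
    }
    return OK;
}
1;
```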
Re: how to install the XML::LibXSLT along with libxslt?
At 08:03 PM 11/14/01 -0800, SubbaReddy M wrote: Maybe a question for the libxml2 list instead of mod_perl? So, while installing libxslt-1.0.6 I am getting an error at the end, that is " checking for libxml libraries >= 2.4.7... ./configure: xml2-config: command not found " Did you make install libxml2? > which xml2-config /usr/local/bin/xml2-config Bill Moseley mailto:[EMAIL PROTECTED]
[OT] Data store options
Hi, verbose I'm looking for a little discussion on selecting a data storage method, and I'm posting here because Cache::Cache often is discussed here (along with Apache::Session). And people here are smart, of course ;). Basically, I'm trying to understand when to use Cache::Cache, vs. Berkeley DB, and locking issues. (Perrin, I've been curious why at etoys you used Berkeley DB over other caching options, such as Cache::Cache). I think RDBMS is not required as I'm only reading/writing and not doing any kind of selects on the data -- also I could end up doing thousands of selects for a request. So far, performance has been good with the file system store. My specifics are that I have a need to permanently store tens of thousands of smallish (5K) items. I'm currently using a simple file system store, one file per record, all in the same directory. Clearly, I need to move into a directory tree for better performance as the number of files increases. The data is accessed in a few ways: 1) Read/write a single record 2) Read anywhere from a few to thousands of records in a request. This is the typical mod_perl-based request. I know the record IDs that I need to read from another source. I basically need a way to get some subset of records fast, by record ID. 3) Traverse the data store and read every record. I don't need features to automatically expire the records. They are permanent. When reading (item 2) I have to create a perl data structure from the data, which doesn't change. So, I want to store this in my record, using Storable.pm. That can work with any data store, of course. It's not a complicated design. My choices are something like: 1) use Storable and write the files out myself. 2) use Cache::FileCache and have the work done (but can I traverse?) 3) use Berkeley DB (I understand the issues discussed in The Guide) So, what kind of questions and answers would help be weigh the options? 
With regard to locking, IIRC, Cache::Cache doesn't lock, rather writes go to a temp file, then there's an atomic rename. Last in wins. If updates to a record are not based on previous content (such as a counter file) is there any reason this is not a perfectly good method -- as opposed to flock? Again, I'm really looking more for discussion, not an answer to my specific needs. What issues would you use when selecting a data store method, and why? /verbose Thanks very much, Bill Moseley mailto:[EMAIL PROTECTED]
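Option 1 above (Storable plus hand-rolled files) can be sketched in a few lines, including the write-temp-then-rename, last-in-wins behavior just described. The store location and record layout are invented for illustration:

```perl
# One-file-per-record store using Storable; paths and the
# record shape are assumptions for the demo.
use strict;
use File::Temp ();
use Storable qw(nstore retrieve);

my $root = File::Temp::tempdir(CLEANUP => 1);  # demo store location

sub record_path { my $id = shift; return "$root/$id.sto" }

sub write_record {
    my ($id, $data) = @_;
    # Write to a temp name, then rename: an atomic replace,
    # so readers never see a half-written record.
    my $tmp = record_path($id) . ".$$";
    nstore($data, $tmp);
    rename $tmp, record_path($id) or die "rename: $!";
}

sub read_record { retrieve(record_path(shift)) }

write_record(42, { name => 'example', tags => [ 'a', 'b' ] });
my $rec = read_record(42);
print "name=$rec->{name} tags=@{$rec->{tags}}\n";
```

Traversal (item 3) then falls out of a plain readdir over the store directory.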
Re: Cache::* and MD5 collisions [was: [OT] Data store options]
At 10:54 AM 11/08/01 -0800, Andrew Ho wrote: For example, say your keys are e-mail addresses and you just want to use an MD5 hash to spread your data files over directories so that no one directory has too many files in it. Say your original key is [EMAIL PROTECTED] (hex encoded MD5 hash of this is RfbmPiuRLyPGGt3oHBagt). Instead of just storing the key in the file R/Rf/Rfb/Rfbm/RfbmPiuRLyPGGt3oHBagt.dat, store the key in the file [EMAIL PROTECTED] Presto... collisions are impossible. That has the nice side effect that I can run through the directory tree and get the key for every file. I do need a way to read every key in the store. Order is not important. Bill Moseley mailto:[EMAIL PROTECTED]
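A sketch of that layout with Digest::MD5: the hash only picks the directories, while the (reversibly escaped) key itself is the filename, so walking the tree recovers every key. The escaping scheme and nesting depth here are assumptions:

```perl
# Spread keys across directories by MD5 prefix, but keep the
# escaped key as the filename so keys stay recoverable.
use strict;
use Digest::MD5 qw(md5_hex);

sub key_to_path {
    my $key  = shift;
    my $hash = md5_hex($key);

    # Percent-escape anything unsafe in a filename, reversibly.
    (my $safe = $key) =~ s/([^A-Za-z0-9._-])/sprintf('%%%02x', ord $1)/ge;

    return join '/', substr($hash, 0, 1), substr($hash, 0, 2), "$safe.dat";
}

print key_to_path('user@example.com'), "\n";
# Something like "3/3f/user%40example.com.dat": directories come
# from the hash, the filename preserves the key -- so two keys can
# never collide on the same file.
```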
Re: [OT] search engine module?
At 02:04 PM 10/16/2001 +0100, Ged Haywood wrote: Plus lots of other stuff like Glimpse and Swish which interface to C-based engines. I've had good luck with http://swish-e.org/2.2/ Please make sure that it's possible to do a plain ordinary literal text string search. Nothing fancy, no case-folding, no automatic removal of punctuation, nothing like that. Just a literal string. Last night I tried to find perl -V on all the search engines mentioned on the mod_perl home page and they all failed in various interesting ways. I assume it's how the search engine is configured. In Swish, for example, you can define what chars make up a word. Not sure what you mean by literal string. For performance reasons you can't just grep words (or parts of words), so you have to extract out words from the text during indexing. You might define that a dash is ok at the start of a word, but not at the end, and to ignore trailing dots, so you could find -V and -V. (at the end of a sentence). Some search engines let you define a set of buzzwords that should be indexed as-is, but that's more helpful for technical writing instead of indexing code. Finally, in swish, if you put something like perl -V in quotes to use a phrase search it will find what you are looking for most likely, even if the dash is not indexed. Bill Moseley mailto:[EMAIL PROTECTED]
Re: Mod_perl component based architecture
I've been looking at OpenInteract, too. I've got a project where about 100 people need to edit records in a database via a web-based interface. And I'd like history tracking of changes (something like CVS provides, where it's easy to see diffs and to back out changes). And I need access control for the 100 people, along with per-user tracking of how many changes they make, email notification of changes, administrative and super-user type of user levels, and blah, blah, blah, and so on. Normal stuff. I'm just bored with HTML forms. Seems like I do this kind of project too often -- read a record, post, validate, update... Even with good templating and code reuse between projects I still feel like I spend a lot of time re-inventing the (my) wheel. Will an application framework bring me bliss? I'm sure this is a common type of project for many people. What solutions have you found to make this easy and portable from project to project? Bill Moseley mailto:[EMAIL PROTECTED]
Re: [request] modperl mailing lists searchable archives wanted
Hi Stas, I just updated the search site for Apache.org with a newer version of swish. The context highlighting is a bit silly, but that can be fixed. I'm only caching the first 15K of text from each page for context highlighting. http://search.apache.org It seems reasonably fast (it's not running under mod_perl currently, but could -- if mod_perl was in that server ;). It takes about eight or nine minutes to reindex ~35,000 docs on *.apache.org so the mod_perl list (and others) shouldn't be too much trouble, I'd think, with smaller numbers and smaller content. It doesn't do incremental indexing at this point, which is a drawback, but indexing is so fast it normally doesn't matter (and there's an easy work-around for something like a mailing list to pick up new messages as they come in during the day). Swish-e can also call a perl program which feeds docs to swish. That makes it easy to parse the email into fields for something like: http://swish-e.org/Discussion/search/swish.cgi which looks a lot like the Apache search site... But, what would be needed is a good threaded mail archiver, of which there are many to pick from, I'd expect. Some archives are browsable, but their search engines simply suck. e.g. marc.theaimsgroup.com, I think, is the only one that archives [EMAIL PROTECTED], but if you try to search for a Perl string like APR::Table::FETCH it won't find anything. If you search for get_dir_config it will split it into 'get', 'dir', 'config' and give you a zillion matches when you know that there are just a few. In swish you could say : and _ are part of words and those would index as full words. Or, just simply search for phrase: get_dir_config and it would search for the phrase get dir config which would probably find what you want. Maybe : and _ are ok in words, but you have to think carefully about others. It's more flexible to split the words and use phrases in many cases. Bill Moseley mailto:[EMAIL PROTECTED]
Re: FW: FW: AuthCookie Woes!
At 07:59 AM 9/4/2001 -0400, Chris Lavin wrote: I'm sorry, I thought that everyone would be familiar with Apache::AuthCookie. Perhaps if you posted a tiny httpd.conf and the URL of where it's running. And I'd tend to use telnet for debugging, and write log messages in Apache::AuthCookie where it's setting the header, and so on. Bill Moseley mailto:[EMAIL PROTECTED]
Random requests in log file
Hi, We always see the normal probes for known insecure CGI scripts, and spiders keep our logs full. But lately there have been a huge number of requests for resources that are not on our server (even not counting Code Red II). It looks like someone is spidering another server, yet sending requests to our machine -- the requests don't really look like probes for insecure scripts, rather just for files that are not and never have been on this server (or any related virtual hosts). Does everyone else see these? What's the deal? Are they really probes or some spider run amok? Right now someone is looking for things like: /r/dr /r/g3 /r/sb /r/sw /r/s/2 /r/a/booth /r/s/pp /NowPlaying /mymovies/list /terms /ootw/1999/oarch99_index.html I currently have a killfile of IP addresses and a PerlInitHandler that blocks requests, but it would be nice to automate that process. Are there any current modules that do this? Another thing I find odd: this server has three virtual hosts. In the second and third VH's logs I find requests for files found on the first, default, VH. I've logged the Host: header and indeed it was there. Odd. Bill Moseley mailto:[EMAIL PROTECTED]
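The killfile idea could be automated with a small PerlInitHandler along these lines. The package name, killfile path, and one-IP-per-line format are all invented, and the sketch is untested (it needs a running mod_perl 1.x server):

```perl
# Hypothetical PerlInitHandler that rejects blacklisted IPs
# early, before any content handler runs.
package My::BlockIP;
use strict;
use Apache::Constants qw(OK FORBIDDEN);

my %blocked;   # ip => 1, loaded once per child

sub load_killfile {
    my $file = shift;              # assumed: one IP per line
    open my $fh, '<', $file or return;
    while (my $ip = <$fh>) {
        chomp $ip;
        $blocked{$ip} = 1 if $ip;
    }
    close $fh;
}

sub handler {
    my $r = shift;
    load_killfile('/etc/httpd/killfile') unless %blocked;
    return FORBIDDEN if $blocked{ $r->connection->remote_ip };
    return OK;
}
1;
```

Enabled with something like `PerlInitHandler My::BlockIP` in httpd.conf; a production version would also re-read the file when it changes.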
Re: Random requests in log file
At 10:24 AM 08/07/01 -0700, Nick Tonkin wrote: /r/dr /r/g3 /r/sb www.yahoo.com/r/dr www.yahoo.com/r/sw Yes, and I have seen plenty of cases where broken web servers or web sites or web browsers screw up HREFs, by prepending an incorrect root uri to a relative link. That would be my guess, broken URLs somewhere out in space. But why the continued hits for the wrong pages? It's like someone spidered an entire site, and then has gone back and is now testing all those HREFs against our server. Currently mod_perl is generating a 404 page. When I block I return FORBIDDEN, but that doesn't seem to stop the requests either. They don't seem to get the message... And isn't it correct that if they request again before CLOSE_WAIT is up I'll need to spawn more servers? If they are not sending requests in parallel I wonder if it would be easier on my resources to really slow down responses as long as I don't tie up too many of my processes. If they ignore FORBIDDEN maybe they will see the timeouts. Time to look at the Throttle modules, I suppose. Bill Moseley mailto:[EMAIL PROTECTED]
Backing out a mod_perl install
I'm upgrading mod_perl on a Solaris 2.6 production machine. Although a little downtime on this machine won't be a big issue, I'm wondering about backup plans. I've got mod_perl ready for make install (I'm currently using a PERL5LIB environment to test mod_perl on a high port from the blib). So I was just going to bring down the server, make install, and then startup the new server. But, I'd like to be able to back out, just in case. I was thinking about tar'ing up the Apache name space, and Apache.pm to backout the Perl modules so I could run the old httpd, if needed. Is that a reasonable thing to do, and if so, is there anything else you would suggest? Thanks, Bill Moseley mailto:[EMAIL PROTECTED]
RE: Backing out a mod_perl install
At 03:21 PM 08/06/01 -0400, Geoffrey Young wrote: to backout the Perl modules so I could run the old httpd, if needed. you can try the tar_Apache and offsite_tar arguments to make and see if they wrap up everything you need... Ok, tar_Apache should include all that I need, thanks. I don't see the need for offsite_tar in my case, since I already have mod_perl built and ready for install on the target machine. No need to run make install on the httpd side, right? I can just copy the httpd binary (I'll be using the same ServerRoot as the existing 1.3.12 server), so I'm assuming all my icons and mime.types files from 1.3.12 will be just fine. Bill Moseley mailto:[EMAIL PROTECTED]
PERL5LIB and <Perl> sections
In a previous post today I mentioned how I was running mod_perl from the build directory by setting a PERL5LIB. I seem to need to add: <Perl> </Perl> at the top of httpd.conf. Otherwise I get: Apache.pm version 1.27 required! /usr/local/lib/perl5/site_perl/5.005/sun4-solaris/Apache.pm is version 1.26 I use <Perl> sections farther down in httpd.conf, but I seem to need it at the very top. If a PerlTaintCheck On comes before the <Perl></Perl> then I get that error. Why is that? Bill Moseley mailto:[EMAIL PROTECTED]
modules/status make test fail
I can come back with more, if needed, but just in case someone else has seen this: Fresh 1.26 and 1.3.20 Sun Solaris 2.6 Perl 5.005_03 I just did this on Linux and it worked just fine :( This doesn't bother me too much: modules/request...Use of uninitialized value at modules/request.t line 147. Use of uninitialized value at modules/request.t line 147. Use of uninitialized value at modules/request.t line 149. Use of uninitialized value at modules/request.t line 147. Use of uninitialized value at modules/request.t line 147. Use of uninitialized value at modules/request.t line 149. Use of uninitialized value at modules/request.t line 147. Use of uninitialized value at modules/request.t line 147. Use of uninitialized value at modules/request.t line 149. Use of uninitialized value at modules/request.t line 147. Use of uninitialized value at modules/request.t line 147. Use of uninitialized value at modules/request.t line 149. skipping test on this platform modules/src.ok modules/ssi.ok modules/stage...skipping test on this platform But: modules/status..Internal Server Error dubious Test returned status 9 (wstat 2304, 0x900) DIED. FAILED tests 8-10 Failed 3/10 tests, 70.00% okay In error_log: [Fri Aug 3 16:27:16 2001] [error] Can't locate object method "inh_tree" via package "Devel::Symdump" at /data/_g/lii/apache/1.26/mod_perl-1.26/blib/lib/Apache/Status.pm line 222, fh1b chunk 1. Do I need an updated Devel::Symdump? Bill Moseley mailto:[EMAIL PROTECTED]
Re: OT: Re: ApacheCon Dublin Cancelled?
At 10:46 AM 07/16/01 -0600, Nathan Torkington wrote: Are there any requests other than price for next year? What would you like to see? What could you do without? Well, this is more along the price issue that you don't want to hear about, but I much prefer a single fee for everything instead of separate tutorial and conference fees. I understand the scheduling hell, but I like the flexibility to decide what to attend during the conference. What I attend in the morning may influence what I attend in the afternoon. And these days more and more people may find themselves like me, paying their own way. I'm very disappointed that I had to cancel after adding everything up. Bill Moseley mailto:[EMAIL PROTECTED]
Re: Lightweight CGI.pm for PerlHandlers
At 11:28 PM 05/18/01 -0400, Neil Conway wrote: Is there a simple (fast, light) module that generates HTML in a similar style to CGI.pm, for use with mod_perl? Well, not in a style similar to CGI.pm, but how about a here doc -- if it's that simple. At the moment, I'd rather not move to a system like HTML::Mason or Template Toolkit -- the HTML I'm producing is very simple (in fact, I've just been using $r->print up to now, and it's not _too_ bad). I bounce between HTML::Template, Template-Toolkit, and here docs. Frankly, mostly I use HTML::Template for even really simple things (I use a here doc for the template). This kind of thing is personal style, but I think the templates make the HTML stuff easier (and hidden away). One of my New Year's resolutions was to do everything in Template-Toolkit so I'd learn it better. (But I'm also not drinking less beer, windsurfing more, working less, or having more, eh, sleep, either.) Bill Moseley mailto:[EMAIL PROTECTED]
mod_parrot
I assume everyone saw this... ;) http://www.oreilly.com/parrot/ Bill Moseley mailto:[EMAIL PROTECTED]
Re: cgi_to_mod_perl manpage suggestion
At 03:34 PM 03/14/01 +0200, Issac Goldstand wrote: On Tue, 13 Mar 2001, Andrew Ho wrote: Um, you're getting me confused now, but PerlSendHeader On means that mod_perl WILL send headers. I still think that the above line is confusing: It is because mod_perl is not sending headers by itself, but rather your script must provide the headers (to be returned by mod_perl). However, when you just say "mod_perl will send headers" it is misleading; it seems to indicate that mod_perl will send "Content-Type: text/html\r\n\r\n" all by itself, and that conversely, to disable that, PerlSendHeader should be Off. Read it again. You are confusing "some" headers with "all" headers -- there's more than just Content-Type:. To me it doesn't sound at all like it will send Content-Type: PerlSendHeader On Now the response line and common headers will be sent as they are by mod_cgi. (response line and common headers != content type) And, just as with mod_cgi, PerlSendHeader will not send a terminating newline; your script must send that itself, e.g.: print "Content-type: text/html\n\n"; All documentation has room for improvement, of course. It's confusing if you haven't written a mod_perl content handler before (or haven't read perldoc Apache or the Eagle book) and don't know that you need $r->send_http_header under a normal content handler. And if you are like me, you have to read the docs a few times before it all makes sense. Also, note that Apache::Registry lets reasonably well written CGI scripts run under both mod_cgi and Apache::Registry, which is what that man page is describing. It's not a CGI script if there's no Content-Type: header sent. And the docs are not implying that you can turn on PerlSendHeader and then go through all your CGI scripts and remove the print "Content-type: text/html\n\n" lines. Bill Moseley mailto:[EMAIL PROTECTED]
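For contrast with Registry scripts, here is a bare-bones mod_perl 1.x content handler where the code calls $r->send_http_header itself, with no PerlSendHeader parsing involved (the package name is invented; needs a live server to run):

```perl
# Minimal mod_perl 1.x content handler: no CGI-style header
# parsing here -- the handler sets and sends headers directly.
package My::Hello;
use strict;
use Apache::Constants qw(OK);

sub handler {
    my $r = shift;
    $r->content_type('text/html');
    $r->send_http_header;       # the step a mod_cgi script never sees
    $r->print("<html><body>hello</body></html>\n");
    return OK;
}
1;
```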
Re: [ANNOUNCE] Cache-Cache-0.03
At 01:03 PM 03/10/01 -0500, DeWitt Clinton wrote: Summary: Perl Cache is the successor to the popular File::Cache and IPC::Cache perl libraries. This project unifies those modules under the generic Cache::Cache interface and implements Cache::FileCache, Cache::MemoryCache, Cache::SharedMemoryCache, and Cache::SizeAwareFileCache. When you say successor to File::Cache, does that mean File::Cache will not be maintained as a separate module anymore? Have you thought about making SharedMemoryCache flush to disk if it becomes full but before it's time to expire the data? Bill Moseley mailto:[EMAIL PROTECTED]
ApacheCon Early-Bird registration
Hi, Well, being on the West Coast I failed to realize that "ApacheCon Early-Bird registration ends TO-DAY!" was sometime in the afternoon before I got back on-line to find that announcement... Anyway, are there any other cheap^H^H^H^H^H poor contractors that would like to form a group and go for the group rate? Is there a BOF schedule yet? Bill Moseley mailto:[EMAIL PROTECTED]
Re: Authentication handlers
At 12:58 PM 03/03/01 +0530, Kiran Kumar.M wrote: hi, i'm using a mod_perl authentication handler, where the user's credentials are checked against a database, and in the database i have a flag which tells the login status (y|n). but after the user logs out the status is changed to n. my problem is that after logging out, if the user goes one page back and submits, the browser sends the username and password again, and the status is changed to y. Is there any means of removing the username and password from the browser's cache? I guess I don't understand your setup. If you have a database entry that says they are logged out, why don't you see this when they send their request and return a "Sorry, logged out" page? I wouldn't count on doing anything on the client side. Bill Moseley mailto:[EMAIL PROTECTED]
Re: Authentication handlers
At 10:11 AM 03/03/01 -0500, Pierre Phaneuf wrote: The problem here is that the first basic authentication is not any different from the next ones, so if he marks the user as logged out, going to a page requiring authentication will simply mark the user as logged in. That's what I was assuming. Basic authentication is annoying. They forgot to put in a way to revoke the thing when they designed it. Eh, that's life... That's the real point. Sometimes you have to weigh the use of an always-on feature like basic authentication vs. maybe-on cookies. If you really must use basic authentication then besides the AUTH_REQUIRED trick, sometimes you can get clients to forget by sending them to a new URL with an embedded username and password that logs into the same AuthName but with a different username/password combination. But, you CAN'T count on anything working unless you know all your clients -- if even then. If your problem is that some clients don't use cookies, then perhaps Apache::AuthCookieURL might help. Bill Moseley mailto:[EMAIL PROTECTED]
Re: Stop button (was: Re: General Question)
At 02:02 PM 02/26/01 +, Steve Hay wrote: I have a script which I wish to run under either mod_perl or CGI which does little more than display content and I would like it to stop when the user presses Stop, but I can't get it working. You need to do different things under mod_perl and mod_cgi. Refer to the Guide for running under mod_perl -- you probably should check explicitly for an aborted connection as the guide shows. [This is all from my memory, so I hope I have the details correct] Under mod_cgi Apache will receive the SIGPIPE when it tries to print to the socket. Since your CGI script is running as a subprocess (that has been marked "kill_after_timeout", I believe), apache will first close the pipe from your CGI program, send it a SIGTERM, wait three seconds, then send a SIGKILL, and then reap. This all happens in alloc.c, IIRC. This is basically the same thing that happens when you have a timeout. So, you can catch SIGTERM and then have three seconds to clean up. You won't see a SIGPIPE unless you try to print in that three second gap. Does it do the same thing under NT? Bill Moseley mailto:[EMAIL PROTECTED]
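That cleanup window can be used from Perl by installing a SIGTERM handler before the SIGKILL lands. A standalone sketch of the mechanism (the flag is just for demonstration; a real CGI script would close DBM files, flush logs, and exit in the handler):

```perl
# Simulate the SIGTERM-then-SIGKILL sequence described above:
# install a TERM handler that does cleanup, then deliver TERM
# to ourselves to stand in for Apache terminating the child.
use strict;

my $cleaned_up = 0;
$SIG{TERM} = sub {
    # In a real CGI script: close DBM files, flush logs, etc.
    $cleaned_up = 1;
};

kill 'TERM', $$;                   # stand-in for Apache's SIGTERM
select undef, undef, undef, 0.1;   # let the deferred signal fire
print "cleaned up: $cleaned_up\n";
```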
Stop button (was: Re: General Question)
At 08:43 AM 02/12/01 +0800, Stas Bekman wrote: What happens to the 54 earlier processes, since I submitted the request 55 times? How do Apache and mod_perl handle the processes to nowhere? They get aborted the first moment they try to send some output (or read input if they didn't finish yet) and after that get aborted as they realize that the connection to the socket is dead. See: http://perl.apache.org/guide/debug.html#Handling_the_User_pressed_Stop_ I thought one has to explicitly check for the aborted connection in >= 1.3.5 -- like you explain in the section in the Guide following the one you cited. Isn't $r->print a no-op after an aborted connection? Which gives me a chance to ask an off-topic question about this very topic: As this arrived in my inbox I was debugging a pair of old CGI scripts. Both are very similar and use common modules for most everything -- one is the public script and the other is the admin script for maintenance and data entry. Both CGI. Both scripts use a common module for logging and DBM file opening and closing. Really, the only difference is the form processing for the two scripts. The DBM close is in an END block, and the END block also writes a "Transaction Done" message to STDERR (STDERR is dup'ed to a log file I use for locking the DBM files at the start of every request -- very low-traffic script). Hitting stop on the public script has no effect -- just like I'd expect for 1.3.12 -- it keeps running even when generating long-ish output. But hitting stop on the admin script aborts every time. I have $SIG{PIPE} = sub { print STDERR "$$ caught sigpipe"; exit } in the common module and it prints the error when I hit stop on the admin script. Nothing is noted in the apache logs about broken pipes. I'm scratching my head at this point. Any ideas what to look for? Bill Moseley mailto:[EMAIL PROTECTED]
Re: Stop button (was: Re: General Question)
I don't know why I have to learn this fresh again each time -- it appears I'm confusing mod_perl and mod_cgi. Let's see if I have this right. Under mod_perl and apache >= 1.3.5, if the client drops the connection Apache will ignore it (well, it might print an info message to the log file about "broken pipe"). This means a running mod_perl script will continue to run to completion, but the $r->print calls go nowhere. The old Apache behavior of killing your running script can be restored using Apache::SIG -- which is something you would not want to use if you were doing anything besides displaying content, I'd think. $r->connection->aborted can be used to detect the aborted connection (as Stas shows in the Guide). That sounds like a better way to deal with broken connections. Does all that sound right? Are there still issues with doing this? local $SIG{PIPE} = sub { $aborted++ }; Then mod_cgi I'm still unclear on. The CGI application does receive the SIGPIPE... well, it did a half hour ago before I rebooted my machine. Now I can't seem to catch it. But printing again after the SIGPIPE will kill the CGI script. Bill Moseley mailto:[EMAIL PROTECTED]
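A loop that bails out via $r->connection->aborted might look like this under mod_perl 1.x (package name invented; untested sketch that needs a live server):

```perl
# Hypothetical long-output handler that stops doing work once
# the client has hit Stop; relies on $r->connection->aborted.
package My::LongOutput;
use strict;
use Apache::Constants qw(OK);

sub handler {
    my $r = shift;
    $r->content_type('text/plain');
    $r->send_http_header;

    for my $chunk (1 .. 1_000) {
        $r->print("chunk $chunk\n");
        # After a print to a dead connection, aborted() is true,
        # so we stop rather than compute output nobody will see.
        last if $r->connection->aborted;
    }
    return OK;
}
1;
```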
[OT] Freeing memory of C extensions under Solaris
Hi, Sorry for the OT, and I'm sure this is common knowledge. I'm using some C extensions in my mod_perl application. IIRC, memory used by perl is never given back to the system. Does this apply also to malloc() and free()ed memory in my C extension? Or is that OS dependent? Can Apache ever shrink? I ask because I'm using some C extensions that do a lot of memory allocating and freeing of somewhat large structures. I'm running under Linux and Solaris 2.6. Another memory issue: I'm using File::Cache with a cron job that runs every ten minutes to limit the amount of space used in /tmp/File::Cache -- it seemed better than using an Apache child to do that clean up work. But on Solaris /tmp is carved out of swap. Do you think it's risky to use /tmp this way (since full /tmp could use up all swap)? Thanks very much, Bill Moseley mailto:[EMAIL PROTECTED]
Re: File::Cache problem
At 11:56 AM 02/07/01 +0400, BeerBong wrote: And when cache size is exceeded all mod_perl processes are hanging. I had this happen to me a few days back on a test server. I thought I'd made a mistake by doing a rm -rf /tmp/File::Cache while the server was running (and while the File::Cache object was persistent). And another question (I'm using Linux Debian potato)... Is there a way to define params of the currently performing request? You could use a USR2 handler in your code, if that's what you are asking. This is what my spinning httpd said the other day:

    [DIE] USR2 at /data/_g/lii/perl_lib/lib/5.00503/File/Spec/Unix.pm line 57
        File::Spec::Unix::catdir('File::Spec', '') called at /data/_g/lii/perl_lib/lib/5.00503/File/Spec/Functions.pm line 41
        File::Spec::Functions::__ANON__('') called at /data/_g/lii/perl_lib/lib/site_perl/5.005/File/Cache.pm line 862
        File::Cache::_GET_PARENT_DIRECTORY('/') called at /data/_g/lii/perl_lib/lib/site_perl/5.005/File/Cache.pm line 962

But I haven't seen it happen since then. Bill Moseley mailto:[EMAIL PROTECTED]
Upgrading mod_perl on production machine (again)
This is a revisit of a question from last September, where I asked about upgrading mod_perl and Perl on a busy machine. IIRC, Greg, Stas, and Perrin offered suggestions such as installing from RPMs or tarballs, and using symlinks. The RPM/tarball option worries me a bit, since if I do forget a file I'll be down for a while, plus I don't have another machine of the same type where I can create the tarball. Symlinking works great for moving my test application into live action, but it seems trickier to do this with the entire Perl tree. Here's the problem: this client only has this one machine, yet I need to set up a test copy of the application on the same machine, running on a different port, for the client and myself to test. And I'd like to know that when the test code gets moved live, the exact same code is running (modules and all). What to do in this situation? a) not worry about it, just make install mod_perl, restart the server, and hope all goes well? b) cp -rp /usr/local/lib/perl5 and use symlinks to move between the two? When ready to move, kill httpd, change the symlinks for the perl binary, perl lib, and httpd, and restart? c) set up a new set of perl, httpd, and my application, and when ready to go live just change the port number? Or simply put - how would you do this: with one machine, upgrade perl to 5.6.0, upgrade the application code, install a new version of mod_perl, and allow testing of the new setup for a few weeks, yet require only a few seconds of downtime to switch live (and back again if needed)? Then I wonder which CPAN module I'll forget to install... Bill Moseley mailto:[EMAIL PROTECTED]
Re: location not working
At 09:59 AM 01/10/01 +0100, [EMAIL PROTECTED] wrote: NEVERTHELESS, I get 404 when I enter http://myserver//hello/world and it is looking in the htdocs directory according to the error_log. Can you please post the entire error_log message. Bill Moseley mailto:[EMAIL PROTECTED]
Caching search results
I've got a mod_perl application that's using swish-e. A query from swish may return hundreds of results, but I only display them 20 at a time. There's currently no session control on this application, so when the client asks for the next page (or to jump to page number 12, for example), I have to run the original query again and then extract just the results for the page the client wants to see. Seems like some basic design problems there. Anyway, I'd like to avoid the repeated queries in mod_perl, of course. So, in the short term, I was thinking about caching search results (which are just a sorted list of file names) using a simple file-system db -- that is, (carefully) building file names out of the queries and writing the results to some directory tree. Then I'd use cron to purge LRU files every so often. I think this approach will work fine, instead of a dbm or rdbms approach. So I'm asking for some advice: - Is there a better way to do this? - There was some discussion in the past about performance and how many files to put in each directory. Are there some commonly accepted numbers for this? - For file names, does it make sense to use an MD5 hash of the query string? It would be nice to get an even distribution of files in each directory. - Can someone offer any help with the locking issues? I was hoping to avoid shared locking during reading -- but maybe I'm worrying too much about the time it takes to ask for a shared lock when reading. I could wait a second for the shared lock and if I don't get it I'll run the query again. But it seems like if one process creates the file and begins to write without LOCK_EX and then gets blocked, other processes might not see the entire file when reading. Would it be better to avoid the locks and instead use a temp file when creating and then do an (atomic?) rename? Thanks very much, Bill Moseley mailto:[EMAIL PROTECTED]
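For the last question above, the temp-file-plus-rename idea can be sketched roughly like this (my own sketch, not tied to swish-e; the MD5 naming and the two-character subdirectory split are assumptions made to spread files evenly):

    use Digest::MD5 'md5_hex';
    use File::Path 'mkpath';
    use File::Temp 'tempfile';

    # Map a query string to a cache path, spreading files over
    # subdirectories keyed by the first characters of the digest.
    sub cache_path {
        my ( $root, $query ) = @_;
        my $key = md5_hex( $query );
        return join '/', $root, substr( $key, 0, 2 ), $key;
    }

    # Write results to a temp file in the same directory, then rename().
    # rename() is atomic on the same filesystem, so readers see either
    # the old file, the new file, or no file -- never a partial one.
    sub cache_store {
        my ( $path, $results ) = @_;
        ( my $dir = $path ) =~ s{/[^/]+$}{};
        mkpath( $dir ) unless -d $dir;
        my ( $fh, $tmp ) = tempfile( DIR => $dir );
        print $fh map "$_\n", @$results;
        close $fh or die "close: $!";
        rename $tmp, $path or die "rename: $!";
    }

Readers then just open and read the path with no shared lock, since a half-written file is never visible under its final name.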
Re: searchable site
At 07:08 PM 01/01/01 -0800, Paul J. Lucas wrote: SWISH++ can run as a multi-threaded daemon that listens on either a Unix-domain or TCP socket, hence also without forking. Which I would guess seems like a better use of resources than placing the SWISH-E code in each httpd child. I was going to ask you why or what makes it "faster" and if that applies to SWISH-E 2.x, but that's a bit too off topic. Maybe in separate email. BTW: http://homepage.mac.com/pauljlucas/software/swish/man/ seems broken. Bill Moseley mailto:[EMAIL PROTECTED]
[OT] Anyone good with IPC?
Sorry for the way OT post, but this list seems to have the smartest, most experienced, most friendly perl programmers around -- and this question on other perl lists failed to get any bites. Would someone be willing to offer a bit of help off list? I'm trying to get two programs talking in an HTTP-like protocol through a unix pipe. I'm first trying to get it to work between two perl programs (below), but in the end, the "client" will be a C program (and that's a different nut to crack). The goal is to add a "filter" feature to the C program, where you register some external program (called a server, in this example, since it will be answering requests) and the C program starts the server, and then feeds requests over and over leaving the server in memory. A simple filter might be something that converts to lower case, or converts text dates to a timestamp. The C program (client) sends headers and some content, and the filter (server) returns headers and some content. But it's a "Keep Alive" connection, so another request can be sent without closing the pipe. This approach seems simple -- at least for someone writing the filter program. Just read and print (non-buffered). It's probably not very portable -- I'd expect to fail on Windows. (Are there better methods?) Anyway, this is the sample code I was trying, but was not getting anywhere. Seems like IO::Select::can_read() returns true and then I can read back the first header, but then can_read() never returns true again. I really need to be able to read and parse the headers, then read Content-Length: bytes since the content can be of varying length. 
cat client.pl

    #!/usr/local/bin/perl -w
    use strict;
    use IPC::Open2;
    use IO::Select;
    use IO::Handle;

    my ( $rh, $wh );
    my $pid = open2( $rh, $wh, './server.pl' );
    $pid || die "Failed to open";

    my $read = IO::Select->new( $rh );
    $rh->autoflush;
    $wh->autoflush;

    for (1..2) {
        print "\n$0: Sending Headers: $_\n";
        print $wh "Header-number: $_\n",
                  "Content-type: perl/test\n",
                  "Header: test\n\n";

        # Now read the response
        while ( 1 ) {
            my $fh;
            if ( ($fh) = $read->can_read(0) ) {
                print "Can read!\n";
                my $buffer = <$rh>;
                #$fh->read( $buffer, 1024 );
                last unless $buffer;
                print "$0: Read $buffer";
            } else {
                print "Can't read, sleeping...\n";
                sleep 1;
            }
        }
        print "$0: All done!\n";
    }

lii@mardy:~ cat server.pl

    #!/usr/local/bin/perl -w
    use strict;
    $| = 1;
    warn "In $0 pid=$$\n";

    while (1) {
        my @headers = ();
        while ( <STDIN> ) {
            chomp;
            if ( $_ ) {
                warn "$0: Read '$_'\n";
                push @headers, $_;
            } else {
                for ( @headers ) {
                    warn "$0: Sending $_\n";
                    print $_, "\n";
                }
                print "\n";
                last;
            }
        }
    }

Bill Moseley mailto:[EMAIL PROTECTED]
Re: Dynamic content that is static
At 09:08 PM 12/22/00 -0500, Philip Mak wrote: I realized something, though: Although the pages on my site are dynamically generated, they are really static. Their content doesn't change unless I change the files on the website. This doesn't really help with your ASP files, but have you looked at ttree in the Template Toolkit distribution? The problem, AFAIK, is that ttree looks only at the top-level documents and not at included templates. I started to look at Template::Provider to see if there was an easy way to write out dependency information to a file, then stat all those files every five minutes from a cron job, and if anything changes, touch the top-level files and run ttree again. I'd like this because I'm generating cobranded pages with mod_perl, and many of the pages are really static content. Bill Moseley mailto:[EMAIL PROTECTED]
Re: fork inherits socket connection
At 04:02 PM 12/15/00 +0100, Stas Bekman wrote: Am I missing something? You don't miss anything, the above code is an example of daemonization. You don't really need to call setsid() for a *forked* process that was started to execute something and quit. It's different if you call system() to spawn the process. But since it's a child of the parent httpd it's not a leader anyway, so you don't need the extra fork, I suppose. Am I correct? In fact it's good that you've posted this doc snippet, I'll use it as it's more complete and cleaner. Thanks. Thank goodness! I like this thread -- it's been hard keeping up with all the posts just to see if PHP or Java is better than mod_perl ;) Stas, will you please post your additions/notifications about this when you are done? I do hope you go into a bit of detail on this, as I've posted questions about setsid a number of times in various places and I'm still unclear about when or when not to use it, why to use it, how that might relate to mod_perl, and why it makes a difference between system() vs. fork. I've just blindly followed perlipc's recommendations. BTW -- this isn't related to the infrequently reported problem of an Apache child that won't die even with kill -9, is it? Eagerly awaiting, Bill Moseley mailto:[EMAIL PROTECTED]
Re: fork inherits socket connection
At 04:02 PM 12/15/00 +0100, Stas Bekman wrote:

    open STDIN,  '</dev/null' or die "Can't read /dev/null: $!";
    open STDOUT, '>/dev/null' or die "Can't write to /dev/null: $!";
    defined(my $pid = fork)   or die "Can't fork: $!";
    exit if $pid;
    setsid                    or die "Can't start a new session: $!";
    open STDERR, '>&STDOUT'   or die "Can't dup stdout: $!";

You don't miss anything, the above code is an example of daemonization. You don't really need to call setsid() for a *forked* process that was started to execute something and quit. Oh, so that's the difference between system and fork/exec! That's what I get for following perlipc instead of the Guide. I've always done it the Hard Way (tm) before. That is, in my mod_perl handler I would fork, then waitpid, call setsid, fork again freeing Apache to continue (and double-fork to avoid zombies), and then finally exec my long-running program. With this method I had to call setsid or else killing the Apache parent would kill the long-running process. Calling system() in the handler and then doing a simple fork in the long-running program is much cleaner (but you all knew that already). I just didn't realize that it freed me from calling setsid. I just have to remember not to system() something that doesn't fork or return right away. But I'm really posting about the original problem of the socket bound by the forked program. I tried looking through mod_cgi to see why mod_cgi can fork off a long-running process that won't hold the socket open, but my poor reading of it didn't turn anything up. Is anyone familiar enough with Apache (and/or mod_cgi) to explain the difference? Does mod_cgi explicitly close the socket file descriptor(s) before forking? Thanks, Bill Moseley mailto:[EMAIL PROTECTED]
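For reference, the "Hard Way" described above -- double fork plus setsid so the long-running program detaches from Apache and leaves no zombie -- might be sketched like this (my reconstruction from the description, not the actual handler code):

    use POSIX 'setsid';

    # fork, setsid, fork again, exec: the grandchild is re-parented to
    # init and leaves Apache's session, so killing the Apache parent
    # no longer kills the long-running program
    sub spawn_detached {
        my @cmd = @_;
        defined( my $kid = fork ) or die "fork: $!";
        if ( !$kid ) {                           # first child
            setsid or die "setsid: $!";          # escape the parent's session
            defined( my $grandkid = fork ) or die "fork: $!";
            exit 0 if $grandkid;                 # first child exits right away
            exec @cmd;                           # grandchild runs the program
            die "exec: $!";
        }
        waitpid( $kid, 0 );                      # reap the first child -- no zombie
        return 1;
    }

The immediate exit of the first child is what keeps the grandchild from ever becoming a zombie of the handler's process.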
Re: Script Debugging Modules?
At 08:04 PM 12/10/00 +0100, Stas Bekman wrote:

    use constant DEBUG => 0;
    ...
    warn "bar" if DEBUG;

you can keep the debug statements in the code without having any overhead of doing if DEBUG, since they are stripped at compile time. Just how smart is the compiler? Maybe all these debugging options indicate there's too much stuff in this module, but...

    use constant DEBUG_TEMPLATE     => 1;
    use constant DEBUG_SESSION      => 2;
    use constant DEBUG_REQUEST      => 4;
    use constant DEBUG_QUERY        => 8;
    use constant DEBUG_QUERY_PARSED => 16;

    my $debug = DEBUG_REQUEST | DEBUG_QUERY;
    ...
    warn "query = '$query'\n" if $debug & DEBUG_QUERY;

Is the compiler that smart, or is there a better way, such as

    use constant DEBUG_TEMPLATE     => 0;  # OFF
    use constant DEBUG_SESSION      => 1;  # ON
    use constant DEBUG_REQUEST      => 0;
    use constant DEBUG_QUERY        => 1;  # ON
    use constant DEBUG_QUERY_PARSED => 0;

    warn $query if DEBUG_QUERY || DEBUG_QUERY_PARSED;

Bill Moseley mailto:[EMAIL PROTECTED]
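As far as I know, the answer is that folding only happens when the whole condition is built from constants, so `$debug & DEBUG_QUERY` with a `my $debug` is a runtime check. Making the enabled mask itself a constant keeps both the bitmask style and the compile-time stripping -- a sketch of the idea:

    use constant DEBUG_TEMPLATE     => 1;
    use constant DEBUG_SESSION      => 2;
    use constant DEBUG_REQUEST      => 4;
    use constant DEBUG_QUERY        => 8;
    use constant DEBUG_QUERY_PARSED => 16;

    # The mask is itself a constant expression, so 'DEBUG & DEBUG_QUERY'
    # folds to 8 (branch kept) and 'DEBUG & DEBUG_TEMPLATE' folds to 0
    # (branch stripped at compile time), just like the simple 'if DEBUG'.
    use constant DEBUG => DEBUG_REQUEST | DEBUG_QUERY;

    warn "query debugging on\n"    if DEBUG & DEBUG_QUERY;
    warn "template debugging on\n" if DEBUG & DEBUG_TEMPLATE;

You can confirm what survives by running the file through B::Deparse and looking for the stripped branches.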
Sys::Signal Weirdness
This is slightly off topic, but my guess is Sys::Signal is mostly used by mod_perl people. Can someone else test this on their machine? I have this weird problem where I'm not catching $SIG{ALRM}. The test code below has a simple alarm handler that looks like this:

    eval {
        local $SIG{__DIE__};
        if ( $timeout ) {
            my $h = Sys::Signal->set( ALRM => sub { die "Timeout after $timeout seconds\n" } );
            warn "Set Signal $h\n";
            alarm $timeout;
        }
        print "Test 1 Parent reading: $_" while <FH>;
        alarm 0 if $timeout;
    };

This isn't working -- but if I simply comment out the if ( $timeout ) block it works. Here's the output on my machine:

    perl test.pl

    Starting test2 - WITHOUT 'if ( $timeout )'
    Set Signal Sys::Signal=SCALAR(0x810d120)
    in child loop 12423
    Test 2 Parent reading: 1
    Test 2 Parent reading: 2
    Test 2 Parent reading: 3
    Timeout after 4 seconds      --- good!

    Starting test1 - with 'if ( $timeout )'
    Set Signal Sys::Signal=SCALAR(0x810d12c)
    in child loop 12424
    Test 1 Parent reading: 1
    Test 1 Parent reading: 2
    Test 1 Parent reading: 3
    Alarm clock                  --- huh?

Here's some cut-n-paste test code. This is what I get on Linux under 5.6.
    #!/usr/local/bin/perl -w
    use strict;

    test2();
    test1();

    $| = 1;

    use Sys::Signal;

    sub test1 {
        warn "\nStarting test1 - with 'if ( \$timeout )'\n";
        my $child = open( FH, '-|' );
        die unless defined $child;
        loop() unless $child;  # not that this works

        my $timeout = 4;
        eval {
            local $SIG{__DIE__};
            if ( $timeout ) {
                my $h = Sys::Signal->set( ALRM => sub { die "Timeout after $timeout seconds\n" } );
                warn "Set Signal $h\n";
                alarm $timeout;
            }
            print "Test 1 Parent reading: $_" while <FH>;
            alarm 0 if $timeout;
        };
        if ( $@ ) {
            warn $@;
            kill( 'HUP', $child );
        }
    }

    sub test2 {
        warn "\nStarting test2 - WITHOUT 'if ( \$timeout )'\n";
        my $child = open( FH, '-|' );
        die unless defined $child;
        loop() unless $child;

        my $timeout = 4;
        eval {
            local $SIG{__DIE__};
            ### if ( $timeout ) {
                my $h = Sys::Signal->set( ALRM => sub { die "Timeout after $timeout seconds\n" } );
                warn "Set Signal $h\n";
                alarm $timeout;
            ### }
            print "Test 2 Parent reading: $_" while <FH>;
            alarm 0 if $timeout;
        };
        if ( $@ ) {
            warn $@;
            kill( 'HUP', $child );
        }
    }

    sub loop {
        $| = 1;
        my $x;
        warn "in child loop $$\n";
        sleep 1, ++$x, print "$x\n" while 1;
    }

Bill Moseley mailto:[EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Sys::Signal Weirdness
At 04:42 PM 12/08/00 +0100, Stas Bekman wrote: Easy. Look at $h -- it's a lexically scoped variable, inside the block if($timeout){}. Of course when the block is over the setting disappears, when $h gets DESTROYed. Doh! I thought about that (which is why I was printing $h). I shouldn't debug before sunrise. Sure is nice to have you back, Stas! ;) Bill Moseley mailto:[EMAIL PROTECTED]
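The scoping gotcha Stas points out is easy to demonstrate without Sys::Signal at all: any object whose DESTROY restores a setting will undo it the moment its lexical goes out of scope. A toy stand-in (Guard is my own illustration class, not a real module):

    package Guard;
    # minimal stand-in for Sys::Signal's handle: remember how to restore
    # the old setting, and restore it when the object is destroyed
    sub set     { my ( $class, $restore ) = @_; bless { restore => $restore }, $class }
    sub DESTROY { $_[0]->{restore}->() }

    package main;
    our $handler = 'default';
    {
        my $h = Guard->set( sub { $handler = 'default' } );
        $handler = 'custom';
        # $handler is 'custom' while $h is alive...
    }
    # ...but $h died at the closing brace, so the setting is already restored

Keeping `my $h` in the scope of the whole eval, rather than inside `if ( $timeout ) { ... }`, is exactly the fix for the test script above.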
Re: Apache::SSI design questions
At 11:47 PM 11/28/00 -0600, Ken Williams wrote: 1) Is it preferred to use POSIX::strftime() for time formatting, or Date::Format::strftime()? One solution would be to dynamically load one or the other module according to which one is available, but I'd rather not do that. Hi Ken, Why not Apache::Util::ht_time()? Bill Moseley mailto:[EMAIL PROTECTED]
Re: session expiration
At 03:00 PM 11/20/00 -0600, Trey Connell wrote: Is there anyway to know that a user has disconnected from their session through network failure, power off, or browser closure? How is that different from just going out for a cup of coffee or opening a new browser window and looking at a different site? I am logging information about the user to a database when they login to a site, and I need to clean up this data when they leave. Define "leave" and you will have the answer. All you can do is set an inactivity timeout, I'd suspect. cron is your friend in these cases. Cheers, Bill Moseley mailto:[EMAIL PROTECTED]
Re: session expiration
At 05:20 PM 11/20/00 -0600, Trey Connell wrote: The latter will be accomplished with cookies and the first rule will be enforced with a "loggedin" flag in the database. My problem lies in the user not explicitly clicking logout when they leave the site. If they explicitly click logout, I can change the "loggedin" flag to false so that they can enter again the next time they try. However, if they do not explicitly logout, I cannot fire the code to change the flag in the database. That's where cron comes in. Just make your flag a time, and update it on each request. cron then removes any that are older than some preset time and *poof* they are logged out. When they try to access again you see they have a cookie yet are logged out, and you say "Sorry, your session has expired". So basically I want to set a cookie that will allow them to enter the site under their userid, but I can't allow them to enter if they are currently logged in from elsewhere. Why? What if they want two windows open at the same time? Is that allowed? That design limitation sounds like it's going to make trouble. Bill Moseley mailto:[EMAIL PROTECTED]
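The flag-as-a-time scheme, in toy in-memory form (a hash in place of the real database table; the 30-minute window and the user names are made up):

    my $timeout   = 30 * 60;        # inactivity window, in seconds
    my %last_seen = (
        alice => time,              # just made a request
        bob   => time - 3600,       # went for a very long cup of coffee
    );

    # every request refreshes the user's timestamp instead of a boolean flag
    $last_seen{alice} = time;

    # the cron sweep: anyone idle past the window is *poof* logged out
    my @expired = grep { time - $last_seen{$_} > $timeout } keys %last_seen;
    delete @last_seen{@expired};

In the real version the hash would be a table keyed by userid, and the sweep a cron job issuing one DELETE or UPDATE against stale timestamps.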
[OT] Another *new* idea patented
I know this is way off topic, but I couldn't resist. Sorry if this is old news. First Amazon figures out that cookies could be used for, (who would have guessed?), maintaining state between sessions and patenting the concept. What a new idea! Now looking at eBay and I see that they have invented this thing called "thumbnails" that are miniature photos that you can, get this, click with your mouse! Not only that, they have figured out a way to transfer images from one computer to another via HTTP! Another brilliant invention that needs a patent. http://pages.ebay.com/help/basics/g-gallery.html "Gallery Our patent pending Gallery () is a new way of browsing items for sale at eBay. The Gallery presents miniature pictures, called thumbnails, for all of the items sellers have supplied pictures for in JPG format." US Patent 6,058,417 http://164.195.100.11/netahtml/srchnum.htm Can Randal still give his "Mod_perl Enabled Thumbnail Picture Server" talk? BTW -- For fun go to Ebay and do a search for any auction, then look closely at the HTTP headers IIS is spitting out when requesting any of their auctions. You can imagine the fun in trying to explain (over three emails now) the problem to their customer support. Bill Moseley mailto:[EMAIL PROTECTED]
porting mod_perl content handler to CGI
Howdy, I have an application that's pure mod_perl -- its modules use the request object to do a few bits of work like reading parameters, query string, specialized logging, dealing with NOT_MODIFIED, and so on. Normal stuff provided by the methods of Apache, Apache::Util, Apache::URI and Apache::Connection. Now, I'd like to use a few of my modules under CGI -- for an administration part of the application that's bigger and not used enough to use up space in the mod_perl server. But it would be nice to have a common code base. So, I'm writing a replacement module for those classes, supporting just the few methods I need. I'm using CGI.pm, URI, HTTP::Date and so on to handle those few mod_perl methods I'm using in my modules. For example, I have a function that does specialized logging that I want to use both under mod_perl and CGI. So, this would work under CGI:

    my $r = Apache->request;
    my $remote = $r->connection->remote_ip;

where in the replacement package Apache::Connection:

    sub remote_ip { $ENV{REMOTE_ADDR} }

Before I spend much time, has this already been done? Might be kind of cool if one could get new CGI programmers to write all their CGI applications like mod_perl handlers -- they could run as CGI on other servers, but when they want speed they are ready to use mod_perl. Anyway, does a mod_perl emulator for CGI exist? Bill Moseley mailto:[EMAIL PROTECTED]
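I don't know of an existing emulator, but a first cut at the shim might look like this (the method names mirror the mod_perl API; the bodies, the extra methods, and the idea of loading the shim only when the real Apache.pm is absent are my assumptions):

    package Apache::Connection;
    sub remote_ip { $ENV{REMOTE_ADDR} }

    package Apache;
    my $request;                        # one request object per CGI process
    sub request    { $request ||= bless {}, __PACKAGE__ }
    sub connection { bless {}, 'Apache::Connection' }
    sub args       { $ENV{QUERY_STRING}   || '' }
    sub method     { $ENV{REQUEST_METHOD} || 'GET' }

    package main;
    # mod_perl-style code now runs unchanged under CGI:
    my $r      = Apache->request;
    my $remote = $r->connection->remote_ip;
    my $query  = $r->args;

Each method is just a thin wrapper over the CGI environment, so the shared modules never need to know which environment they are in; the shim must only be loaded when the real Apache package isn't already present.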
Re: [RFC] Apache::Expires
At 10:26 AM 11/15/00 -0500, Geoffrey Young wrote: hi all... I was wondering if anyone has some experience with expire headers for dynamic documents - kinda like mod_expires but for dynamic stuff. Geoff, Are you thinking about client/browser caching or proxy caching with regard to this? Or does it matter? I currently use Last-modified and Content-length headers in my dynamic content that doesn't change much, but I've never considered using Expires -- maybe because I'm not fully up on what Expires buys you. I have assumed that most browsers cache my documents and don't re-request them in their current session, so am I correct that Expires would only help for browsers/clients that return to the page sometime in the future, after closing their browser but before the document expires? I wonder how to determine how many requests that would save. Also, if a cached document is past its Expires time, does that force the client to get a new document, or can it still use If-Modified-Since? mod_expires indicates that a new document must be loaded, but RFC 2616 indicates that it can use If-Modified-Since (who knows what the clients will do). I should know this too, but what effect does the presence of a query string in the URL have on this? Bill Moseley mailto:[EMAIL PROTECTED]
Microperl
This is probably more of a Friday topic: Simon Cozens discusses "Microperl" in the current The Perl Journal. I don't build mod_rewrite into a mod_perl Apache, as I like rewriting with mod_perl much better. But it doesn't make much sense to go that route for a light-weight front-end to heavy mod_perl backend servers, of course. I don't have any experience embedding perl in things like Apache other than typing "perl Makefile.PL && make", but Simon's article did make me wonder. So I'm curious, from those of you that understand this stuff better: could a microperl/miniperl be embedded in Apache and end up with a reasonably light-weight perl-enabled Apache? I understand you would not have DynaLoader support, but it might be nice for simple rewriting. Curiously yours, Bill Moseley mailto:[EMAIL PROTECTED]
Re: AuthCookie solution
At 04:19 PM 11/15/00 -0500, Charles Day wrote: # We added the line below to AuthCookie.pm $r->header_out("Location" => $args{'destination'}.$args{'args'}); Why pass a new argument? Can't you just add the query string onto the destination field in your login.pl script? Something like the untested:

    my $uri   = $r->prev->uri;
    my $query = $r->prev->args;
    $uri = "$uri?$query" if $query;
    print qq[<INPUT TYPE=hidden NAME=destination VALUE="$uri">];

Bill Moseley mailto:[EMAIL PROTECTED]
Re: Microperl
At 07:38 PM 11/15/00 -0600, Les Mikesell wrote: - Original Message - From: "Bill Moseley" [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Wednesday, November 15, 2000 12:30 PM Subject: Microperl I don't build mod_rewrite into a mod_perl Apache as I like rewriting with mod_perl much better. But it doesn't make much sense to go that route for a light-weight front-end to heavy mod_perl backend servers, of course. Just curious: what don't you like about mod_rewrite? You ask that on the mod_perl list? ;) It's not perl, of course. I like those <Perl> sections a lot. Oh, there were the weird segfaults that I had for months and months. http://www.geocrawler.com/archives/3/182/2000/10/0/4480696/ Nothing against mod_rewrite -- I was just wondering if a small perl could be embedded without bloating the server too much. Bill Moseley mailto:[EMAIL PROTECTED]
RE: [ANNOUNCE] ApacheCon USA 2001: Call For Papers
At 04:08 PM 11/14/00 +0100, Stas Bekman wrote: Remember that your talk can be reused for both ApacheCon and TPC; most people don't make it to both conferences. So while you are thinking about your TPC submission, you can submit it to ApacheCon at the same moment as well. For someone on a budget with no boss to pay my way, which conference will have more mod_perl? And for my 2 cents, I'd be interested in hearing about mod_perl and designing for scalability, whatever that means. Or was that the mod_backhand talk I missed? Bill Moseley mailto:[EMAIL PROTECTED]
Re: Fast DB access
At 09:20 PM 11/09/00 +0000, Tim Bunce wrote: On Thu, Nov 09, 2000 at 08:27:29PM +0000, Matt Sergeant wrote: On Thu, 9 Nov 2000, Ask Bjoern Hansen wrote: If you're always looking stuff up on simple ID numbers and "stuff" is a very simple data structure, then I doubt any DBMS can beat open D, "/data/1/12/123456" or ... from a fast local filesystem. Note that Theo Schlossnagel was saying over lunch at ApacheCon that if your filename has more than 8 characters on Linux (ext2fs) it skips from a hashed algorithm to a linear algorithm (or something to that effect). So go careful there. I don't have more details or a URL for any information on this though. Similarly on Solaris (and perhaps most SysV derivatives) path component names longer than 16 chars (configurable) don't go into the inode lookup cache and so require a filesystem directory lookup. OK, possibly 8 chars on Linux and 16 under Solaris. Anything else to consider regarding the maximum number of files in a given directory? How about issues regarding file size? If you had larger files/records, would a DBM or RDBMS provide larger cache sizes? Bill Moseley mailto:[EMAIL PROTECTED]
Re: pre-loaded modules on Solaris
At 06:11 AM 11/10/00 -0500, barries wrote:

    Address   Kbytes  Resident  Shared  Private
    total Kb   24720     22720    3288    19432   pre-loaded modules
    total Kb   14592     12976    3096     9880   not pre-loaded modules

Stupid question, probably, but when running the non-pre-loaded version, are you sure all the same modules are being loaded? Yes. According to perl-status, anyway. Some modules are loaded into the parent, of course, because of mod_perl. But when not pre-loading I start the server, look at perl-status, and then make some requests and look again to see what was loaded. The difference is what modules I'm use'ing in my test. I'm wondering if there's some set of modules that, for some reason, isn't being loaded by the sequence of requests you're firing against all of your httpds to get the servers "warmed up" to represent real-life state. When looking at pmap it looks like the main difference in "private" memory usage is in the heap. I'm not clear why the heap would end up so much bigger when pre-loading modules. Unfortunately, Linux doesn't seem to have the same reporting abilities as Solaris, but using /proc/pid/statm to show shared and private memory under these same tests showed that pre-loading was a big win. So it seems like a Solaris issue. Bill Moseley mailto:[EMAIL PROTECTED]
Re: Dealing with spiders
At 03:29 PM 11/10/00 +0100, Marko van der Puil wrote: What we could do as a community is create spiderlawenforcement.org, a centralized database where we keep track of spiders and how they index our sites. It's an issue weekly, but it hasn't become that much of a problem yet. The bad spiders could just change IPs and user agent strings, too. Yesterday I had 12,000 requests from a spider, but the spider added a slash to the end of every query string, so over 11,000 were invalid requests -- yet the Apache log showed the requests as 200s (only the application knew it was a bad request). At this point, I'd just like to figure out how to detect them programmatically. It seems easy to spot them as a human looking through the logs, but less so with a program. Some spiders fake the user agent. It probably makes sense to run a cron job every few minutes to scan the logs and write out a file of bad IP numbers, and have mod_perl reload the list of IPs to block every 100 requests or so. I could look for lots of requests from the same IP with a really high ratio of bad requests to good. But I'm sure it wouldn't be long before an AOL proxy got blocked. Again, the hard part is finding a good way to detect them... And in my experience blocking doesn't always mean the requests from that spider stop coming ;) Bill Moseley mailto:[EMAIL PROTECTED]
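Detecting them programmatically might start with a per-IP error ratio over the access log. A sketch (the log format regex and the thresholds are assumptions; a real version would also need a whitelist so a busy AOL proxy doesn't get swept up):

    # tally total vs. bad (400/404) responses per client IP
    my %count;

    sub tally_line {
        my ( $line ) = @_;
        # common log format: ip ident user [date] "request" status bytes ...
        my ( $ip, $status ) =
            $line =~ /^(\S+) \s \S+ \s \S+ \s \[[^\]]+\] \s "[^"]*" \s (\d{3})/x
            or return;
        $count{$ip}{total}++;
        $count{$ip}{bad}++ if $status eq '400' || $status eq '404';
    }

    # an IP is a suspect only if it is busy *and* mostly broken requests
    sub suspects {
        my ( $min_hits, $bad_ratio ) = @_;
        return grep {
            $count{$_}{total} >= $min_hits
                && ( $count{$_}{bad} || 0 ) / $count{$_}{total} >= $bad_ratio
        } keys %count;
    }

The cron job would run this over the last few minutes of the log and write the suspect list to the file that the mod_perl handler periodically reloads.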
Re: pre-loaded modules on Solaris
Hi Mike, I've cc'd the mod_perl list for other Solaris users to consider. At 10:49 AM 11/01/00 -0800, Michael Blakeley wrote: I saw a significant benefit from pre-loading modules. Let's take a test case where we share via startup.pl: Without preloading: # tellshared.pl web count vsz rss kB % 8 146304 67784 78520 54 With the "use" section above: # tellshared.pl web count vsz rss kB % 8 132672 17032 115640 87 'rss' is the resident set size - that is, the amount of actual RAM in use. The 'vsz' tells the size of the virtual address space - the swap in use. The kB column shows the difference (ie, the "saved RAM" via page sharing) and the % shows the %-shared. I'm not clear you can measure the shared memory space that way. I don't really understand the memory system much, but here's a paper that I found interesting: http://www.sun.com/solutions/third-party/global/SAS/pdf/vmsizing.pdf "The virtual address size of a process often bares no resemblance to the amount of memory a process is using because it contains all of the unallocated memory, libraries, shared memory and sometimes hardware devices (in the case of XSun). "The RSS figure is a measure of the amount of physical memory mapped into a process, but often there is more than one copy of the process running, and a large proportion of a process is shared with another. "MemTool provides a mechanism for getting a detailed look at a processes memory utilization. MemTool can show how much memory is in-core, how much of that is shared, and hence how much private memory a process has. The pmem command (or /usr/proc/bin/pmap -x in Solaris 2.6) can be used to show the memory utilization of a single process." Now, with a simple "hello world" mod_perl handler that loaded a bunch of modules I did see that pre-loading modules reduces memory usage -- both in looking at ps output, and with the pmap program, even after a number of requests. This is consistent with what you commented on above. 
I'm repeating, but I found with a real-world application that sharing the modules ended up using quite a bit more "private" memory. I don't know if that's an issue only with my specific OS, or with how my specific application is running. Here's ps output with pre-loaded modules. In these tests I'm running Apache 1.3.12 with mod_perl 1.24 static, but everything else is DSO. I've got MaxClients set to one, so there's only the parent and one child. I'm pre-loading modules in a perl section here:

S USER  PID PPID %CPU %MEM   VSZ   RSS    STIME TIME COMMAND
S lii   318    1  0.0  0.3  8376  5464 15:50:38 0:00 httpd.mo
S lii   319  318  0.8  1.1 24720 21288 15:50:38 0:05 httpd.mo

And now without pre-loaded modules:

S USER  PID PPID %CPU %MEM   VSZ   RSS    STIME TIME COMMAND
S lii  1260    1  0.0  0.2  4392  3552 15:56:25 0:00 httpd.mo
S lii  1261 1260  0.9  0.6 14592 12304 15:56:25 0:05 httpd.mo

And here's a comparison of the totals returned by the pmap program, which should detail shared and private memory (according to the paper cited above):

Address   Kbytes  Resident  Shared  Private
total Kb   24720     22720    3288    19432   pre-loaded modules
total Kb   14592     12976    3096     9880   not pre-loaded modules

Indeed there's a tiny bit more shared memory in the pre-loaded Apache, but the amount of "private" memory is significantly higher, too. Ten megs a child will add up. It doesn't really make sense to me, but that's what pmap is showing. Maybe this isn't that interesting. Anyway, I'll try a non-DSO Apache and see if it makes a difference, and also try with an Apache that forks off more clients than just one, but I can't imagine that making a difference. Later, Bill Moseley mailto:[EMAIL PROTECTED]
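Collecting those pmap totals by hand gets tedious across many children. Here's a small helper that pulls the totals line out of `/usr/proc/bin/pmap -x <pid>` output; the column layout is assumed to match the Solaris 2.6 pmap shown above, and may differ on other releases.

```perl
# Sketch: parse the "total Kb" line from `pmap -x` output and return the
# kbytes/resident/shared/private figures.  Column order is an assumption
# based on the Solaris 2.6 output quoted above.
use strict;
use warnings;

sub pmap_totals {
    my ($pmap_output) = @_;
    for my $line (split /\n/, $pmap_output) {
        # e.g. "total Kb   24720   22720    3288   19432"
        if ($line =~ /^total\s+Kb\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/) {
            return { kbytes => $1, resident => $2,
                     shared => $3, private => $4 };
        }
    }
    return undef;   # totals line not found
}
```

Run it over `pmap -x` output for each httpd child (e.g. via backticks on the PIDs from ps) and compare the private column across pre-loaded and non-pre-loaded runs.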
Dealing with spiders
This is slightly OT, but any solution I use will be mod_perl, of course. I'm wondering how people deal with spiders. I don't mind being spidered as long as it's a well-behaved spider that follows robots.txt. And at this point I'm not concerned with the load spiders put on the server (I know there are modules for dealing with load issues). But it's amazing how many are just lame, in that they take perfectly good HREF tags and mess them up in the request. For example, every day I see many requests from Novell's BorderManager where it forgot to convert HTML entities in HREFs before making the request. Here's another example: 64.3.57.99 - "-" [04/Nov/2000:04:36:22 -0800] "GET /../../../ HTTP/1.0" 400 265 "-" "Microsoft Internet Explorer/4.40.426 (Windows 95)" 5740 In the last day that IP has requested about 10,000 documents. Over half were 404s -- some from non-converted entities in HREFs, but most for documents that do not and have never existed on this site. Almost 1,000 requests were 400s (Bad Request, like the example above). And I'd guess that's not really the correct user agent, either. In general, what I'm interested in stopping are the thousands of requests for documents that just don't exist on the site, and in simply blocking the lame ones, since they are, well, lame. Anyway, what do you do with spiders like this, if anything? Is it even an issue that you deal with? Do you use any automated methods to detect spiders, and perhaps block the lame ones? I wouldn't want to track every IP, but it seems I could do well just looking at IPs that have a high proportion of 404s to 200s and 304s and have been requesting over a long period of time, or very frequently. The reason I'm asking is that I was asked about all the 404s in the web usage reports. I know I could post-process the logs before running the web reports, but it would be much more fun to use mod_perl to catch and block them on the fly.
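For the "catch and block them on the fly" part, one hypothetical shape is a mod_perl 1.x access handler that rereads a blocklist file every so many requests and refuses blocked clients. The module name, file path, and refresh interval below are all invented for illustration; the numeric return values stand in for Apache::Constants' FORBIDDEN and OK.

```perl
package My::BlockSpiders;
# Hypothetical on-the-fly blocker: something (a cron job, say) writes one
# IP per line to a blocklist file; this handler reloads it every 100
# requests and returns 403 for blocked clients.  All names/paths here
# are made up for illustration.
use strict;
use warnings;

my %blocked;
my $requests = 0;

sub load_blocklist {
    my ($file) = @_;
    my %ips;
    open my $fh, '<', $file or return {};   # missing file => block nobody
    while (my $ip = <$fh>) {
        chomp $ip;
        $ips{$ip} = 1 if $ip =~ /^\d+\.\d+\.\d+\.\d+$/;
    }
    close $fh;
    return \%ips;
}

sub handler {
    my ($r) = @_;   # Apache request object under mod_perl 1.x
    if ($requests++ % 100 == 0) {
        %blocked = %{ load_blocklist('/usr/local/apache/conf/blocklist') };
    }
    return 403 if $blocked{ $r->connection->remote_ip };   # FORBIDDEN
    return 0;                                              # OK
}

1;
```

It would be wired up with something like `PerlAccessHandler My::BlockSpiders`. The per-child counter means each child reloads independently, which is sloppy but cheap; a shared mtime check would be tidier.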
BTW -- I have blocked spiders on the fly before -- I used to have a decoy in robots.txt that, if followed, would add that IP to the blocked list. It was interesting to see one spider get caught by that trick: it took thousands and thousands of 403 errors before that spider got a clue that it was blocked on every request. Thanks, Bill Moseley mailto:[EMAIL PROTECTED]
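The decoy trick above can be sketched as a tiny handler mapped to a URL that robots.txt disallows, so only robots.txt-ignoring clients ever reach it; fetching it appends the caller's IP to the blocklist. The package name and file path are invented here, and the 403 stands in for Apache::Constants::FORBIDDEN.

```perl
package My::SpiderTrap;
# Sketch of the robots.txt decoy: robots.txt Disallows some URL (say
# /trap/) that no polite client will fetch; a handler mapped to that URL
# records the caller's IP in the blocklist and refuses the request.
use strict;
use warnings;

sub trap_ip {
    my ($ip, $file) = @_;
    open my $fh, '>>', $file or return 0;
    print $fh "$ip\n";
    close $fh;
    return 1;
}

sub handler {
    my ($r) = @_;   # Apache request object under mod_perl 1.x
    trap_ip($r->connection->remote_ip, '/usr/local/apache/conf/blocklist');
    return 403;     # Apache::Constants::FORBIDDEN
}

1;
```

Wired up as, e.g., a `<Location /trap>` with `PerlHandler My::SpiderTrap`, this pairs naturally with a blocklist-checking access handler: the trap feeds the same file the blocker reads.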