Hi all. (Sorry for the unfinished letter I sent before.)
Can anybody explain the historical reasons behind the implementation of default_handler() in ModPerl::RegistryCooker? It contains the following piece of code:

    # handlers shouldn't set $r->status but return it, so we reset the
    # status after running it
    my $old_status = $self->{REQ}->status;
    my $rc = $self->run;
    my $new_status = $self->{REQ}->status($old_status);
    return ($rc == Apache2::Const::OK && $old_status != $new_status)
        ? $new_status
        : $rc;

The goal of the ModPerl::* modules is "Run unaltered CGI scripts under mod_perl". If the scripts are unaltered, how can they change $r->status? I think the answer should be "in no way". Why do we need to reset the status, then? I see no reason. Yet this solution makes it impossible to set $r->status correctly from 'altered' scripts, and it produces strange effects from the headers parser on responses larger than 8k.

Can anyone explain why this code appeared and which scenarios it is meant for? Apache's mod_cgi.c returns OK regardless of the r->status value that was set from the script output. Why should the ModPerl handler behave differently?

One good example of the problems introduced by this solution is the 404 status. Let's look at a small script:

    #!/usr/bin/perl -w
    use strict;
    use CGI;
    my $q = CGI->new;

    #### Variant 1
    print $q->header(-charset => "windows-1251", -type => "text/html", -status=>'404 Not Found');
    print "SMALL RESPONSE";

    #### Variant 2
    #print $q->header(-charset => "windows-1251", -type => "text/html", -status=>'404 Not Found');
    #print 'BIG RESPONSE:' . '*' x 8192;

Under CGI, the browser receives only the "SMALL/BIG RESPONSE" string with a 404 status code. Under mod_perl, for the small response, the browser gets status 200 instead of 404, and the default Apache error-handler content (like 'Status: OK \n /path/to/script.pl was not found on this server') is appended to the script output. If we print the big response, the browser gets status 404 and the appended Apache error-handler content changes to 'Not Found \n The requested URL /perl/test.pl was not found on this server.'.
The reason for this behavior is that the script uses CGI.pm, which detects mod_perl. So these scripts are effectively 'altered' and can set $r->status from inside. After the script executes, we have r->status == 404. But ModPerl::RegistryCooker sets r->status back to 200 and returns 404 as the handler status, so the Apache ErrorDocument directive kicks in. (Under mod_cgi the handler status is OK, so no error processing occurs.) Also, because r->status was changed back, the browser can get a 200 response code if the response has not been sent yet. The most interesting thing is that the 404 status is logged in the Apache access log anyway, which makes the problem harder to detect.

OK, maybe this is entirely a CGI.pm problem? Let's look at the next example, a fully unaltered script:

    #!/usr/bin/perl -w
    use strict;
    print "Status: 404 Not Found\n";
    print "Content-Type: text/html; charset=windows-1251\n\n";

    #Small response
    print 'NOTFOUND:' . '*' x 81;

    #Big response
    #print 'NOTFOUND:' . '*' x 8192;

When the response is small, the 8k buffer is not filled, and r->status is not changed internally until the CGI headers are parsed on buffer flush. So we get r->status == 200, and the handler status is 200 too. After the headers are parsed, r->status becomes 404, but the mod_perl handler has returned status 200 to Apache.

When the response is big, the 8k buffer is filled and flushed before the handler is done. So we have r->status == 404, and the handler status is 404 too. As a result, the ErrorDocument directive is applied again.

The browser gets a 404 status in all these cases; I don't completely understand how this is processed.

If we enable the mod_deflate Apache module, things become even more interesting. With the small response, everything looks as before. With the big response, only the Apache error page is displayed, and no content from the script output is shown. With mod_cgi we see the script output all the time, with or without mod_deflate.

So the ModPerl::* modules do not fully reach their goal of "Run unaltered CGI scripts under mod_perl".
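The two situations described above (status already 404 when the script run ends, versus status still 200 at that point) can be modelled without Apache. The following is only a minimal sketch under stated assumptions: MockRequest, current_handler, proposed_handler, outcome and the two script closures are hypothetical names invented for illustration, and the mock only imitates the status() accessor, which returns the previous value when a new one is set. It does not model the later header parsing, ErrorDocument handling, or mod_deflate.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the request object; only status() is modelled.
package MockRequest;

sub new { my ($class) = @_; return bless { status => 200 }, $class }

# Imitates $r->status: returns the previous value and optionally
# stores a new one.
sub status {
    my ($self, $new) = @_;
    my $old = $self->{status};
    $self->{status} = $new if defined $new;
    return $old;
}

package main;

use constant OK => 0;    # numeric value of Apache2::Const::OK

# The logic quoted from default_handler(), with $script->($r) standing in
# for $self->run (an assumption made to keep the sketch self-contained).
sub current_handler {
    my ($r, $script) = @_;
    my $old_status = $r->status;
    my $rc         = $script->($r);
    my $new_status = $r->status($old_status);
    return ($rc == OK && $old_status != $new_status) ? $new_status : $rc;
}

# The replacement proposed in this letter: return the result of the run.
sub proposed_handler {
    my ($r, $script) = @_;
    return $script->($r);
}

# Status is already 404 when the run ends (CGI.pm under mod_perl, or a
# big response that flushed the 8k buffer).
my $status_set_early = sub { $_[0]->status(404); return OK };

# Status is still 200 when the run ends (small response, headers are
# parsed only later; that later step is not modelled here).
my $status_set_late = sub { return OK };

# Run one handler against one script and report what Apache would see.
sub outcome {
    my ($handler, $script) = @_;
    my $r  = MockRequest->new;
    my $rc = $handler->($r, $script);
    return sprintf 'rc=%d status=%d', $rc, $r->status;
}

print 'current,  set early: ', outcome(\&current_handler,  $status_set_early), "\n";
print 'current,  set late:  ', outcome(\&current_handler,  $status_set_late),  "\n";
print 'proposed, set early: ', outcome(\&proposed_handler, $status_set_early), "\n";
print 'proposed, set late:  ', outcome(\&proposed_handler, $status_set_late),  "\n";
```

In this model the current logic yields rc=404 with status reset to 200 in the "set early" case, while the proposed logic yields rc=0 (OK) with status left at 404, which is the combination mod_cgi produces. In the "set late" case both variants yield rc=0 and status 200.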
As follows from my analysis of mod_perl, mod_cgi and the rest of the httpd core, we need no checks of the new r->status value after the script run at all. Because the mod_perl handler status is not one of (OK, DECLINED, DONE) for every r->status != 200, ap_die() is always called instead of ap_finalize_request_protocol() in ap_process_request(). All responses with non-200 statuses are then processed through ErrorDocument, if one exists for that code, and ap_send_error_response() is called for them. I think this is the wrong way. In the current state of the code, $r->status($newValue) has no practical use in 'altered' scripts.

So I propose to patch the ModPerl::* modules so that they run both unaltered and altered scripts more correctly:

    -    # handlers shouldn't set $r->status but return it, so we reset the
    -    # status after running it
    -    my $old_status = $self->{REQ}->status;
    -    my $rc = $self->run;
    -    my $new_status = $self->{REQ}->status($old_status);
    -    return ($rc == Apache2::Const::OK && $old_status != $new_status)
    -        ? $new_status
    -        : $rc;
    +    return $self->run;

If an altered script needs to set a new handler status by doing 'return $newHandleStatus', sub run() would have to be changed to pick up the value returned from eval {}. But I think everyone who needs this has already written their own module handlers ;-)

Thanks for your attention. I understand that it is hard to make such global changes and that not many people are interested in this. So please treat my letter as a request for help with historical research ;-)

Let me repeat my question: can anyone explain why this code appeared and which scenarios it is meant for?

If anybody is interested, I propose to discuss this. Thanks.

-- 
Regards,
Pavel
mailto:pavel2...@ngs.ru