Hi all.
(Sorry for the unfinished message I sent earlier.)
Can anybody explain the historical reasons behind the implementation of
default_handler() in ModPerl::RegistryCooker?
It contains the following piece of code:
# handlers shouldn't set $r->status but return it, so we reset the
# status after running it
my $old_status = $self->{REQ}->status;
my $rc = $self->run;
my $new_status = $self->{REQ}->status($old_status);
return ($rc == Apache2::Const::OK && $old_status != $new_status)
? $new_status
: $rc;
The goal of the ModPerl::* modules is to "Run unaltered CGI scripts under
mod_perl". If the scripts are unaltered, how can they change $r->status?
I think the answer should be: 'in no way'.
Why do we need to reset the status then? I see no reason. Moreover, this
code makes it impossible to set r->status correctly from 'altered' scripts,
and it interacts strangely with the header parser on responses larger
than 8k.
Can anyone explain why this code was added and which scenarios it is
meant for?
Apache's mod_cgi.c returns OK regardless of the r->status value that was
set from the script output.
Why should the ModPerl handler behave differently?
A good example of the problems introduced by this code is the 404 status.
Let's look at a small script:
#!/usr/bin/perl -w
use strict;
use CGI;
my $q = CGI->new;
#### Variant 1
print $q->header(-charset => "windows-1251", -type => "text/html",
                 -status => '404 Not Found');
print "SMALL RESPONSE";
#### Variant 2
#print $q->header(-charset => "windows-1251", -type => "text/html",
#                 -status => '404 Not Found');
#print 'BIG RESPONSE:' . '*' x 8192;
Under CGI, the browser receives only the "SMALL RESPONSE" or "BIG RESPONSE"
string, with a 404 status code.
Under mod_perl with the 'small response', the browser gets status 200
instead of 404, and the script output is followed by the default Apache
error-handler content (like 'Status: OK \n /path/to/script.pl was not found
on this server').
With the 'big response', the browser gets status 404 and the appended
error-handler content changes to 'Not Found \n The requested URL
/perl/test.pl was not found on this server.'.
The reason for this behavior is that the script uses CGI.pm, which detects
mod_perl.
So such scripts are effectively 'altered' and can set $r->status
themselves. After the script has run, we have r->status == 404. But
ModPerl::RegistryCooker sets r->status back to 200 and returns 404 as the
handler status, so Apache's ErrorDocument processing kicks in.
(Under mod_cgi the handler status is OK, so no error processing occurs.)
Also, because r->status has been changed back, the browser can get a 200
response code if the response has not been sent yet.
Most interestingly, status 404 is logged in the Apache access log in either
case, which makes the problem harder to detect.
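To make the CGI.pm case above easier to follow, here is a minimal
standalone sketch of the same logic. MockRequest and
default_handler_sketch() are hypothetical names invented for this
illustration; they only imitate the status() accessor and the quoted
default_handler() body and are not part of mod_perl. OK is hard-coded to 0,
the numeric value of Apache2::Const::OK.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the request object; only status() is modelled.
package MockRequest;

sub new {
    my ($class, $status) = @_;
    return bless { status => $status }, $class;
}

# Returns the previous status and stores a new one when an argument is given.
sub status {
    my ($self, $new) = @_;
    my $old = $self->{status};
    $self->{status} = $new if defined $new;
    return $old;
}

package main;

use constant OK => 0;    # numeric value of Apache2::Const::OK

# Same logic as the quoted default_handler() body; run() is passed as a code ref.
sub default_handler_sketch {
    my ($r, $run) = @_;
    my $old_status = $r->status;
    my $rc         = $run->($r);
    my $new_status = $r->status($old_status);    # resets r->status
    return ($rc == OK && $old_status != $new_status) ? $new_status : $rc;
}

# The "script" sets r->status to 404 (as CGI.pm does under mod_perl) and
# finishes normally.
my $r  = MockRequest->new(200);
my $rc = default_handler_sketch($r, sub { $_[0]->status(404); return OK });

printf "handler rc=%d, r->status=%d\n", $rc, $r->status;
# prints "handler rc=404, r->status=200"
```

The handler status becomes 404 while r->status is back at 200, which is
the combination described above.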
OK, maybe this is entirely a CGI.pm problem? Let's look at the next
example, a fully unaltered script:
#!/usr/bin/perl -w
use strict;
print "Status: 404 Not Found\n";
print "Content-Type: text/html; charset=windows-1251\n\n";
#Small response
print 'NOTFOUND:' . '*' x 81;
#Big response
#print 'NOTFOUND:' . '*' x 8192;
When the response is small, the 8k buffer is not filled, and r->status is
not changed internally until the CGI headers are parsed on buffer flush.
So while the handler is running we have r->status == 200, and the handler
status is OK too. After the headers are parsed, r->status becomes 404, but
the mod_perl handler has already returned OK to Apache.
When the response is big, the 8k buffer is filled and flushed before the
handler is done.
So we have r->status == 404, and the handler status is 404 too. As a
result, ErrorDocument processing is triggered again.
The browser gets status 404 in both of these cases; I don't completely
understand how this is processed.
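The small and big cases above can be modelled in a few lines. This is only
a sketch of the reasoning, not httpd code: handler_status_for() and
BUFFER_SIZE are hypothetical names, and the rule "r->status changes only if
at least 8192 bytes are written during the run" is a simplifying assumption
taken from the description above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use constant { OK => 0, BUFFER_SIZE => 8192 };

# Status that the quoted default_handler() logic would return for a script
# that prints a "Status: 404 Not Found" header and $output_len bytes in total.
# ASSUMPTION: the headers are parsed, and r->status updated, only if the
# buffer is flushed while the script is still running.
sub handler_status_for {
    my ($output_len) = @_;
    my $old_status         = 200;
    my $flushed_during_run = $output_len >= BUFFER_SIZE;
    my $new_status         = $flushed_during_run ? 404 : $old_status;
    my $rc                 = OK;    # an unaltered script returns nothing else
    return ($rc == OK && $old_status != $new_status) ? $new_status : $rc;
}

my $headers = "Status: 404 Not Found\n"
            . "Content-Type: text/html; charset=windows-1251\n\n";

for my $body ('NOTFOUND:' . '*' x 81, 'NOTFOUND:' . '*' x 8192) {
    my $len = length($headers) + length($body);
    printf "%5d bytes -> handler returns %d\n", $len, handler_status_for($len);
}
```

Under this model the small script yields OK (0) and the big one yields 404,
so only the big response goes down the error path.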
If we enable the mod_deflate Apache module, things become even more
interesting.
With a small response, everything looks as before.
With a big response, only the Apache error page is displayed; none of the
script output is shown.
With mod_cgi we see the script output every time, with or without
mod_deflate.
So the ModPerl::* modules do not fully reach their goal of "Run unaltered
CGI scripts under mod_perl".
As follows from my analysis of mod_perl, mod_cgi and the rest of the httpd
core, we do not need to check the new r->status value after the script has
run at all. Because the mod_perl handler status is not one of
(OK, DECLINED, DONE) for any r->status != 200, ap_die() is always called
instead of ap_finalize_request_protocol() in ap_process_request(). All
responses with non-200 statuses are then processed through ErrorDocument,
if one exists for that code, and ap_send_error_response() is called for
them. I think this is the wrong way. 'Altered' scripts have no practical
use for $r->status($newValue) in the current state of the code.
So I propose the following patch to the ModPerl::* modules, so that they
run both unaltered and altered scripts more correctly:
- # handlers shouldn't set $r->status but return it, so we reset the
- # status after running it
- my $old_status = $self->{REQ}->status;
- my $rc = $self->run;
- my $new_status = $self->{REQ}->status($old_status);
- return ($rc == Apache2::Const::OK && $old_status != $new_status)
- ? $new_status
- : $rc;
+ return $self->run;
If an altered script needs to set a new handler status by doing 'return
$newHandleStatus', sub run() would have to be changed to pick up the value
returned from eval {}. But I think everyone who needs this has already
written their own module handlers ;-)
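As a rough illustration of that idea, here is a standalone sketch of a
run()-like function that passes on the value returned from eval {}.
run_sketch() is a hypothetical name, $cv stands in for the compiled script
body, and mapping a die to 500 is an assumption made for this example; it
is not the actual ModPerl::RegistryCooker code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use constant { OK => 0, SERVER_ERROR => 500 };

# Hypothetical variant of run(): use the script's return value as the
# handler status when it looks like a status code.
sub run_sketch {
    my ($cv) = @_;
    my $rc = eval { $cv->() };
    return SERVER_ERROR if $@;                         # script died
    return OK unless defined $rc && $rc =~ /^\d+$/;    # nothing usable returned
    return $rc;
}

printf "explicit return: %d\n", run_sketch(sub { return 404 });
printf "no return value: %d\n", run_sketch(sub { return });
printf "script died:     %d\n", run_sketch(sub { die "boom\n" });
```

One caveat with this approach: a script that simply ends with a print
statement would hand back print's return value (normally 1) as its status,
so a real change would need a stricter convention for what counts as a
handler status.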
Thanks for your attention.
I understand that it is hard to make such global changes and that not many
people are interested in this. So please treat my letter as a request for
help with historical research ;-)
Let me repeat my question:
Can anyone explain why this code was added and which scenarios it is
meant for?
If anybody is interested, I propose to discuss this.
Thanks.
--
Regards,
Pavel mailto:[email protected]