Hello,
   I'm fairly new to using mod_perl. I've been able to find lots of 
resources dealing with mod_perl 1.x, but the documentation for 2.0 is 
rather sparse.

I'm pretty sure what I need to do can only be handled by Apache 2.0, and 
thus I'm forced to use mod_perl 2.0 (well, 1.99).

I'm trying to proxy ServerB through ServerA, which is simple enough with 
mod_proxy. However, links, embedded images, etc. in the proxied document 
end up broken if they are non-relative links (i.e. they start with a slash).

Example: on ServerB there is a document, say /sales/products.html, which 
links to /images/logo.gif. When I access /sales/products.html directly on 
ServerB, everything is fine. But suppose I want to proxy ServerB via 
ServerA, say with:
ProxyPass /EXTERNAL http://ServerB

If I go to http://ServerA/EXTERNAL/sales/products.html, the embedded image 
/images/logo.gif is requested from ServerA (without the /EXTERNAL prefix), 
so it never reaches ServerB.

So to handle this I wanted to write a filter for ServerA to parse all pages 
served via Location /EXTERNAL and "fix" the links.
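
For context, here is a sketch of how a filter like this could be attached 
on ServerA. It assumes mod_perl 2's PerlOutputFilterHandler directive and 
that the package shown further down is loadable as RewriteLinks; the 
Location block is a guess at a minimal setup, not a tested config.

```
ProxyPass /EXTERNAL http://ServerB

<Location /EXTERNAL>
    # Run the RewriteLinks output filter on everything proxied under /EXTERNAL
    PerlOutputFilterHandler RewriteLinks
</Location>
```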

I wrote a handler (see below) using HTML::Parser to extract the tags that 
would contain links and process them.

It works great for the most part. However, it seems that instead of 
ServerA getting the entire output from ServerB at once, it gets it in 
chunks, which are processed individually. This causes my handler to fail 
when a tag is split between two chunks.

What I think needs to be done is to build up the document in a variable 
($html .= $buffer;) and then call $p->parse($html) once the entire 
document has been received by ServerA (or maybe it's as simple as only 
calling $p->eof; at that point).
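
To illustrate the second idea outside of Apache, here is a minimal 
standalone sketch. It feeds one HTML::Parser object a document in two 
chunks that split a tag, and calls eof only after the last chunk. The 
helper name rewrite_chunks, the shortened attribute table, and the sample 
chunks are made up for illustration. It says nothing about how to keep the 
parser alive between filter invocations in mod_perl, which is the part I'm 
unsure about.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper (not part of my module): rewrite one document that
# arrives as a list of chunks, prefixing slash-rooted URLs with $prefix.
sub rewrite_chunks {
    my ($prefix, @chunks) = @_;
    my %url_attr = (a => 'href', img => 'src', link => 'href');
    my $out = '';

    my $p = HTML::Parser->new(
        api_version => 3,
        start_h     => [ sub {
            my ($tagname, $attr, $text) = @_;
            my $name = $url_attr{$tagname};
            if (defined $name && defined $attr->{$name}
                && $attr->{$name} =~ m{^/}) {
                # Rebuild the tag, prefixing only the URL attribute
                my $tag = "<$tagname";
                for my $key (sort keys %$attr) {
                    my $val = $attr->{$key};
                    $val = $prefix . $val if $key eq $name;
                    $tag .= qq{ $key="$val"};
                }
                $out .= "$tag>";
            } else {
                $out .= $text;    # pass other start tags through unchanged
            }
        }, 'tagname, attr, text' ],
        default_h   => [ sub { $out .= $_[0] }, 'text' ],
    );

    # The same parser sees every chunk; an incomplete tag at the end of
    # one chunk is held back until the next parse() call completes it.
    $p->parse($_) for @chunks;
    $p->eof;                      # only once, after the last chunk
    return $out;
}

# The <img> tag is deliberately split in the middle of its src attribute
my @chunks = ('<p><img src="/ima', 'ges/logo.gif"></p>');
print rewrite_chunks('/EXTERNAL', @chunks), "\n";
```

If this is right, then the failure I'm seeing would come from creating a 
new parser and calling eof for every chunk, not from the chunking itself.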

Or is there a better way to do this? One problem I've found so far is I 
need to fix style sheets, but I can probably write a special handler for 
them once I get this problem fixed.

Thanks!

######################################################
package RewriteLinks;

use strict;

use Apache::Filter;
use Apache::RequestUtil;
use APR::Table;
use HTML::Parser;

my %ReplaceAttrs = ( a     => 'href',
                      img   => 'src',
                      link  => 'href',
                      td    => 'background',
                      form  => 'action'
                    );
my $filter;

sub handler {
   $filter = shift;

   ### Create parser object ###
   my $p = HTML::Parser->new( api_version => 3 );
   $p->handler(start   => \&do_tags, 'tagname, attr, text' );
   $p->handler(default => \&default, 'text');

   while ($filter->read(my $buffer, 32678)) {
     $p->parse($buffer);
   }

   $p->eof;                 # signal end of document

   1;
}

sub do_tags {
   my ($tagname, $attr, $text) = @_;

   ## only need to modify tags with url-like attributes starting with a slash
   if ($$attr{$ReplaceAttrs{$tagname}} =~ m|^/|) {
     my $TAG = "<" . uc($tagname);
     foreach my $key (keys %$attr) {
       $TAG .= ' ' . uc($key) . '="';
       if ($key eq $ReplaceAttrs{$tagname}) {
         $TAG .= '/EXTERNAL';
       }
       $TAG .= $$attr{$key} . '"';
     }
     $TAG .= ">\n";
     $filter->print($TAG);
   } else {
     $filter->print($text);
   }

}

sub default {
   my ($text) = @_;
   $filter->print($text);
}

1;



