On Friday 15 July 2005 at 00:43, Peter Stevens wrote:
> 
> Q: When I request a page, it works with my browser, but when I use mech, 
> it doesn't. Why not?
> 

A more complicated setup involves running HTTP::Proxy and pointing both
WWW::Mechanize and your browser at a proxy that logs the interesting
bits of each transaction.
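
On the mech side, pointing at the proxy could look like the sketch below.
It assumes the logging proxy listens on localhost:8080 (HTTP::Proxy's
default host and port, as far as I know); the $proxy_url variable is only
there for illustration. The proxy() method is the one WWW::Mechanize
inherits from LWP::UserAgent.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Assumption: the logging proxy is listening on localhost:8080.
# Adjust the URL if you start it with other host/port options.
my $proxy_url = 'http://localhost:8080/';

my $mech = WWW::Mechanize->new;

# proxy() is inherited from LWP::UserAgent: route http requests
# through the logging proxy
$mech->proxy( 'http', $proxy_url );

print "HTTP requests will go through ", $mech->proxy('http'), "\n";

# With the proxy running, fetch the page that misbehaves and compare
# the logged headers with what your browser sent for the same URL:
# $mech->get('http://www.google.com/');
```

Configure your browser to use the same proxy address, request the same
page from both, and diff the two logs.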

An example of such a proxy is included in the HTTP::Proxy distribution
as eg/logger.pl. The version currently on CPAN is outdated, and the
attached one should replace it in the next version (it works with the
version of HTTP::Proxy currently on CPAN).

Here is an example of what it reports when you point your browser at
www.google.com when your IP is in France:

$ ./logger.pl peek 'google\.\w+$'

GET http://www.google.com/
302 Found
    Content-Type: text/html
    Set-Cookie: PREF=ID=50559ac18bae0f57:CR=1:TM=1119132978:LM=1119132978:S=Px8CAVLCC5FoR1NK; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
    Location: http://www.google.fr/cxfer?c=PREF%3D:TM%3D1119132978:S%3Dwpjw70CuTrboKsrd&prev=/

GET http://www.google.fr/cxfer?c=PREF%3D:TM%3D1119132978:S%3Dwpjw70CuTrboKsrd&prev=/
302 Found
    Content-Type: text/html
    Set-Cookie: PREF=ID=e2b4582bd0c2849e:LD=fr:TM=1119132978:LM=1119132978:S=keTI_KO9ZyhHypD3; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.fr
    Location: http://www.google.fr/

GET http://www.google.fr/
    Cookie: PREF=ID=e2b4582bd0c2849e:LD=fr:TM=1119132978:LM=1119132978:S=keTI_KO9ZyhHypD3
200 OK
    Content-Type: text/html

This shows how Google tries to give you the exact same cookie on all
its local sites.

You can ask to see more headers than the predefined ones by passing
"header" followed by a header name on the command line (repeat the pair
for each extra header). The proxy only reports requests for data with a
text/* Content-Type.
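
To make the command-line convention concrete, here is a small standalone
sketch of how "peek" and "header" pairs can be separated from the options
meant for HTTP::Proxy->new. The split_args() helper is hypothetical (it
is not part of logger.pl or HTTP::Proxy), and the sample arguments are
made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: walk the arguments two at a time, keep the
# "peek" and "header" pairs for the logger, and return everything else
# untouched so it can be handed to HTTP::Proxy->new.
sub split_args {
    my @argv  = @_;
    my %args  = ( peek => [], header => [] );
    my $known = join '|', keys %args;
    my @rest;
    while (@argv) {
        my ( $key, $value ) = splice @argv, 0, 2;
        if ( $key =~ /^($known)$/ ) {
            push @{ $args{$1} }, $value;    # logger option
        }
        else {
            push @rest, $key, $value;       # proxy constructor option
        }
    }
    return ( \%args, \@rest );
}

# Same shape as:
#   ./logger.pl peek 'google\.\w+$' port 8080 header User-Agent
my ( $args, $rest ) = split_args(
    'peek', 'google\.\w+$', 'port', '8080', 'header', 'User-Agent',
);
print "peek:   @{ $args->{peek} }\n";
print "header: @{ $args->{header} }\n";
print "rest:   @$rest\n";
```

Both "peek" and "header" can be given several times; each occurrence adds
one more host pattern or header name to the corresponding list.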

-- 
 Philippe "BooK" Bruhat

 A wish is only as good as the wisher and what he can achieve.
                                    (Moral from Groo The Wanderer #35 (Epic))
#!/usr/bin/perl -w
use strict;
use HTTP::Proxy;
use HTTP::Proxy::HeaderFilter::simple;
use HTTP::Proxy::BodyFilter::simple;
use CGI::Util qw( unescape );

# get the command-line parameters
my %args = (
   peek    => [],
   header  => [],
);
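{
    # option names are matched exactly; everything else is left in @ARGV
    # as key/value pairs for HTTP::Proxy->new
    my $args = '^(' . join( '|', keys %args ) . ')$';
    my $i = 0;
    while ( $i < @ARGV ) {
        if ( $ARGV[$i] =~ /$args/ ) {
            push @{ $args{$1} }, $ARGV[ $i + 1 ];
            splice( @ARGV, $i, 2 );
        }
        else {
            $i += 2;
        }
    }
}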

# the headers we want to see
my @srv_hdr = (
    qw( Content-Type Set-Cookie Set-Cookie2 WWW-Authenticate Location ),
    @{ $args{header} }
);
my @clt_hdr =
  ( qw( Cookie Cookie2 Referer Referrer Authorization ), @{ $args{header} } );

# NOTE: Body request filters always receive the request body in one pass
my $post_filter = HTTP::Proxy::BodyFilter::simple->new(
    sub {
        my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
        print STDOUT "\n", $message->method, " ", $message->uri, "\n";
        print_headers( $message, @clt_hdr );

        # this is from CGI.pm, method parse_params()
        my (@pairs) = split( /[&;]/, $$dataref );
        for (@pairs) {
            my ( $param, $value ) = split( '=', $_, 2 );
            $param = unescape($param);
            # a parameter sent without "=" has no value
            $value = defined $value ? unescape($value) : '';
            printf STDOUT "    %-20s => %s\n", $param, $value;
        }
    }
);

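# log every response; the request line is printed here too, except for
# POST requests, which $post_filter has already reported
my $get_filter = HTTP::Proxy::HeaderFilter::simple->new(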
    sub {
        my ( $self, $headers, $message ) = @_;
        my $req = $message->request;
        if ( $req->method ne 'POST' ) {
            print STDOUT "\n", $req->method, " ", $req->uri, "\n";
            print_headers( $req, @clt_hdr );
        }
        print STDOUT $message->status_line, "\n";
        print_headers( $message, @srv_hdr );
    }
);

sub print_headers {
    my $message = shift;
    for my $h (@_) {
        if ( $message->header($h) ) {
            print STDOUT "    $h: $_\n" for ( $message->header($h) );
        }
    }
}

# create and start the proxy
my $proxy = HTTP::Proxy->new(@ARGV);

# if we want to look at SOME sites
if (@{$args{peek}}) {
    for (@{$args{peek}}) {
        $proxy->push_filter(
            host    => $_,
            method  => 'POST',
            request => $post_filter
        );
        $proxy->push_filter( host => $_, response => $get_filter );
    }
}
# otherwise, peek at all sites
else {
    $proxy->push_filter(
        method  => 'POST',
        request => $post_filter
    );
    $proxy->push_filter( response => $get_filter );
}

$proxy->start;
