I was very disappointed when Jonathan Helfman's research disappeared
from the Web --- it was there a week or two ago. I tried to browse it
through the Wayback Machine, but the Wayback Machine doesn't rewrite
links. (It seems to try, but it relies on browser-side JavaScript to
rewrite links and inline image URLs. I was using a browser that
didn't support JavaScript.)

So I wrote this (single-threaded, slow, stupid, fragile,
HTTP/1.0-only, but interesting) non-caching HTTP proxy. It lets you
browse the web through Google's cache or the Wayback Machine by
rewriting your requests on the fly.

#!/usr/bin/perl -w
# Arguments: [port] [google | anything else for the Wayback Machine]
#            [Wayback timestamp]
use strict;
use IO::Socket;

my $port = $ARGV[0] || 8080;
$ARGV[1] ||= 'google';
my $config;
if ($ARGV[1] eq 'google') {
    $config = { PeerAddr => 'www.google.com:80',
                nosub => quotemeta('http://www.google.com'),
                replacement => '/search?q=cache:'};
} else {
    # Any other second argument selects the Wayback Machine; the optional
    # third argument is a YYYYMMDDhhmmss timestamp.
    my $date = $ARGV[2] || '19970205193002';
    $config = { PeerAddr => 'web.archive.org:80',
                nosub => quotemeta('http://web.archive.org'),
                replacement => "/web/$date/"};
}
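my $server = IO::Socket::INET->new(LocalPort => $port, Listen => 42,
                                   Reuse => 1)
    or die "Can't listen on port $port: $@\n";
$| = 1;
# Note a broken pipe instead of dying when one side hangs up mid-write.
my $brokenpipe = 0;
sub brokenpipe {
    $brokenpipe = 1;
}
$SIG{PIPE} = \&brokenpipe;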
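for (;;) {
    $brokenpipe = 0;
    my $socket = $server->accept() or next;
    my $outsocket = IO::Socket::INET->new(PeerAddr => $config->{PeerAddr});
    if (not $outsocket) {
        print "Can't connect to $config->{PeerAddr}: $@\n";
        close $socket;
        next;
    }
    my $reqline = <$socket>;
    if (not defined $reqline) {
        # The client hung up without sending a request.
        close $outsocket;
        close $socket;
        next;
    }
    # Point absolute URLs at the cache, unless they already name the
    # cache host itself.
    $reqline =~ s| http://| $config->{replacement}|
        unless $reqline =~ m| $config->{nosub}|;
    # Downgrade to HTTP/1.0 without eating the trailing \r.
    $reqline =~ s|HTTP/1\.\d+|HTTP/1.0|;
    print $reqline;
    print $outsocket $reqline;
    # Forward the request headers, up to and including the blank line.
    while (not $brokenpipe and defined($_ = <$socket>)) {
        $_ = "Host: $config->{PeerAddr}\r\n" if /^Host:/i;
        #print;
        print $outsocket $_;
        last if /^\r?$/;
    }
    print "-- now reading from outsocket\n";
    # Copy the response back until the upstream server closes.
    my $gotstuff = 0;
    while (not $brokenpipe and defined($_ = <$outsocket>)) {
        $gotstuff++;
        print ".";
        print $socket $_;
    }
    print "\n";
    if (not $gotstuff) {
        print "No data; brokenpipe is $brokenpipe and errno is $!\n";
    }
    close $outsocket;
    close $socket;
}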
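
The request-line rewriting is the only clever part, and it can be tried
on its own without any sockets. Here is a minimal sketch: the helper name
rewrite_request_line and the example.com URL are made up for
illustration and are not part of the script above, and the version
pattern is written as HTTP/1\.\d+ so that the trailing CRLF survives.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper (not in the script above): applies the same kind of
# substitutions the proxy loop applies to the first line of a request.
sub rewrite_request_line {
    my ($reqline, $nosub, $replacement) = @_;
    # Turn the absolute URL into a path on the cache host, unless the
    # request is already aimed at the cache host itself.
    $reqline =~ s| http://| $replacement|
        unless $reqline =~ m| $nosub|;
    # Downgrade the protocol version, leaving the line ending alone.
    $reqline =~ s|HTTP/1\.\d+|HTTP/1.0|;
    return $reqline;
}

my $req = "GET http://www.example.com/index.html HTTP/1.1\r\n";

# Google-cache mode.
print rewrite_request_line($req, quotemeta('http://www.google.com'),
                           '/search?q=cache:');
# Wayback Machine mode, with the script's default timestamp.
print rewrite_request_line($req, quotemeta('http://web.archive.org'),
                           '/web/19970205193002/');
```

In Google mode the request line comes out as
"GET /search?q=cache:www.example.com/index.html HTTP/1.0", and in
Wayback mode as
"GET /web/19970205193002/www.example.com/index.html HTTP/1.0".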
--
Kragen Sitaker <http://www.pobox.com/~kragen/>
I don't do .INI, .BAT, .DLL or .SYS files. I don't assign apps to files. I
don't configure peripherals or networks before using them. I have a computer
to do all that. I have a Macintosh, not a hobby. -- Fritz Anderson