Re: files download and performance

Tatiana Lloret Iglesias Mon, 22 Jan 2007 07:18:15 -0800

Does it work for this kind of URL?
http://patft.uspto.gov/netacgi/nph-Parser?
Thanks!
T.



On 1/22/07, Igor Sutton <[EMAIL PROTECTED]> wrote:


Hi Tatiana,

2007/1/22, Tatiana Lloret Iglesias <[EMAIL PROTECTED]>:
> I've realized that for each link, I spend most of the time in the following
> Perl script:
>
> foreach my $url (@lines){ -- I READ MY 1-ROW URL FILE
>         $contador=2;
>         $test=0;
>         while(!$test){
>
>             $browser2->get($url);
>             $content = $browser2->content();
>
> -- IN THESE 2 STEPS I SPEND 6 SECONDS for an 86 KB HTML page. Is that OK? Can I
> perform these 2 steps faster?
>

Are you using the domain name or the IP address in the link (e.g.
http://www.google.com/ or http://1.2.3.4)? If you are using the domain
name, Perl first has to resolve it through your DNS server or a local
cache, and only then can it connect and retrieve the contents you want.
If you are not using a DNS cache, you can build one using Net::DNS for
the lookups and Memoize for caching the results.

Check the example:

<code>
#!/usr/bin/env perl

use strict;
use warnings;

use Benchmark::Timer;
use Carp;
use Memoize;
use Net::DNS;

# used by get_ip_from_hostname
my $resolver = Net::DNS::Resolver->new;

# Return the first IPv4 address (A record) found for $hostname.
sub get_ip_from_hostname {
   my ($hostname) = @_;
   my $query = $resolver->search($hostname);
   if ($query) {
       foreach my $rr ( $query->answer ) {
           next unless $rr->type eq 'A';
           return $rr->address;
       }
   }
   else {
       croak "Query failed: ", $resolver->errorstring;
   }
   return;    # the query succeeded but contained no A record
}

my $t = Benchmark::Timer->new();

for ( 1 .. 1000 ) {
   $t->start('get_ip_from_hostname without memoize');
   my $ip = get_ip_from_hostname("www.google.com");
   $t->stop('get_ip_from_hostname without memoize');
}
print $t->report();

$t->reset();

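# From here on, results are cached per hostname, so only the first call
# for a given name actually queries the DNS server.
memoize('get_ip_from_hostname');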
for ( 1 .. 1000 ) {
   $t->start('get_ip_from_hostname memoize');
   my $ip = get_ip_from_hostname("www.google.com");
   $t->stop('get_ip_from_hostname memoize');
}

print $t->report();

</code>
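
To connect this back to your loop: once you have the IP, you can swap it into the URL before calling get(). Below is a minimal sketch of that step only. The helper name url_with_ip and the address 1.2.3.4 are made up for illustration, and the regex handles only plain http(s)://host... URLs (no user:password@ part); for anything more complex the URI module is probably the safer choice.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): replace the
# hostname in an http/https URL with an already-resolved IP address.
# Returns the rewritten URL and the original hostname, or an empty
# list if the URL does not look like http(s)://host...
sub url_with_ip {
    my ( $url, $ip ) = @_;
    my ( $scheme, $host, $rest ) =
        $url =~ m{\A(https?://)([^/:?#]+)(.*)\z}is
        or return;
    return ( "$scheme$ip$rest", $host );
}

# 1.2.3.4 is a placeholder; in practice it would come from
# get_ip_from_hostname($host).
my ( $ip_url, $host ) =
    url_with_ip( 'http://www.google.com/search?q=perl', '1.2.3.4' );
print "$ip_url (Host: $host)\n";
```

Keep the original hostname around: many servers host several sites on one address and will need it in the Host header of the request made to the IP URL, otherwise you may get the wrong page back.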

HTH!

--
Igor Sutton Lopes <[EMAIL PROTECTED]>
