RE: How to extract the full URL from a relative link using WWW::Mechanize

Rajesh Dorairajan Mon, 29 Dec 2003 18:42:59 -0800

Rob,

Sorry for the ambiguity in my email. Your solution worked right away.


To answer your questions: @links is an array that holds the links in the
page as WWW::Mechanize::Link objects, which is why each element has a url
method. The extra argument that I pass to get is a bit of a hack, but it is
one that the LWP::UserAgent get method actually supports. Since
WWW::Mechanize inherits get from LWP::UserAgent, the ":content_file" option
lets me follow a link and save the content to a file, much like
right-clicking on a link and choosing "Save Target As..." in IE.
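
For anyone reading this in the archive, here is a minimal sketch of the resolution step this thread is about. A link's url method returns the href as it is written in the page, so a relative link has to be resolved against the page's base URL. The absolutize helper is hypothetical (it is not part of WWW::Mechanize), and the sketch assumes the URI distribution that comes with LWP is installed. WWW::Mechanize::Link also documents a url_abs method that should return the absolute form directly.

```perl
use strict;
use warnings;

use URI;    # assumed to be installed; LWP and WWW::Mechanize depend on it

# Hypothetical helper (not part of WWW::Mechanize): resolve $href against
# $base and return the absolute URL as a plain string.
sub absolutize {
    my ($href, $base) = @_;
    return URI->new_abs($href, $base)->as_string;
}

# In the loop from the quoted script, $base would be $obj->base and $href
# would be $urls[2]->url. Fixed strings are used here so the sketch runs
# without touching the network.
my $base = 'http://www.domain.com/dir1';
for my $href ('/dir1/dir12/file1.html', 'http://search.cpan.org/') {
    print absolutize($href, $base), "\n";
}
```

An href that is already absolute passes through unchanged, so the same call can be applied to every link without checking first.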

Thanks a lot for your help.

Rajesh

> -----Original Message-----
> From: Rob Dixon [mailto:[EMAIL PROTECTED]
> Sent: Thursday, December 25, 2003 4:59 AM
> To: [EMAIL PROTECTED]
> Subject: Re: How to extract the full URL from a relative link using WWW::Mechanize
> 
> 
> Rajesh Dorairajan wrote:
> >
> > I am using WWW::Mechanize to create a configuration file from a website and
> > my script needs to go through a web page and copy specific links into a
> > configuration file. I am using the $m->links method, which returns the list
> > of links from a page. However, I am not able to get the fully qualified URL
> > such as http://www.domain.com/dir1/dir12/file1.html. Instead I get
> > /dir1/dir12/file.html. Please forgive me if this doesn't make sense. I am
> > giving the code below for more clarity:
> >
> > my $obj = WWW::Mechanize->new();
> > my $url = "http://www.domain.com/dir1";
> > .
> > .
> >
> > for my $link ( @links ) {
> >     my $url = $link->url;
> >     $obj->get ( $url );
> >     my @urls = $obj->links;
> >
> >
> >     $obj->get( $urls[1]->url, ":content_file" => "$dir/tmp/tmpfile.cert" );
> >     .
> >     .
> >
> >     print CRL "[INPUT_SECTION_$count]\n";
> >     # This is where I need the fully qualified URL instead of the
> >     # relative URL that I get currently
> >     print CRL "LOCATION=".$urls[2]->url."\n\n";
> >     $count++;
> > }
> >
> > I looked up a lot on the web before sending this mail. I am not able to find
> > any documentation that points to this. Any help will be deeply appreciated.
> 
> Hi Rajesh.
> 
> I'm puzzled by your code. What's in your @links array that has a 'url' method?
> And I don't know of a WWW::Mechanize::get method that takes two parameters.
> I have the latest WWW::Mechanize from CPAN. Are you doing something
> inscrutably clever?
> 
> Anyway, the answer is to use the URI::URL module (already 'require'd by
> WWW::Mechanize) to build the absolute URL from the href= field of the anchors
> like this:
> 
>   use strict;
>   use warnings;
> 
>   use WWW::Mechanize;
> 
>   my $mech = new WWW::Mechanize;
> 
>   $mech->get('http://search.cpan.org/');
> 
>   foreach (@{$mech->links}) {
>     print URI::URL->new_abs($_->[0], $mech->base), "\n";
>   }
> 
> I hope this helps.
> 
> Happy Christmas!
> 
> Rob
