Hi Bret,

Thanks for the swift reply.
I was aware of the deficiency of ActiveState's PPM, having had the same problem of not finding Crypt-SSLeay. However, I know that's not the solution in itself, since I had tried my scripts on a Linux box which had this installed. I also turned the cookie jar on (not knowing what it was doing particularly) and that did not get me any further.

The advice you give about proxy servers looks very interesting though, so I shall download SSLeay (I am now writing these scripts on Windows, mainly because I find using IE the easiest way to develop web scraping scripts alongside the Perl script, rather than because of any advantage of Perl on Windows) and try that.

Regards
Colin

-----Original Message-----
From: Bret Swedeen [mailto:[EMAIL PROTECTED]
Sent: 15 July 2004 13:09
To: [EMAIL PROTECTED]; Colin Magee
Subject: Re: Viewing exchange between browser and website

Hi Colin,

I recently attempted a similar task. I'll try to outline as clearly as possible what worked.

Using some of the examples from the "Perl & LWP" book was a disaster. None of it worked. One important point I did pick up, however, was to make sure you have cookies enabled:

    use HTTP::Cookies;
    $agent->cookie_jar(HTTP::Cookies->new());

Once I added those lines things started to look promising. Unfortunately, I was still having problems. Primary reason: I needed an extra Perl Mod to make communication across https possible. I'm using Perl for Win32, so I needed to install the Perl Mod Crypt::SSLeay. The problem was that doing so from the ppm prompt (part of the ActiveState Perl installation... makes mod installation very easy) wasn't working. For whatever reason I couldn't find Crypt::SSLeay for Perl on Win32. Finally, after searching the forums on ActiveState, I found the mod and installed it from the ppm prompt with the following command:

    install http://theoryx5.uwinnipeg.ca/ppms/Crypt-SSLeay.ppd

Take the defaults through the entire installation (there are a couple of DLLs that it will ask you about as well. Just answer yes).
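
A quick way to confirm that an install like this worked is to ask Perl whether it can load the modules. What follows is only a minimal sketch: the helper name module_available is made up for illustration and is not part of LWP or Crypt::SSLeay, and the list of modules checked is an assumption based on the ones mentioned in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): returns 1 if the named
# module can be loaded by this Perl installation, 0 otherwise.
sub module_available {
    my ($module) = @_;
    # Turn Foo::Bar into Foo/Bar.pm so require() treats it as a file path.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Assumed list: the modules mentioned in this thread.
for my $module (qw(LWP::UserAgent HTTP::Cookies Crypt::SSLeay)) {
    printf "%-16s %s\n", $module,
        module_available($module) ? "found" : "MISSING";
}
```

If Crypt::SSLeay shows up as MISSING, https requests are likely to keep failing until an SSL module is installed, whatever else the script does.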

Ok, now I'm getting real close, but still not working. I posted on the Usenet forum for Perl Mods and got two extremely helpful tips.

First, install a local proxy of sorts to capture and view the back and forth communication between browser and web site, which I think is what you are looking for now. Proxomitron was what I used. I turned it on and went through the web interaction steps with a standard browser. While this tool didn't really resolve my problems, it did help me understand more of what was going on between the browser and the site.

Second, and the most helpful of all, install the Perl Mod WWW::Mechanize. This mod allows you to easily automate the steps of interacting with a site, from simply following links to logging on and communicating over https. This mod was what finally worked for me. There was a problem with pressing certain buttons on the page. It seems it doesn't really know what to do with Javascript buttons, but I worked around that by simply making a URL with all of the form variables set and passing it in to get what I wanted. This may not be a problem for you, but keep in mind that it really doesn't work with all form buttons exactly as you might think.

Another very useful thing during script development is to turn on the LWP debugging. With this turned on you get to see all of the communication details between your script and the site. It really helps with troubleshooting, as you can see exactly where things are falling apart. Add this line near the top of your script, after the use LWP statement:

    use LWP::Debug qw(+);

Anyway, my experience was somewhat frustrating, but little by little I did make progress and finally resolved my problem. Here is a quick glimpse at what I put together. Please keep in mind I had to remove some detail, as it is company specific and I cannot disclose it here. Also, at the end I dump the page content that I get back after I send $bigurl into a file with an html extension.
I would then open this file in a browser to see if I got what I wanted. The final version removes some of this code and actually acts upon the page returned. I believe, however, this example should help get you closer to what you want. Of course, as I found, no one example addresses your problem exactly the way you need. Keep working on it. You'll get there in the end.

    use strict;
    use warnings;
    use LWP;
    use LWP::UserAgent;
    use LWP::Protocol::https;
    use LWP::Debug qw(+);
    use WWW::Mechanize;
    use HTTP::Cookies;

    my $agent = WWW::Mechanize->new();
    my $intranetsite = "http://some company intranet site/index.html";
    my $bigurl = "https://big url here with form variables and their values";

    # check for username and password on the command line
    # (declared outside the if/else so they are still visible below)
    my ($un, $pw);
    if (@ARGV == 2) {
        ($un, $pw) = @ARGV;
    } else {
        print "Please enter your username: ";
        chomp($un = <STDIN>);
        print "Please enter your password: ";
        chomp($pw = <STDIN>);
    }

    $agent->cookie_jar(HTTP::Cookies->new());
    $agent->agent_alias('Windows IE 6');

    # Navigate the intranet web site
    $agent->get($intranetsite);
    $agent->follow_link(text => "Sign In");   # a link on the page
    $agent->form_name('login');               # name of the form on the sign in page
    $agent->field(username => $un);
    $agent->field(password => $pw);
    $agent->click();                          # simulate clicking the button on the login page
    $agent->follow_link(text => "Internal Application Link");   # a link on the new page
    $agent->follow_link(text => "Application Charts");          # a link on the next new page
    $agent->get($bigurl);                     # finally, send the URL with form variables and values

    # dump page content into a file for viewing in a browser
    open(my $logfile, '>', 'output.html') or die "Cannot open output.html: $!";
    print $logfile $agent->content();
    close($logfile);

    __END__

On 15 Jul 2004 at 12:14, Colin Magee wrote:

> Hi,
>
> I've been trying to use LWP to programmatically log in to a favourite
> password protected website.
>
> Problem is that I've worked through all the standard examples on LWP
> and I'm not getting through - the login mechanism doesn't conform to
> the examples, so I was wondering if there is any way I can see exactly
> what my browser is sending and receiving (while I'm using the browser)
> and therefore what I have to replicate in the code. As you can
> probably tell I'm fairly novicey so I need to see some output where it
> will be fairly clear what I have to code in Perl. I seem to recall
> some thread on this forum about using Mechanize in this way. Is that
> correct? If so is there an example script that shows how to record
> this?
>
> Thanks
> Colin
>