Re: Mechanize - redirect problem

2005-02-23 Thread Peter Stevens
Hi Martin, I have written scrapers for a number of different sites, all of which require sign on, and have not had any problems with redirection not being performed. Hmm. Haven't done anything with sunrise yet (although I would like to automatically download my faxes from my sunrise onebox).

Help help! Writer trying to program!

2005-02-23 Thread Andrew Johnson
Hello there, I wrote a script to scrape businessweek's search results. It worked fine, but now I am trying to authenticate my agent to businessweek first, before I do my search, so that my search results don't point at register pages, and so I can access the results and parse them. I

Re: Help help! Writer trying to program!

2005-02-23 Thread Andy Lester
On Wed, Feb 23, 2005 at 11:41:58AM -0500, Andrew Johnson ([EMAIL PROTECTED]) wrote: code is ghetto, but that's because I did not understand the better Perl HTML parsing modules. Go take a look at WWW::Mechanize first. Much of your parsing for links is handled for you. Also, make sure you

Re: Mechanize - redirect problem

2005-02-23 Thread Martin Kos
hi peter I have written scrapers for a number of different sites, all of which require sign on, and have not had any problems with redirection not being performed. Hmm. Haven't done anything with sunrise yet (although I would like to automatically download my faxes from my sunrise onebox). i

Re: Mechanize - redirect problem

2005-02-23 Thread Andy Lester
On Wed, Feb 23, 2005 at 06:51:09PM +0100, Martin Kos ([EMAIL PROTECTED]) wrote: One of the best tips I've gotten from this list is to put use LWP::Debug qw(+); into your code. This turns on a trace so you can Can one of you guys please write up a paragraph on that LWP::Debug trick so that I

Re: Return HTML (not text) between tags with HTML::Parser?

2005-02-23 Thread Daniel Leonard
Hi. I had to parse a rather large number of web pages and I needed to do exactly the same thing. What I did was to set: $/ = tag; This sets the end of line to whatever is between the quotes. Thus when you read in a line you will move from, in your case, tag A to A. Just be sure that you
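
The $/ approach described above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: the sample string and the <b> and </b> markers are hypothetical stand-ins, since the tag names in the original message did not survive in the archive. It reads from an in-memory filehandle so it runs without any input file.

```perl
use strict;
use warnings;

# Hypothetical sample input; the <b> and </b> markers are assumptions
# chosen for illustration, not taken from the original message.
my $html = "intro <b>first</b> middle <b>second</b> tail";

# Read from an in-memory filehandle so the sketch needs no files.
open my $fh, '<', \$html or die "cannot open in-memory handle: $!";

my @chunks;
{
    # Localize the change so other reads in the program are unaffected.
    local $/ = '</b>';
    while ( my $record = <$fh> ) {
        # chomp strips the current value of $/, here a trailing '</b>'.
        chomp $record;
        # Keep only the text after the opening marker, if present.
        push @chunks, $1 if $record =~ /<b>(.*)\z/s;
    }
}
close $fh;

print "$_\n" for @chunks;
```

Using local inside a bare block keeps the changed record separator from leaking into the rest of the program. Note that this treats the markup as plain text, so nested or oddly formatted tags would not be handled the way a real HTML parser would handle them.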

Re: Mechanize - Scraping Tips

2005-02-23 Thread Peter Stevens
Hi Andy, Here are some of the problems that I have had which were real brain-teasers along with some thoughts on how to solve them, including the LWP::Debug trick. Feel free to republish, but please cite the source. Cheers, Peter Q: How do I figure out why $mech->get($url) doesn't work, times

Re: Mechanize - Scraping Tips

2005-02-23 Thread Martin Kos
hi andy i have had a problem with javascript today... the javascript added some hidden fields that were not in the html and i have had to find a way to add them with mechanize. i have the following code in the ML archive and it just worked fine! so you could add this example