Hi Martin,
I have written scrapers for a number of different sites, all of which
require sign on, and have not had any problems with redirection not
being performed. Hmm. Haven't done anything with sunrise yet (although I
would like to automatically download my faxes from my sunrise onebox).
Hello there,
I wrote a script to scrape businessweek's search results. It
worked fine, but now I am trying to authenticate my agent to businessweek
first, before I do my search, so that my search results don't point at
register pages, and so I can access the results and parse them. I
On Wed, Feb 23, 2005 at 11:41:58AM -0500, Andrew Johnson ([EMAIL PROTECTED])
wrote:
code is ghetto, but that's because I did not understand the better Perl HTML
parsing modules.
Go take a look at WWW::Mechanize first. Much of your parsing for links
is handled for you.
Also, make sure you
hi peter
I have written scrapers for a number of different sites, all of which
require sign on, and have not had any problems with redirection not
being performed. Hmm. Haven't done anything with sunrise yet (although I
would like to automatically download my faxes from my sunrise onebox).
i
On Wed, Feb 23, 2005 at 06:51:09PM +0100, Martin Kos ([EMAIL PROTECTED]) wrote:
One of the best tips I've gotten from this list is to put use
LWP::Debug qw(+); into your code. This turns on a trace so you can
Can one of you guys please write up a paragraph on that LWP::Debug trick
so that I
Hi. I had to parse a rather large number of web pages
and I needed to do exactly the same thing. What I did
was to set:
$/ = "tag";
This sets the end of line to whatever is between the
quotes. Thus when you read in a line you will move
from, in your case, tag A to A. Just be sure that
you
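
For what it's worth, here is a minimal sketch of the $/ trick described above. The sample HTML string and the choice of '</a>' as the separator are assumptions made up for illustration, not taken from the original post; a real script would read from a file handle on a saved page instead of an in-memory string.

```perl
use strict;
use warnings;

# Hypothetical sample input (an assumption for this sketch).
my $html = '<a href="/one">One</a><a href="/two">Two</a>';

# Open a read handle on the in-memory string (Perl 5.8 or later).
open my $fh, '<', \$html or die "Cannot open in-memory handle: $!";

my @records;
{
    # Localize the change so the rest of the program still reads
    # newline-terminated lines.
    local $/ = '</a>';
    while (my $record = <$fh>) {
        chomp $record;    # chomp strips the current value of $/
        push @records, $record;
    }
}
close $fh;

print "$_\n" for @records;
```

Each "line" read this way ends at the chosen closing tag, so one read returns one chunk of markup. Localizing $/ inside a block is a design choice that keeps the change from leaking into other reads.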
Hi Andy,
Here are some of the problems that I have had which were real
brain-teasers along with some thoughts on how to solve them, including
the LWP::Debug trick. Feel free to republish, but please cite the source.
Cheers,
Peter
Q: How do I figure out why $mech->get($url) doesn't work, times
hi andy
i have had a problem with javascript today... the javascript added some
hidden fields that were not in the html and i have had to find a way to
add them with mechanize. i have the following code in the ML archive and
it just worked fine! so you could add this example