On 21 Apr 2008, at 02:43, Aristotle Pagaltzis wrote:
> * Kieren Diment <[EMAIL PROTECTED]> [2008-04-20 15:45]:
> > perl -e 'for $i (1 .. 48) { system "lynx -dump -nolist " .
> >   "http://www.jrock.us/fp2008/catalyst/slide$i.html" }' > jrock.txt
>
> I would have proposed that, but any multipart slides are called
> things like slide17a.html, slide17b.html etc so your script won’t
> work.
It's a fair cop.
#!/usr/bin/perl
use warnings;
use strict;
use WWW::Mechanize;
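# autocheck off: follow_link then returns undef (rather than dying) on the last slide
my $m = WWW::Mechanize->new( autocheck => 0 );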
$m->get("http://www.jrock.us/fp2008/catalyst/slide1.html");
print $m->content;
while ($m->follow_link(text_regex => qr/Next/)) {
    print $m->content;
}
Then pipe that into lynx -stdin -nolist -width 9999, which is evil but
works well enough.
> (And anyway, for that sort of thing you should be using curl:
>
>     curl -O 'http://www.jrock.us/fp2008/catalyst/slide[1-48].html'
>
> except of course that this won’t work for the same reason.)
I'd never read the curl manpage before. Thanks, that might be handy for
me. With LWP and WWW::Mechanize on the shelf next to it, I generally
reach for them first.
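
For what it's worth, here is a rough sketch of why a purely numeric range such as slide[1-48].html falls short. It only prints the URLs that such a range would cover and fetches nothing. The helper name expand_range is made up for illustration, and the base URL is the one from the thread.

```shell
#!/bin/sh
# Hypothetical helper: list the URLs that a numeric range like
# slide[1-48].html would cover. Nothing is downloaded.
base='http://www.jrock.us/fp2008/catalyst'

expand_range() {
    # $1 = first slide number, $2 = last slide number
    i=$1
    while [ "$i" -le "$2" ]; do
        printf '%s/slide%s.html\n' "$base" "$i"
        i=$((i + 1))
    done
}

# Print the 48 numeric slide URLs, one per line.
expand_range 1 48
```

Lettered pages such as slide17a.html never show up in that list, which is why following the "Next" link is the more robust approach here.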