I was browsing perl.org and found http://www.perl.org/books/beginning-perl/, which has a bunch of HTML links to PDFs of the book Beginning Perl. What I wanted to do was parse the HTML for *.pdf links and then use File::Fetch to get the PDFs. There might have been a module for this, but I wasn't sure how it would handle http links inside multi-line comments, so I thought I would just write something myself.
Well, the first night I was up until 2:30 AM trying to come up with a regexp to parse the PDF links, which didn't go too well. I also noticed that some of the HTML links were commented out, and I didn't want those in my results, just the LIVE links. So I had to work out a way to find single-line and multi-line comments and discard them, while making sure I didn't miss any links in the prematch of (<!--) or the postmatch of (-->). After two nights I think I have all the bases covered, though currently my program prints the HTML along with the results. I haven't parsed out the http links yet; that is next, then fetching.

http://djgoku.dyndns.org/dj_goku/get_http.pl
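For anyone following along, here is a rough sketch of the comment-stripping idea described above. It is not the code from get_http.pl, just one way it could be done: the sub name live_pdf_links, the sample HTML, and the example.com URLs are all made up for illustration. It assumes comments are well formed and that links are written as quoted href attributes; a real HTML parser would be more robust than a regexp.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from get_http.pl): return the live,
# non-commented PDF links found in a string of HTML.
sub live_pdf_links {
    my ($html) = @_;

    # Remove every <!-- ... --> comment. The /s modifier lets "." match
    # newlines, so multi-line comments go as well as single-line ones.
    # The non-greedy .*? stops at the first -->, so live text before the
    # <!-- and after the --> on the same line is kept.
    $html =~ s/<!--.*?-->//gs;

    # In list context, /g returns every captured href value ending in .pdf.
    my @links = $html =~ /href\s*=\s*["']([^"']+\.pdf)["']/gi;
    return @links;
}

# Made-up sample input with one single-line and one multi-line comment.
my $sample = <<'HTML';
<a href="http://example.com/ch01.pdf">Chapter 1</a>
<!-- <a href="http://example.com/old.pdf">Old</a> -->
<!--
<a href="http://example.com/draft.pdf">Draft</a>
-->
<a href="http://example.com/ch02.pdf">Chapter 2</a>
HTML

print "$_\n" for live_pdf_links($sample);
```

On the sample this prints only the ch01.pdf and ch02.pdf links. For the fetching step, each link could then be handed to File::Fetch, along the lines of File::Fetch->new(uri => $link)->fetch(to => '.'), with a check that new() actually returned an object.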
