Hey list, I'm using Perl 5.8 and the most recent WWW::Mechanize release.
I wrote a spider for an internal site that walks through all the HTML links and examines each page for various pieces of information. The problem I'm running into is links to large PDF or PPT files, which clog up the works. I'd like to fetch just the headers so I can check whether a page is HTML (via $mech->is_html()) and skip it if it isn't, but $mech->get($url) seems to download the whole file before I can look at the headers.

The Mech docs say get() is overloaded from the LWP::UserAgent method, and the LWP::UserAgent docs mention a size limiter. How can I limit the size of the download? Losing the body doesn't matter as long as I can grab the headers. I was also considering creating a plain LWP::UserAgent object and issuing an HTTP::Request for just the headers.

Anyone have any better ideas? Thanks.

Henrik

--
Henrik Hudson
[EMAIL PROTECTED]

RTFM: Not just an acronym, it's the LAW!
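[A sketch of the HEAD-first approach described above. Since WWW::Mechanize subclasses LWP::UserAgent, $mech->head() is available and transfers only the response headers; the URL and the looks_like_html() helper are made up for illustration.]

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Pure helper: decide from a Content-Type header value whether
# a full get() is worthwhile. Split out so it is easy to test.
sub looks_like_html {
    my ($content_type) = @_;
    return ( defined $content_type && $content_type =~ m{^text/html\b} ) ? 1 : 0;
}

my $mech = WWW::Mechanize->new( autocheck => 0 );

# Hypothetical internal URL, for illustration only.
my $url = 'http://intranet.example.com/some/page';

# head() is inherited from LWP::UserAgent: only the headers come back,
# so a huge PDF or PPT body never gets downloaded.
my $res = $mech->head($url);

if ( $res->is_success && looks_like_html( scalar $res->content_type ) ) {
    $mech->get($url);    # now it's safe to pull the whole page
    # ... examine $mech->content() here ...
}
else {
    # not HTML (or HEAD failed) -- skip this link
}
```

[Alternatively, the size limiter you found is $mech->max_size($bytes), inherited from LWP::UserAgent: it aborts a response body past $bytes and marks the response with a Client-Aborted header, which also keeps big binaries from clogging the spider, at the cost of one partial download per file.]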