Hi All, I'm wondering what the best way is to crawl media URLs efficiently in Nutch. I don't always need to download all the content at a URL (e.g. a video), but for certain media types I would like to issue a HEAD request to check the Content-Type and the Content-Length. How should I go about this? I'm using Nutch from SVN.
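To make the idea concrete, here is a rough sketch of the kind of check I have in mind, written against plain java.net rather than any Nutch API. The class and method names (MediaHeadCheck, head, isMedia, shouldFetchBody), the timeouts, and the size threshold are all made up for illustration; I'm not claiming this is how it would or should hook into Nutch's protocol plugins.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Locale;

// Hypothetical helper, not part of Nutch: probes a URL with HEAD and
// decides from the headers alone whether the body is worth downloading.
public class MediaHeadCheck {

    /** Headers of interest from a HEAD response; no body is ever read. */
    public static final class HeadInfo {
        public final int status;
        public final String contentType;  // null if the server omits it
        public final long contentLength;  // -1 if the server omits it

        public HeadInfo(int status, String contentType, long contentLength) {
            this.status = status;
            this.contentType = contentType;
            this.contentLength = contentLength;
        }
    }

    /** Issues a HEAD request and returns status, Content-Type and Content-Length. */
    public static HeadInfo head(String url) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("HEAD");
        conn.setConnectTimeout(5000); // assumed timeouts, tune as needed
        conn.setReadTimeout(5000);
        try {
            return new HeadInfo(conn.getResponseCode(),
                                conn.getContentType(),
                                conn.getContentLengthLong());
        } finally {
            conn.disconnect();
        }
    }

    /** True for audio/*, video/* and image/* content types. */
    public static boolean isMedia(String contentType) {
        if (contentType == null) {
            return false;
        }
        String t = contentType.trim().toLowerCase(Locale.ROOT);
        return t.startsWith("video/") || t.startsWith("audio/") || t.startsWith("image/");
    }

    /**
     * Fetch the body for successful non-media responses, and for media only
     * when the length is known and no larger than maxBytes.
     */
    public static boolean shouldFetchBody(HeadInfo info, long maxBytes) {
        if (info.status < 200 || info.status >= 300) {
            return false;
        }
        if (!isMedia(info.contentType)) {
            return true;
        }
        return info.contentLength >= 0 && info.contentLength <= maxBytes;
    }
}
```

The intent would be to call head(url) first and only do the full GET when shouldFetchBody(info, someLimit) returns true, while still recording the type and length for the media URLs that get skipped. Not every server answers HEAD correctly, so a fallback would probably be needed.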
Cheers, Pablo Mayrgundter
