Best way to index large files without fully downloading?

Pablo Mayrgundter Mon, 13 Jun 2005 13:42:43 -0700

Hi All,

I'm wondering what the best way is to add the ability to crawl media
URLs efficiently in Nutch.  I don't always need to download all of the
content at a URL (e.g. a video), but for certain media types I would
probably like to send an HTTP HEAD request to check the Content-Type
and Content-Length.  How should I go about this?  I'm using Nutch from
SVN.
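
To make the idea concrete, here is a rough sketch of the kind of check
I have in mind.  It uses only java.net.HttpURLConnection from the JDK,
not any Nutch API.  The names HeadProbe, Probe, shouldFetchBody and
the maxBytes cutoff are made up for illustration and are not existing
Nutch classes or settings.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

// Hypothetical helper, not part of Nutch.
public class HeadProbe {

    /** Metadata read from the response headers only. */
    public static final class Probe {
        public final String contentType;  // null if the server sent no Content-Type
        public final long contentLength;  // -1 if the server sent no Content-Length

        public Probe(String contentType, long contentLength) {
            this.contentType = contentType;
            this.contentLength = contentLength;
        }
    }

    /** Sends a HEAD request, so the server returns headers but no body. */
    public static Probe head(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(5000);  // milliseconds; arbitrary choice
            conn.setReadTimeout(5000);
            conn.getResponseCode();        // forces the request to be sent
            return new Probe(conn.getContentType(), conn.getContentLengthLong());
        } finally {
            conn.disconnect();
        }
    }

    /**
     * Assumed policy: skip the body for audio/video, and for anything
     * whose declared length exceeds maxBytes. An unknown length (-1)
     * is treated as fetchable.
     */
    public static boolean shouldFetchBody(Probe p, long maxBytes) {
        if (p.contentType != null) {
            String type = p.contentType.toLowerCase(Locale.ROOT);
            if (type.startsWith("video/") || type.startsWith("audio/")) {
                return false;
            }
        }
        return p.contentLength < 0 || p.contentLength <= maxBytes;
    }
}
```

The fetcher would call head() first and only issue the normal GET when
shouldFetchBody() returns true.  Servers do not always answer HEAD
correctly, so the result is only a hint.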


Cheers,
Pablo Mayrgundter
