Folks,

I'm having a very similar problem with the latest SVN version of Nutch
(Revision: 279844).

The crawler returns this message:

  fetch okay, but can't parse [URL scrubbed]/pub/Presentation.ppt,
  reason: failed(2,203): Content-Type not text/html:
  application/vnd.ms-powerpoint

The MIME type here is correct, so the file should be passed to the
parse-mspowerpoint plugin, but it isn't. Now that the plugin has been
committed, how do we actually make it work? (Yes, I've read
http://issues.apache.org/jira/browse/NUTCH-88 .)

Thanks,
Renat


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
