I am using Nutch 0.7 for its better recognition of XLS files. But now I get this error, which I did not get with version 0.6:

http://XXX.XXX.XXX.XXX/tostaky.txt
org.apache.nutch.util.mime.MimeTypeException: Invalid Sub Type plain; charset=iso-8859-1

I get the same kind of error with HTML files:

...org.apache.nutch.util.mime.MimeTypeException: Invalid Sub Type html: charset=iso-8859-1

Has anybody run into the same problem, and if so, how can I repair my crawl?

Thank you all.

Regards,
Marc Delerue
