> http://XXX.XXX.XXX.XXX/tostaky.txtorg.apache.nutch.util.mime.MimeTypeException:
> Invalid Sub Type plain; charset=iso-8859-1
>
> I get the same kind of error with html files:
> ...org.apache.nutch.util.mime.MimeTypeException: Invalid Sub Type html:
> charset=iso-8859-1.
>
> Has anybody run into the same problem, and if so, how can I
> 'repair' my crawl?

This is due to a bug in the patch I submitted for ContentType detection.
I will correct it as soon as possible. Could you please create an issue
in JIRA?

Jerome

--
http://motrech.free.fr/
http://frutch.free.fr/
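
In both error messages the charset parameter is still attached to the sub type ("plain; charset=iso-8859-1"), which suggests the Content-Type header value reaches the MIME type parser without its parameters being removed. A minimal sketch of stripping them first is below. It is an illustration only, not the actual fix: ContentTypeCleaner and stripParameters are hypothetical names and are not part of Nutch.

```java
// Hypothetical helper, not part of Nutch: reduces a Content-Type header
// value to its bare "type/subtype" form before any MIME type lookup.
final class ContentTypeCleaner {

    /** Returns the part of the header value before the first ';', trimmed. */
    static String stripParameters(String contentType) {
        if (contentType == null) {
            return null;
        }
        // Parameters such as "charset=iso-8859-1" follow the first ';'.
        int semicolon = contentType.indexOf(';');
        String bare = (semicolon >= 0)
                ? contentType.substring(0, semicolon)
                : contentType;
        return bare.trim();
    }

    public static void main(String[] args) {
        // prints "text/plain"
        System.out.println(stripParameters("text/plain; charset=iso-8859-1"));
        // prints "text/html"
        System.out.println(stripParameters("text/html"));
    }
}
```

The charset, if needed, would have to be read from the parameter part separately; this sketch simply discards it.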
