> Is it not supposed to be the other way around, Nutch needing to be more
> lenient with old servers that return "application/powerpoint"? The
> thing is, there are some servers out there which _do_ return that MIME
> Type, and supposedly, one would want to index them as well... As we
> can't hack all those servers to fix the MIME Types, Nutch should IMHO
> accommodate those sites too.
> Or am I talking crap there?

Yes, you are right, but my response was only a short-term solution.

1. A quick solution could be to check whether a plugin can be associated
   with several content types (if so, you just need to add
   application/powerpoint to the mspowerpoint plugin XML).
2. Remember that the powerpoint plugin is not part of the Nutch-0.7
   release...

Regards
Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/
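
P.S. For what it's worth, the idea in point 1 could look roughly like the sketch below. This is not Nutch's actual plugin API: the ContentTypeAliases class, its methods, and the "parse-mspowerpoint" id string are hypothetical names made up for illustration. The only assumption is that one parser can be registered under several content types, so that the legacy application/powerpoint resolves to the same parser as the registered application/vnd.ms-powerpoint.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, NOT Nutch's plugin API: one parser id, many content types.
public class ContentTypeAliases {
    // Maps each content type (canonical or alias) to a parser plugin id.
    private final Map<String, String> parserByType = new HashMap<>();

    // Registers one parser plugin under every content type it should handle.
    public void register(String pluginId, String... contentTypes) {
        for (String type : contentTypes) {
            parserByType.put(normalize(type), pluginId);
        }
    }

    // Returns the plugin id for a content type, or null if none is registered.
    public String lookup(String contentType) {
        if (contentType == null) {
            return null;
        }
        return parserByType.get(normalize(contentType));
    }

    // Drops parameters such as "; charset=..." and lowercases the type.
    private static String normalize(String contentType) {
        int semi = contentType.indexOf(';');
        String base = semi >= 0 ? contentType.substring(0, semi) : contentType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        ContentTypeAliases registry = new ContentTypeAliases();
        // "parse-mspowerpoint" is an illustrative id, assumed for this sketch.
        registry.register("parse-mspowerpoint",
                "application/vnd.ms-powerpoint",   // registered type
                "application/powerpoint");         // legacy type some servers send
        System.out.println(registry.lookup("application/powerpoint"));
    }
}
```

With a mapping like this, a server that sends the old type still ends up at the same parser, and no server has to be fixed.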
