Hi Markus,
I followed your recommendations, but my problem persists: some documents still
have application/xhtml+xml instead of text/html as their content type.
I added the following property to nutch-site.xml and created the
conf/contenttype-mapping.txt file:
<property>
<name>moreIndexingFilter.mapMimeTypes</name>
<value>true</value>
</property>
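For reference, this is a sketch of what conf/contenttype-mapping.txt would need to contain for this case. It assumes the format is one mapping per line, tab-separated, with the target type first and the types to be rewritten after it, and that lines starting with # are ignored. Both points are assumptions on my part and should be checked against the index-more plugin source.

```
# target type <TAB> source type [<TAB> source type ...]
text/html	application/xhtml+xml
```

If the separator ends up as spaces instead of a real tab character, the line may not be parsed as a mapping.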
I'm using Nutch 1.5.1. Do I need to replace index-more.jar in the plugin
directory with a fixed version?