Thanks a lot for your answer, Markus. My English is not very good.
I have been reading, but I still don't know how to fix the problem. Could you 
please explain the solution in detail? I looked in the conf directory, but I 
can't find how to map one MIME type to another. Do I need to replace the 
index-more plugin?
I followed the link you suggested and saw NUTCH-1262-1.5-1.patch, but I don't 
know how to apply that patch.
Please tell me whether I need to delete the index completely, or whether there 
is a way to replace application/xhtml+xml with text/html in the Solr index.




-----Original message-----
From: Markus Jelsma [mailto:[email protected]] 
Sent: Sunday, November 25, 2012 4:33 AM
To: [email protected]
Subject: RE: problem with text/html content type of documents appears 
application/xhtml+xml in solr index

Hi - the more indexing filter (index-more) in trunk can map MIME types to any 
target. With it you can map both (X)HTML MIME types to text/html or to 
`web page`.

https://issues.apache.org/jira/browse/NUTCH-1262
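
As a rough illustration of what mapping MIME types to a target means, here is 
a minimal Python sketch. It is not the Nutch filter and not its configuration 
format: the function names and the tab-separated rule layout (target first, 
then the source types) are assumptions made only for this example.

```python
# Illustrative sketch only; this is not Nutch code.
# Assumed (hypothetical) rule format: target<TAB>source1<TAB>source2...


def build_mime_map(lines):
    """Build a {source MIME type: target} dict from tab-separated rules."""
    mapping = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        target, *sources = line.split("\t")
        for source in sources:
            mapping[source.strip().lower()] = target.strip()
    return mapping


def map_mime_type(mime, mapping):
    """Return the mapped target, or the bare type itself if no rule matches."""
    # Drop parameters such as '; charset=UTF-8' before the lookup.
    base = mime.split(";", 1)[0].strip().lower()
    return mapping.get(base, base)


# Hypothetical rule: treat both (X)HTML types as text/html.
RULES = [
    "# target<TAB>source types",
    "text/html\tapplication/xhtml+xml\ttext/html",
]
MIME_MAP = build_mime_map(RULES)
```

With such a table, `map_mime_type("application/xhtml+xml", MIME_MAP)` yields 
`text/html`, while a type without a rule is returned unchanged.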

 
-----Original message-----
> From:Eyeris Rodriguez Rueda <[email protected]>
> Sent: Sun 25-Nov-2012 00:48
> To: [email protected]
> Subject: problem with text/html content type of documents appears 
> application/xhtml+xml in solr index
> 
> Hi.
> 
> I have upgraded Nutch from 1.4 to 1.5.1 and have found a problem with the 
> content type of some documents: some text/html pages appear in the Solr 
> index as application/xhtml+xml. When I check the links, the browser tells me 
> that they are in fact text/html.
> Can anybody help me fix this problem? I thought about changing the content 
> type to text/html manually in the Solr index, but that is not a good 
> approach for me.
> Any suggestion or advice is welcome.


