Hi - the more indexing filter (index-more) in trunk can map MIME types to any 
target value. With it you can map both (X)HTML MIME types to text/html or to 
`web page`.

https://issues.apache.org/jira/browse/NUTCH-1262
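
To make the idea concrete, here is a rough Python sketch of what such a mapping does. It is only an illustration and not the Nutch code: the names `MIME_MAP`, `build_lookup` and `map_mime` are made up for this example, and the actual configuration is described in the issue above.

```python
# Illustration only, not Nutch code. All names here are hypothetical.
# Each target value lists the source MIME types that should be indexed as it.
MIME_MAP = {
    "text/html": ["text/html", "application/xhtml+xml"],
}


def build_lookup(mapping):
    """Invert target -> sources into a source -> target lookup table."""
    lookup = {}
    for target, sources in mapping.items():
        for source in sources:
            lookup[source.lower()] = target
    return lookup


def map_mime(mime, lookup):
    """Return the mapped target, or the bare MIME type if it is unmapped."""
    # Drop parameters such as "; charset=UTF-8" before the lookup.
    base = mime.split(";", 1)[0].strip().lower()
    return lookup.get(base, base)


LOOKUP = build_lookup(MIME_MAP)
print(map_mime("application/xhtml+xml", LOOKUP))
```

Using `"web page"` as the key instead of `"text/html"` would give both types that label in the index.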

 
 
-----Original message-----
> From:Eyeris Rodriguez Rueda <[email protected]>
> Sent: Sun 25-Nov-2012 00:48
> To: [email protected]
> Subject: problem with text/html content type of documents appears 
> application/xhtml+xml in solr index
> 
> Hi.
> 
> I have upgraded Nutch from 1.4 to 1.5.1 and have noticed a 
> problem with the content type of some documents: some text/html pages appear 
> in the Solr index as application/xhtml+xml. When I check the links, the 
> browser tells me that they are in fact text/html.
> Can anybody help me fix this problem? I am thinking of changing the content 
> type manually in the Solr index to text/html, but that is not a good approach 
> for me. Any suggestion or advice is welcome.
> 
