Hello all,
I am using Nutch 1.9 (local mode) and Solr 4.10.3.
I have noticed that some pages appear as duplicates in Solr: the URLs are
different, but the content is the same.
Here are two example URLs:

http://www.cubadebate.cu/noticias/2012/07/06/cientificos-espanoles-trabajan-en-gel-para-prevenir-el-sida/
http://www.cubadebate.cu/noticias/2012/07/06/cientificos-espanoles-trabajan-en-gel-para-prevenir-el-sida/comment-page-1/
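
To make the pattern concrete: the second URL is the first one plus a trailing
"comment-page-1/" segment. The short Python sketch below is only an
illustration of that observation, not Nutch or Solr code. The helper name
canonical_url and the regular expression are hypothetical, and it assumes the
duplicates always differ by a trailing "comment-page-N/" segment, which I have
only checked for these two URLs.

```python
import re

# Assumption: duplicate URLs differ only by a trailing "comment-page-N/" segment.
COMMENT_PAGE = re.compile(r"comment-page-\d+/?$")


def canonical_url(url: str) -> str:
    """Return the URL with a trailing comment-page-N segment removed, if any."""
    return COMMENT_PAGE.sub("", url)


urls = [
    "http://www.cubadebate.cu/noticias/2012/07/06/"
    "cientificos-espanoles-trabajan-en-gel-para-prevenir-el-sida/",
    "http://www.cubadebate.cu/noticias/2012/07/06/"
    "cientificos-espanoles-trabajan-en-gel-para-prevenir-el-sida/comment-page-1/",
]

# Both URLs collapse to the same canonical form.
canonical = {canonical_url(u) for u in urls}
print(len(canonical))
```

With this rule both URLs map to the same address, which is the behaviour I
would like to get from the crawl or the index.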

How does Nutch handle duplicate pages?
Should this be solved in Nutch or in Solr?
Can anybody suggest a way to avoid or fix this problem?