Hello,
I added the lines that Mourad suggested, but I am still getting the same
errors:
SOLRIndexWriter
solr.server.url : URL of the SOLR instance (mandatory)
solr.commit.size : buffer size when sending to SOLR (default 1000)
solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
solr.auth : use authentication (default false)
solr.auth.username : use authentication (default false)
solr.auth : username for authentication
solr.auth.password : password for authentication
Indexer: finished at 2013-10-18 14:39:23, elapsed: 00:00:04
SolrDeleteDuplicates: starting at 2013-10-18 14:39:23
SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:353)
at org.apache.nutch.crawl.Crawl.run(Crawl.java:160)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
Any other ideas?
Thanks for your time,
Luis Armando
________________________________________
From: Mouradk [[email protected]]
Sent: Friday, October 18, 2013 09:08 a.m.
To: [email protected]
Subject: Re: Nutch 1.7 and Solr 4.4.0 Integrate
Hi Luis,
In your nutch-site.xml configuration file you need to add the Solr indexer
plugin:
<property>
<name>plugin.includes</name>
<value>protocol-http|parse-(html|tika)|index-(basic|anchor)|indexer-solr</value>
</property>
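A quick way to confirm that this value really enables the indexer-solr plugin is sketched below in Python. It assumes that Nutch treats plugin.includes as a regular expression that must match the whole plugin id, and that Python's regex syntax behaves like Java's for a pattern this simple. The function names and the SAMPLE string are hypothetical, for illustration only; point the parser at your own nutch-site.xml contents instead.

```python
import re
import xml.etree.ElementTree as ET

# Sample configuration mirroring the property above. A real nutch-site.xml
# wraps its properties in a <configuration> root element like this one.
SAMPLE = """<configuration>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|parse-(html|tika)|index-(basic|anchor)|indexer-solr</value>
  </property>
</configuration>"""


def plugin_includes(xml_text):
    """Return the plugin.includes value from the XML text, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == "plugin.includes":
            return (prop.findtext("value") or "").strip()
    return None


def is_enabled(includes, plugin_id):
    """Assumption: the whole plugin id must match the plugin.includes regex."""
    return re.fullmatch(includes, plugin_id) is not None


if __name__ == "__main__":
    value = plugin_includes(SAMPLE)
    # Print whether each plugin id is matched by the configured pattern.
    for plugin_id in ("indexer-solr", "parse-html", "urlfilter-regex"):
        print(plugin_id, is_enabled(value, plugin_id))
```

If indexer-solr does not come back as enabled for your file, the property is probably missing, misspelled, or overridden elsewhere.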
Hope this helps,
Mourad
On 18 Oct 2013, at 15:05, Luis Armando Roca Fumero <[email protected]> wrote:
> Hello friends:
> I have configured Nutch 1.7 and Solr 4.4.0 to work together, following the
> Nutch Tutorial.
> When I run the command: ./bin/nutch crawl urls -solr
> http://localhost:8983/solr/ -depth 3 -topN 5 > test.txt
> Everything works well, but at the end, when the Indexer starts, I get errors
> like this:
>
> Indexer: starting at 2013-10-18 13:57:32
> Indexer: deleting gone documents: false
> Indexer: URL filtering: false
> Indexer: URL normalizing: false
> Active IndexWriters :
> SOLRIndexWriter
> solr.server.url : URL of the SOLR instance (mandatory)
> solr.commit.size : buffer size when sending to SOLR (default 1000)
> solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
> solr.auth : use authentication (default false)
> solr.auth.username : use authentication (default false)
> solr.auth : username for authentication
> solr.auth.password : password for authentication
>
>
>
> What can I do? What is wrong? I have no idea. I also tried Nutch 2.2.1,
> and it doesn't work with Solr 4.4.0 either. I need a tutorial for integrating
> Nutch with Solr, in baby steps :)
> Thanks in advance
>
> The Universidad Central "Marta Abreu" de Las Villas celebrates its 60th anniversary.
> Founded on November 30, 1952. Visit us at: http://www.uclv.edu.cu
> Take part in Universidad 2014, February 10 to 14, 2014. Havana, Cuba.
> http://www.congresouniversidad.cu/
>
>