Thanks a lot, Talat :). I truly appreciate your help, and that of the other
people who gave me ideas.

I fixed the Solr schema. Following the Nutch tutorial, I had changed the line
<field name="content" type="text_general" stored="true" indexed="true"/> to
<field name="content" type="text" stored="true" indexed="true"/>, but that
was wrong.
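
A mismatch like this can be caught before restarting Solr. The following is
only a minimal sketch in Python, not part of Nutch or Solr: the function name
undefined_field_types and the sample schema are made up for illustration, and
it assumes a schema.xml in which the available types are declared as
fieldType elements.

```python
import xml.etree.ElementTree as ET


def undefined_field_types(schema_xml):
    """Return (field name, type) pairs whose type is not declared in the schema."""
    root = ET.fromstring(schema_xml)
    # Collect every declared type name; both spellings of the tag are checked.
    defined = {
        ft.get("name")
        for tag in ("fieldType", "fieldtype")
        for ft in root.iter(tag)
    }
    problems = []
    for tag in ("field", "dynamicField"):
        for field in root.iter(tag):
            if field.get("type") not in defined:
                problems.append((field.get("name"), field.get("type")))
    return problems


# Illustrative schema fragment (hypothetical, trimmed to two fields).
SAMPLE = """<schema name="nutch" version="1.5">
  <types>
    <fieldType name="text_general" class="solr.TextField"/>
  </types>
  <fields>
    <field name="content" type="text" stored="true" indexed="true"/>
    <field name="title" type="text_general" stored="true" indexed="true"/>
  </fields>
</schema>"""

print(undefined_field_types(SAMPLE))  # [('content', 'text')]
```

Run against the text of the real schema.xml, an empty list would mean that
every field refers to a declared type.
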
I fixed that and ran Nutch 1.7 again, but I am still getting problems :( .
You can see a new hadoop.log here: http://pastebin.com/2qY0sUJh
The errors are:
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:353)
        at org.apache.nutch.crawl.Crawl.run(Crawl.java:160)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)

Any ideas are welcome!
Thanks in advance,
Luis Armando

________________________________________
From: Talat UYARER [[email protected]]
Sent: Friday, October 18, 2013 03:39 p.m.
To: [email protected]
Subject: Re: Nutch 1.7 and Solr 4.4.0 Integrate

Ok Luis,

I found your problem. :) It is in your Solr schema. In your hadoop.log you
can see this line:

    org.apache.solr.common.SolrException: {msg=SolrCore 'collection1' is
    not available due to init failure: Unknown fieldType 'text'
    specified on field
    content,trace=org.apache.solr.common.SolrException: SolrCore
    'collection1' is not available due to init failure: Unknown
    fieldType 'text' specified on field content
        at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:860)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:251)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)


As you can see, Solr throws an exception when Nutch tries to commit. You
should check your Solr schema. You may ask why solrdedup throws an
exception: it is because IndexerJob did not commit your documents to Solr,
so when dedup ran it did not find any documents to check for duplicates.
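
One way to confirm this before running solrdedup is to ask Solr how many
documents the core holds. This is a rough sketch in Python, assuming the
standard select handler with wt=json; the function names are hypothetical,
and the core URL (for example http://localhost:8983/solr/collection1) is an
assumption about your setup.

```python
import json
import urllib.parse
import urllib.request


def num_found(response_body):
    """Extract the hit count from a Solr JSON select response."""
    return json.loads(response_body)["response"]["numFound"]


def count_documents(core_url):
    """Query the core for all documents with rows=0, so only the count comes back."""
    params = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    url = core_url.rstrip("/") + "/select?" + params
    with urllib.request.urlopen(url) as resp:
        return num_found(resp.read().decode("utf-8"))
```

A count of 0 would mean that nothing was indexed. If the core failed to
initialize, as in the log above, the request itself would be expected to
fail with an HTTP error instead of returning a count.
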

Talat



