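You need to use the schema.xml shipped with Nutch in Solr: copy Nutch's
conf/schema.xml over the schema.xml in your Solr conf directory and restart
Solr. It provides most of the fields you need, including content.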
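
To see which mapped fields a given Solr schema is missing before you reindex, a rough sketch along these lines may help. It is plain Python, not part of Nutch or Solr; the function names and the trimmed XML fragments are made up for illustration, and it only compares explicit <field> declarations (dynamicField patterns are ignored):

```python
import xml.etree.ElementTree as ET


def schema_field_names(schema_xml):
    """Return the names of all <field> elements declared in a schema.xml string."""
    root = ET.fromstring(schema_xml)
    return {f.get("name") for f in root.iter("field") if f.get("name")}


def mapping_dest_names(mapping_xml):
    """Return the dest attributes of all <field> elements in a solrindex-mapping.xml string."""
    root = ET.fromstring(mapping_xml)
    return {f.get("dest") for f in root.iter("field") if f.get("dest")}


def missing_fields(schema_xml, mapping_xml):
    """Destination fields the mapping writes to that the schema does not declare."""
    return sorted(mapping_dest_names(mapping_xml) - schema_field_names(schema_xml))


# Illustrative stand-ins only: a cut-down schema that declares "text" but not
# "content", and a cut-down mapping that sends data to "content".
SOLR_SCHEMA = """<schema name="example" version="1.3">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
  </fields>
</schema>"""

NUTCH_MAPPING = """<mapping>
  <fields>
    <field dest="content" source="content"/>
    <field dest="id" source="url"/>
  </fields>
</mapping>"""

if __name__ == "__main__":
    # Lists the fields Solr would reject as unknown.
    print(missing_fields(SOLR_SCHEMA, NUTCH_MAPPING))
```

To run it against a real setup, read the two files from disk and pass their contents to missing_fields(); an empty list means every mapped field is declared explicitly.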

On Tuesday 10 May 2011 17:31:33 Gabriele Kahlout wrote:
> I don't follow. Are you talking about conf/schema.xml? That's what I'm
> referring to. Am I supposed to do something with Nutch's
> conf/schema.xml?
> 
> On Tue, May 10, 2011 at 4:46 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> > There is a working example schema in Nutch's conf directory.
> > 
> > On Tuesday 10 May 2011 16:40:02 Gabriele Kahlout wrote:
> > > From solr logs:
> > > 
> > > May 10, 2011 4:33:20 PM org.apache.solr.common.SolrException log
> > > *SEVERE: org.apache.solr.common.SolrException: ERROR:unknown field 'content'*
> > >     at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:321)
> > >     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
> > >     at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:147)
> > >     at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
> > >     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55)
> > >     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> > >     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
> > >     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
> > >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
> > >     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:244)
> > >     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
> > >     at org.netbeans.modules.web.monitor.server.MonitorFilter.doFilter(MonitorFilter.java:393)
> > >     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:244)
> > >     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
> > >     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
> > >     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161)
> > >     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
> > >     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
> > >     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:550)
> > >     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
> > >     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:380)
> > >     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:243)
> > >     at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:188)
> > >     at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:166)
> > >     at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:288)
> > >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > >     at java.lang.Thread.run(Thread.java:680)
> > > 
> > > in conf/schema.xml:
> > >    <!-- fields for index-basic plugin -->
> > >    <field name="host" type="url" stored="false" indexed="true"/>
> > >    <field name="site" type="string" stored="false" indexed="true"/>
> > >    <field name="url" type="url" stored="true" indexed="true" required="true"/>
> > > *  <field name="content" type="text" stored="false" indexed="true"/>*
> > 
> > > in conf/solrindex-mapping.xml:
> > >     <fields>
> > >         <field dest="content" source="content"/>
> > > 
> > > In recent Solr, I think this has been renamed to text?
> > > 
> > > Solr's conf/schema.xml:
> > >         via copyField further on in this schema  -->
> > > 
> > > *   <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>*
> > > 
> > > On Tue, May 10, 2011 at 4:30 PM, Gabriele Kahlout <gabri...@mysimpatico.com> wrote:
> > > > It apparently is normal, and my issue is indeed with Nutch.
> > > > 
> > > > I've modified post.sh from the example docs to use the Solr at
> > > > http://localhost:8080/apache-solr-3.1-SNAPSHOT and now data has finally
> > > > made it to the index.
> > > > $ post.sh solr.xml monitor.xml
> > > > 
> > > > With Nutch I'm at:
> > > > 
> > > > $ svn info
> > > > Path: .
> > > > URL: http://svn.apache.org/repos/asf/nutch/branches/branch-1.3
> > > > Repository Root: http://svn.apache.org/repos/asf
> > > > Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> > > > Revision: *1101459*
> > > > Node Kind: directory
> > > > Schedule: normal
> > > > Last Changed Author: markus
> > > > Last Changed Rev: 1101280
> > > > Last Changed Date: 2011-05-10 02:46:04 +0200 (Tue, 10 May 2011)
> > > > 
> > > > Does this work for you? All I've done is svn co the Nutch 1.3 branch
> > > > and run my script, which had worked until now.
> > > > 
> > > > On Tue, May 10, 2011 at 4:11 PM, Gabriele Kahlout <gabri...@mysimpatico.com> wrote:
> > > >> Hello,
> > > >> 
> > > > >> I'm having trouble getting Solr 3.1 to work with nutch-1.3. I'm not
> > > > >> sure where the problem is, but I'm wondering why the solrHome path
> > > > >> ends with /./.
> > > >> 
> > > >> cwd=/Applications/NetBeans/apache-tomcat-7.0.6/bin
> > > >> SolrHome=/Users/simpatico/apache-solr-3.1.0/solr/./
> > > >> 
> > > > >> In the web.xml of solr:
> > > > >>    <env-entry>
> > > > >>        <env-entry-name>solr/home</env-entry-name>
> > > > >>        <env-entry-value>${user.home}/apache-solr-3.1.0/solr</env-entry-value>
> > > > >>        <env-entry-type>java.lang.String</env-entry-type>
> > > > >>    </env-entry>
> > > >> 
> > > >> --
> > > >> Regards,
> > > >> K. Gabriele
> > > >> 
> > > >> --- unchanged since 20/9/10 ---
> > > >> P.S. If the subject contains "[LON]" or the addressee acknowledges
> > > >> the receipt within 48 hours then I don't resend the email.
> > > >> subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧
> > > >> time(x) < Now + 48h) ⇒ ¬resend(I, this).
> > > >> 
> > > >> If an email is sent by a sender that is not a trusted contact or the
> > > >> email does not contain a valid code then the email is not received.
> > > >> A valid code starts with a hyphen and ends with "X".
> > > > >> ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧
> > > > >> y ∈ L(-[a-z]+[0-9]X)).
> > > > 
> > > > --
> > > > Regards,
> > > > K. Gabriele
> > > > 
> > > > --- unchanged since 20/9/10 ---
> > > > P.S. If the subject contains "[LON]" or the addressee acknowledges
> > > > the receipt within 48 hours then I don't resend the email.
> > > > subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧
> > > > time(x) < Now + 48h) ⇒ ¬resend(I, this).
> > > > 
> > > > If an email is sent by a sender that is not a trusted contact or the
> > > > email does not contain a valid code then the email is not received. A
> > > > valid code starts with a hyphen and ends with "X".
> > > > ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧
> > > > y
> > 
> > ∈
> > 
> > > > L(-[a-z]+[0-9]X)).
> > 
> > --
> > Markus Jelsma - CTO - Openindex
> > http://www.linkedin.com/in/markus17
> > 050-8536620 / 06-50258350

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350