Hi Markus,

OK, first things first, here is hadoop.log:
2011-02-09 23:24:11,826 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2011-02-09 23:24:11,828 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2011-02-09 23:24:11,875 INFO  solr.SolrMappingReader - source: content dest: content
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: site dest: site
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: title dest: title
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: host dest: host
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: segment dest: segment
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: boost dest: boost
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: digest dest: digest
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: tstamp dest: tstamp
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: url dest: id
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: url dest: url
2011-02-09 23:24:13,626 WARN  mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Not Found

Not Found

request: http://127.0.0.1:8080/solr/update?wt=javabin&version=1
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
	at org.apache.nutch.indexer.solr.SolrWriter.write(SolrWriter.java:64)
	at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:54)
	at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:44)
	at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:440)
	at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:159)
	at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2011-02-09 23:24:14,128 ERROR solr.SolrIndexer - java.io.IOException: Job failed!

I am unsure of where to get Solr output, as I have been unable to progress past the stage above. I have been indexing directly from Nutch into the vanilla Solr 1.4.1 distribution, but this is my first attempt at indexing into my own app.

Within my web app I have added the following dirs:

bin (empty)
conf (the usual Nutch schema, solrconfig with the Nutch requestHandler, scripts, synonyms, etc.)
data (index and spellchecker dirs, each containing segments.gen and segments_1)
dist (as per the Solr 1.4.1 distribution)
lib (as above)

I hope that this is sufficient.

Lewis
________________________________________
From: Markus Jelsma [[email protected]]
Sent: 10 February 2011 10:58
To: [email protected]
Cc: McGibbney, Lewis John
Subject: Re: Index with Solr to my own webapp

Yes, please show us the hadoop.log output and the Solr output. The latter is usually the more important one at this stage. You might be writing to non-existent fields, or writing multiple values to a single-valued field, or... whatever's happening.

On Thursday 10 February 2011 00:36:21 McGibbney, Lewis John wrote:
> Hi list,
>
> I am at the Solr indexing stage and seem to have hit trouble when sending
> the crawldb, linkdb and segments/* to Solr to be indexed. I have added an
> XML file to $CATALINA_HOME/conf/Catalina/localhost with my webapp
> specifics. My Solr 1.4.1 implementation resides within my web app at the
> following location: /home/lewis/Downloads/mywebapp. But when I send this
> command to index with Solr:
>
> lewis@lewis-01:~/Downloads/nutch-1.2$ bin/nutch solrindex
> http://127.0.0.1:8080/mywebapp crawl/crawldb crawl/linkdb crawl/segments/*
>
> I am getting java.io.IOException: Job failed!
>
> I had experienced this before when I was using the solrindex command
> incorrectly; I am hoping that this is not the case, however, it is late
> and I might have missed something simple.
>
> I have hadoop.log if this would help at all.
>
> Any suggestions please. Thanks
>
> Lewis
>
> Glasgow Caledonian University is a registered Scottish charity, number
> SC021474
>
> Winner: Times Higher Education's Widening Participation Initiative of the
> Year 2009 and Herald Society's Education Initiative of the Year 2009.
> http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html
>
> Winner: Times Higher Education's Outstanding Support for Early Career
> Researchers of the Year 2010, GCU as a lead with Universities Scotland
> partners.
> http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691,en.html

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
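One thing worth double-checking, given the 404 in the trace: when Solr runs under Tomcat as a webapp other than /solr, the context fragment dropped into $CATALINA_HOME/conf/Catalina/localhost must point the solr/home JNDI variable at the directory holding conf/ and data/, and the fragment's filename determines the context path. A minimal sketch follows; the filename mywebapp.xml and the .war path under dist/ are assumptions based on the directory listing in this thread, not the actual file:

```xml
<!-- $CATALINA_HOME/conf/Catalina/localhost/mywebapp.xml (hypothetical) -->
<!-- The filename sets the context path, i.e. /mywebapp.                -->
<Context docBase="/home/lewis/Downloads/mywebapp/dist/apache-solr-1.4.1.war"
         debug="0" crossContext="true">
  <!-- solr/home must be the directory containing conf/ and data/ -->
  <Environment name="solr/home" type="java.lang.String"
               value="/home/lewis/Downloads/mywebapp" override="true"/>
</Context>
```

Note that the stack trace shows SolrJ posting to http://127.0.0.1:8080/solr/update, i.e. the default /solr context: whichever context the webapp is actually deployed under has to match the base URL passed to solrindex, and the update handler can be checked independently beforehand (e.g. by requesting http://127.0.0.1:8080/mywebapp/update in a browser or with curl) before re-running the job.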

