Hi Markus,

OK, first things first, here is hadoop.log:
2011-02-09 23:24:11,826 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2011-02-09 23:24:11,828 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2011-02-09 23:24:11,875 INFO  solr.SolrMappingReader - source: content dest: content
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: site dest: site
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: title dest: title
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: host dest: host
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: segment dest: segment
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: boost dest: boost
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: digest dest: digest
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: tstamp dest: tstamp
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: url dest: id
2011-02-09 23:24:11,876 INFO  solr.SolrMappingReader - source: url dest: url
2011-02-09 23:24:13,626 WARN  mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Not Found

Not Found

request: http://127.0.0.1:8080/solr/update?wt=javabin&version=1
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
	at org.apache.nutch.indexer.solr.SolrWriter.write(SolrWriter.java:64)
	at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:54)
	at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:44)
	at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:440)
	at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:159)
	at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2011-02-09 23:24:14,128 ERROR solr.SolrIndexer - java.io.IOException: Job failed!

I am unsure of where to get Solr output, as I have been unable to progress past the stage above. I have been indexing directly from Nutch into the vanilla Solr 1.4.1 distribution, but this is my first attempt at indexing into my own app.

Within my web app I have added the following dirs:

bin (empty)
conf (the usual Nutch schema, solrconfig with the Nutch requestHandler, scripts, synonyms, etc.)
data (index and spellchecker dirs, each containing segments.gen and segments_1)
dist (as per the Solr 1.4.1 distribution)
lib (as above)

I hope that this is sufficient.

Lewis
________________________________________
From: Markus Jelsma [[email protected]]
Sent: 10 February 2011 10:58
To: [email protected]
Cc: McGibbney, Lewis John
Subject: Re: Index with Solr to my own webapp

Yes, please show us the hadoop.log output and the Solr output. The latter is usually the more important one at this stage. You might be writing to non-existent fields, or writing multiple values to a single-valued field, or... whatever's happening.

On Thursday 10 February 2011 00:36:21 McGibbney, Lewis John wrote:
> Hi list,
>
> I am at the Solr indexing stage and seem to have hit trouble when sending
> the crawldb, linkdb and segments/* to Solr to be indexed. I have added an
> XML file to $CATALINA_HOME/conf/Catalina/localhost with my webapp
> specifics. My Solr 1.4.1 implementation resides within my web app at the
> following location: /home/lewis/Downloads/mywebapp. But when I send this
> command to index with Solr:
>
> lewis@lewis-01:~/Downloads/nutch-1.2$ bin/nutch solrindex
> http://127.0.0.1:8080/mywebapp crawl/crawldb crawl/linkdb crawl/segments/*
>
> I am getting java.io.IOException: Job failed!
>
> I had experienced this before when I was using the solrindex command
> incorrectly; I am hoping that this is not the case, however, it is late
> and I might have missed something simple.
>
> I have hadoop.log if this would help at all.
>
> Any suggestions please. Thanks
>
> Lewis
>
> Glasgow Caledonian University is a registered Scottish charity, number
> SC021474
>
> Winner: Times Higher Education's Widening Participation Initiative of the
> Year 2009 and Herald Society's Education Initiative of the Year 2009.
> http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html
>
> Winner: Times Higher Education's Outstanding Support for Early Career
> Researchers of the Year 2010, GCU as a lead with Universities Scotland
> partners.
> http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691,en.html

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
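One thing worth double-checking, given the 404 in the trace: when Solr runs under Tomcat as a webapp other than /solr, the context fragment dropped into $CATALINA_HOME/conf/Catalina/localhost must point the solr/home JNDI variable at the directory holding conf/ and data/, and the fragment's filename determines the context path. A minimal sketch follows; the filename mywebapp.xml and the .war path under dist/ are assumptions based on the directory listing in this thread, not the actual file:

```xml
<!-- $CATALINA_HOME/conf/Catalina/localhost/mywebapp.xml (hypothetical) -->
<!-- The filename sets the context path, i.e. /mywebapp.                -->
<Context docBase="/home/lewis/Downloads/mywebapp/dist/apache-solr-1.4.1.war"
         debug="0" crossContext="true">
  <!-- solr/home must be the directory containing conf/ and data/ -->
  <Environment name="solr/home" type="java.lang.String"
               value="/home/lewis/Downloads/mywebapp" override="true"/>
</Context>
```

Note that the stack trace shows SolrJ posting to http://127.0.0.1:8080/solr/update, i.e. the default /solr context: whichever context the webapp is actually deployed under has to match the base URL passed to solrindex, and the update handler can be checked independently beforehand (e.g. by requesting http://127.0.0.1:8080/mywebapp/update in a browser or with curl) before re-running the job.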

