Hello, I have Nutch set up on a Linux server, with Solr running on the same server. When I try to crawl some websites, the job fails at the indexing part of the crawl:

2012-01-30 10:11:29,445 INFO solr.SolrWriter - Adding 1 documents
2012-01-30 10:11:29,698 WARN mapred.LocalJobRunner - job_local_0009
org.apache.solr.common.SolrException: Not Found

Not Found

request: http://127.0.0.1:8080/solr_3-5/searchdkdde_en/update?wt=javabin&version=2
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
    at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:93)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2012-01-30 10:11:30,362 ERROR solr.SolrIndexer - java.io.IOException: Job failed!

However, when I call the 'solrindex' command with the freshly created crawl folders as arguments, the document gets indexed.

So what is the difference between the 'solrindex' command and the indexing part of the 'crawl' command? What could be the reason why the latter doesn't work?

My workaround would be to write a shell script that simulates the 'crawl' command by calling the individual inject/fetch/index/... commands.
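
A minimal sketch of such a script might look like the following. It assumes a Nutch 1.x layout where bin/nutch provides the inject, generate, fetch, parse, updatedb, invertlinks and solrindex commands. The function name, directory layout, depth and topN defaults are made up for illustration, and the argument order of 'solrindex' (in particular the -linkdb option) differs between Nutch versions, so it should be checked against the usage message that 'bin/nutch solrindex' prints.

```shell
#!/bin/sh
# Sketch only: runs the crawl steps one by one instead of 'bin/nutch crawl'.
# NUTCH, the directory layout and the defaults below are assumptions.

NUTCH="${NUTCH:-bin/nutch}"   # path to the nutch launcher (assumed)

# usage: crawl_and_index <solr url> <seed dir> <crawl dir> [depth] [topN]
crawl_and_index() {
    solr_url="$1"
    seed_dir="$2"
    crawl_dir="$3"
    depth="${4:-3}"
    top_n="${5:-1000}"

    crawldb="$crawl_dir/crawldb"
    linkdb="$crawl_dir/linkdb"
    segments="$crawl_dir/segments"

    # Seed the crawl database with the start URLs.
    "$NUTCH" inject "$crawldb" "$seed_dir" || return 1

    i=0
    while [ "$i" -lt "$depth" ]; do
        # Leave the loop if generate exits non-zero
        # (assumed to mean there is nothing left to fetch).
        "$NUTCH" generate "$crawldb" "$segments" -topN "$top_n" || break

        # Segment directories are named by timestamp, so the last one
        # in sorted order is the one that was just generated.
        segment=$(ls -d "$segments"/* | tail -n 1)

        "$NUTCH" fetch "$segment" || return 1
        "$NUTCH" parse "$segment" || return 1
        "$NUTCH" updatedb "$crawldb" "$segment" || return 1
        i=$((i + 1))
    done

    # Build the link database, then index everything into Solr.
    "$NUTCH" invertlinks "$linkdb" -dir "$segments" || return 1
    # Argument order is version-dependent; check 'bin/nutch solrindex'.
    "$NUTCH" solrindex "$solr_url" "$crawldb" -linkdb "$linkdb" "$segments"/* || return 1
}
```

It could then be called as, for example, crawl_and_index "$SOLR_URL" urls crawl 3 1000, where SOLR_URL holds the same Solr URL that works when 'solrindex' is run by hand.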

