Hi all. I have a problem when I run a large crawl, specifically when the topN parameter is greater than 1000. I'm using Nutch 1.4 and Solr 3.4 on a PC with these specs: Intel Core i3, 2 GB RAM, 160 GB HD.
The problem is an exception (java.io.IOException: Job failed!) that is thrown while documents are being added to the Solr index, and I don't know how to fix it. I have reduced solr.commit.size from 1000 to 250, but the problem still happens. The underlying SolrException message is "Petición incorrecta" (Spanish for "Bad Request"). Any idea, recommendation, or way to solve this would be appreciated.

This is part of my hadoop.log file:

2012-09-20 09:46:57,148 INFO solr.SolrMappingReader - source: language dest: language
2012-09-20 09:46:57,148 INFO solr.SolrMappingReader - source: url dest: url
2012-09-20 09:46:57,704 INFO solr.SolrWriter - Adding 250 documents
2012-09-20 09:46:59,974 INFO solr.SolrWriter - Adding 250 documents
2012-09-20 09:47:01,578 INFO solr.SolrWriter - Adding 250 documents
2012-09-20 09:47:02,137 INFO solr.SolrWriter - Adding 250 documents
2012-09-20 09:47:02,816 INFO solr.SolrWriter - Adding 250 documents
2012-09-20 09:47:03,272 WARN mapred.LocalJobRunner - job_local_0030
org.apache.solr.common.SolrException: Petición incorrecta
Petición incorrecta
request: http://localhost:8080/solr/update?wt=javabin&version=2
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
    at org.apache.nutch.indexer.solr.SolrWriter.write(SolrWriter.java:81)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:54)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:44)
    at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:440)
    at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:166)
    at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:51)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2012-09-20 09:47:03,447 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
2012-09-20 09:47:03,448 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2012-09-20 09:47:03
2012-09-20 09:47:03,448 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://localhost:8080/solr
2012-09-20 09:47:05,039 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: finished at 2012-09-20 09:47:05, elapsed: 00:00:01
2012-09-20 09:47:05,040 INFO crawl.Crawl - crawl finished: crawl
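In case it helps narrow things down, here is a minimal sketch (not part of Nutch) that counts how many "Adding N documents" batches appear in the log before the first SolrException. It assumes the log has one entry per line in the format shown in the excerpt above; the function name docs_sent_before_failure is made up for illustration. The last batch counted is presumably the one Solr rejected, though I have not confirmed that.

```python
import re

# Matches entries such as "... INFO solr.SolrWriter - Adding 250 documents".
# The pattern is an assumption based on the log excerpt in this post.
ADDING = re.compile(r"solr\.SolrWriter - Adding (\d+) documents")


def docs_sent_before_failure(log_lines):
    """Return (batches, documents) logged before the first SolrException.

    If no SolrException appears, the totals cover every line given.
    """
    batches = 0
    documents = 0
    for line in log_lines:
        # Stop at the first line that mentions the exception.
        if "SolrException" in line:
            break
        match = ADDING.search(line)
        if match:
            batches += 1
            documents += int(match.group(1))
    return batches, documents
```

It could be called with an open file handle on hadoop.log (the exact path depends on the installation). For a log that contains several indexing runs, only the lines of the failing run should be passed in, since the function counts from the first line it is given.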

