This is for my final report (I am a French student). I am trying to integrate Solr 3.6.1 with Nutch 1.5 (apache-nutch-1.5-src.zip).

OS: Ubuntu 11.04
Java: 1.7

Everything is fine, but when I run this command:

  bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*

I get this error:

  java.io.IOException: Job failed!

I went to solr/dist and copied apache-solr-core-3.6.1.jar, apache-solr-solrj-3.6.1.jar and solrj-lib into nutch/lib, and removed solr-solrj-3.6.1.jar (the one inside Nutch). Afterwards I built Nutch, but I still get the same error.

This is my hadoop.log:

2012-11-23 12:48:37,654 INFO solr.SolrIndexer - SolrIndexer: starting at 2012-11-23 12:48:37
2012-11-23 12:48:37,761 INFO indexer.IndexerMapReduce - IndexerMapReduce: crawldb: crawl/crawldb
2012-11-23 12:48:37,761 INFO indexer.IndexerMapReduce - IndexerMapReduce: linkdb: crawl/linkdb
2012-11-23 12:48:37,761 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20121121133422
2012-11-23 12:48:37,975 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20121121134004
2012-11-23 12:48:37,979 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20121121134920
2012-11-23 12:48:38,129 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2012-11-23 12:48:39,084 INFO plugin.PluginRepository - Plugins: looking in: /usr/local/test/nutch/runtime/local/plugins
2012-11-23 12:48:39,600 INFO plugin.PluginRepository - Plugin Auto-activation mode: [true]
2012-11-23 12:48:39,600 INFO plugin.PluginRepository - Registered Plugins:
2012-11-23 12:48:39,600 INFO plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
2012-11-23 12:48:39,600 INFO plugin.PluginRepository - CyberNeko HTML Parser (lib-nekohtml)
2012-11-23 12:48:39,600 INFO plugin.PluginRepository - OPIC Scoring Plug-in (scoring-opic)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - Basic Indexing Filter (index-basic)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - Html Parse Plug-in (parse-html)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - Anchor Indexing Filter (index-anchor)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - HTTP Framework (lib-http)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - Regex URL Filter (urlfilter-regex)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - Regex URL Filter Framework (lib-regex-filter)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - Http Protocol Plug-in (protocol-http)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - Registered Extension-Points:
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - Nutch Protocol (org.apache.nutch.protocol.Protocol)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - Nutch Segment Merge Filter (org.apache.nutch.segment.SegmentMergeFilter)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - Nutch URL Filter (org.apache.nutch.net.URLFilter)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
2012-11-23 12:48:39,601 INFO plugin.PluginRepository - HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
2012-11-23 12:48:39,602 INFO plugin.PluginRepository - Nutch Content Parser (org.apache.nutch.parse.Parser)
2012-11-23 12:48:39,602 INFO plugin.PluginRepository - Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
2012-11-23 12:48:39,606 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:39,608 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:48:39,608 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:48:41,868 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:41,868 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:48:41,868 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:48:45,040 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:45,040 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:48:45,041 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:48:48,013 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:48,015 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:48:48,015 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:48:51,086 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:51,086 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:48:51,086 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:48:54,067 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:54,067 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:48:54,067 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:48:56,940 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:56,940 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:48:56,940 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:00,049 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:00,049 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:49:00,049 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:03,053 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:03,053 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:49:03,053 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:06,148 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:06,149 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:49:06,149 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:15,901 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:15,962 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:49:15,962 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:17,605 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:17,647 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:49:17,647 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:20,551 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:20,551 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:49:20,552 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:24,150 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:24,150 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:49:24,151 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:28,574 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:28,574 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-11-23 12:49:28,574 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:30,123 INFO solr.SolrMappingReader - source: content dest: content
2012-11-23 12:49:30,124 INFO solr.SolrMappingReader - source: title dest: title
2012-11-23 12:49:30,124 INFO solr.SolrMappingReader - source: host dest: host
2012-11-23 12:49:30,124 INFO solr.SolrMappingReader - source: segment dest: segment
2012-11-23 12:49:30,124 INFO solr.SolrMappingReader - source: boost dest: boost
2012-11-23 12:49:30,124 INFO solr.SolrMappingReader - source: digest dest: digest
2012-11-23 12:49:30,124 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
2012-11-23 12:49:30,124 INFO solr.SolrMappingReader - source: url dest: id
2012-11-23 12:49:30,124 INFO solr.SolrMappingReader - source: url dest: url
2012-11-23 12:49:31,107 INFO solr.SolrWriter - Indexing 52 documents
2012-11-23 12:49:53,007 ERROR solr.SolrIndexer - java.io.IOException: Job failed!

--
View this message in context: http://lucene.472066.n3.nabble.com/Indexing-time-URL-filtering-again-tp4021793p4022029.html
Sent from the Nutch - User mailing list archive at Nabble.com.

