This is for my final report. I am a French student trying to integrate
Solr (3.6.1) with Nutch (apache-nutch-1.5-src.zip) on Ubuntu 11.04 with
Java 1.7.
Everything works fine until I run this command:

bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*

It fails with this error:

java.io.IOException: Job failed!

I went to solr/dist and copied apache-solr-core-3.6.1.jar,
apache-solr-solrj-3.6.1.jar and solrj-lib into nutch/lib, and removed
solr-solrj-3.6.1.jar (the one inside Nutch). I then rebuilt Nutch, but I
still get the same error.

This is my hadoop.log:

2012-11-23 12:48:37,654 INFO  solr.SolrIndexer - SolrIndexer: starting at
2012-11-23 12:48:37
2012-11-23 12:48:37,761 INFO  indexer.IndexerMapReduce - IndexerMapReduce:
crawldb: crawl/crawldb
2012-11-23 12:48:37,761 INFO  indexer.IndexerMapReduce - IndexerMapReduce:
linkdb: crawl/linkdb
2012-11-23 12:48:37,761 INFO  indexer.IndexerMapReduce - IndexerMapReduces:
adding segment: crawl/segments/20121121133422
2012-11-23 12:48:37,975 INFO  indexer.IndexerMapReduce - IndexerMapReduces:
adding segment: crawl/segments/20121121134004
2012-11-23 12:48:37,979 INFO  indexer.IndexerMapReduce - IndexerMapReduces:
adding segment: crawl/segments/20121121134920
2012-11-23 12:48:38,129 WARN  util.NativeCodeLoader - Unable to load
native-hadoop library for your platform... using builtin-java classes where
applicable
2012-11-23 12:48:39,084 INFO  plugin.PluginRepository - Plugins: looking in:
/usr/local/test/nutch/runtime/local/plugins
2012-11-23 12:48:39,600 INFO  plugin.PluginRepository - Plugin
Auto-activation mode: [true]
2012-11-23 12:48:39,600 INFO  plugin.PluginRepository - Registered Plugins:
2012-11-23 12:48:39,600 INFO  plugin.PluginRepository -         the nutch core
extension points (nutch-extensionpoints)
2012-11-23 12:48:39,600 INFO  plugin.PluginRepository -         CyberNeko HTML
Parser (lib-nekohtml)
2012-11-23 12:48:39,600 INFO  plugin.PluginRepository -         OPIC Scoring
Plug-in (scoring-opic)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         Basic Indexing
Filter (index-basic)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         Html Parse
Plug-in (parse-html)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         Anchor Indexing
Filter (index-anchor)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         HTTP Framework
(lib-http)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         Regex URL Filter
(urlfilter-regex)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         Regex URL Filter
Framework (lib-regex-filter)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         Http Protocol
Plug-in (protocol-http)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository - Registered
Extension-Points:
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         Nutch URL
Normalizer (org.apache.nutch.net.URLNormalizer)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         Nutch Protocol
(org.apache.nutch.protocol.Protocol)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         Nutch Segment
Merge Filter (org.apache.nutch.segment.SegmentMergeFilter)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         Nutch URL Filter
(org.apache.nutch.net.URLFilter)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         Nutch Indexing
Filter (org.apache.nutch.indexer.IndexingFilter)
2012-11-23 12:48:39,601 INFO  plugin.PluginRepository -         HTML Parse
Filter (org.apache.nutch.parse.HtmlParseFilter)
2012-11-23 12:48:39,602 INFO  plugin.PluginRepository -         Nutch Content
Parser (org.apache.nutch.parse.Parser)
2012-11-23 12:48:39,602 INFO  plugin.PluginRepository -         Nutch Scoring
(org.apache.nutch.scoring.ScoringFilter)
2012-11-23 12:48:39,606 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:39,608 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:48:39,608 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:48:41,868 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:41,868 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:48:41,868 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:48:45,040 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:45,040 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:48:45,041 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:48:48,013 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:48,015 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:48:48,015 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:48:51,086 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:51,086 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:48:51,086 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:48:54,067 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:54,067 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:48:54,067 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:48:56,940 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:48:56,940 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:48:56,940 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:00,049 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:00,049 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:49:00,049 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:03,053 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:03,053 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:49:03,053 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:06,148 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:06,149 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:49:06,149 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:15,901 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:15,962 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:49:15,962 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:17,605 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:17,647 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:49:17,647 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:20,551 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:20,551 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:49:20,552 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:24,150 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:24,150 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:49:24,151 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:28,574 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-11-23 12:49:28,574 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2012-11-23 12:49:28,574 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-11-23 12:49:30,123 INFO  solr.SolrMappingReader - source: content dest:
content
2012-11-23 12:49:30,124 INFO  solr.SolrMappingReader - source: title dest:
title
2012-11-23 12:49:30,124 INFO  solr.SolrMappingReader - source: host dest:
host
2012-11-23 12:49:30,124 INFO  solr.SolrMappingReader - source: segment dest:
segment
2012-11-23 12:49:30,124 INFO  solr.SolrMappingReader - source: boost dest:
boost
2012-11-23 12:49:30,124 INFO  solr.SolrMappingReader - source: digest dest:
digest
2012-11-23 12:49:30,124 INFO  solr.SolrMappingReader - source: tstamp dest:
tstamp
2012-11-23 12:49:30,124 INFO  solr.SolrMappingReader - source: url dest: id
2012-11-23 12:49:30,124 INFO  solr.SolrMappingReader - source: url dest: url
2012-11-23 12:49:31,107 INFO  solr.SolrWriter - Indexing 52 documents
2012-11-23 12:49:53,007 ERROR solr.SolrIndexer - java.io.IOException: Job
failed!
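
Since most of the log above is INFO output, here is a minimal sketch in
Python (not part of Nutch) that keeps only the WARN and ERROR entries and
rejoins lines that the mail client wrapped. The function name
problem_entries and the sample lines are hypothetical, chosen for
illustration, and the pattern assumes the log4j layout shown above
(date, time, level, logger, " - ", message).

```python
import re

# Assumed layout, as in the log above:
# "<date> <time>,<ms> <LEVEL> <logger> - <message>"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+)\s+(\S+) - (.*)$"
)


def problem_entries(lines, levels=("WARN", "ERROR")):
    """Return (timestamp, level, logger, message) tuples for the given levels.

    Lines that do not start a new entry (wrapped text, stack traces) are
    appended to the message of the entry before them.
    """
    entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            timestamp, level, logger, message = match.groups()
            entries.append([timestamp, level, logger, message.strip()])
        elif entries and line.strip():
            # Continuation of the previous entry.
            entries[-1][3] += " " + line.strip()
    return [tuple(entry) for entry in entries if entry[1] in levels]


# A few lines copied from the log above, wrapped the same way.
sample = [
    "2012-11-23 12:48:38,129 WARN  util.NativeCodeLoader - Unable to load",
    "native-hadoop library for your platform... using builtin-java classes where",
    "applicable",
    "2012-11-23 12:49:31,107 INFO  solr.SolrWriter - Indexing 52 documents",
    "2012-11-23 12:49:53,007 ERROR solr.SolrIndexer - java.io.IOException: Job",
    "failed!",
]

if __name__ == "__main__":
    for timestamp, level, logger, message in problem_entries(sample):
        print(timestamp, level, logger, message)
```

To run it on the real file, pass an open file object instead of the
sample list, for example problem_entries(open("logs/hadoop.log")), where
the path is an assumption about where the log is stored. On the excerpt
above it leaves only the NativeCodeLoader warning and the final
"Job failed!" error.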