Hi everybody,
I use Nutch 1.5.1 with Solr 3.6.1.
For the past few days, whenever I try to crawl my page, I have been getting this warning.
I don't know what has changed.
Output from hadoop.log
------------------------------------------------------------
2013-03-06 15:42:32,616 WARN mapred.FileOutputCommitter - Output path is null in cleanup
2013-03-06 15:42:32,617 WARN mapred.LocalJobRunner - job_local_0010
java.lang.NullPointerException
at org.apache.hadoop.io.Text.encode(Text.java:388)
at org.apache.hadoop.io.Text.set(Text.java:178)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:270)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
------------------------------------------------------------
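Reading the trace from the top, the exception is thrown inside Text.encode, called from Text.set, called from the record reader in SolrDeleteDuplicates. My guess, and it is only a guess, is that a null string reaches Text.set, presumably because some value read from one of the Solr documents is missing. The following is a minimal standalone sketch of that failure mode and of a null check around it. It does not use Hadoop or Nutch classes; the class NullKeySketch and its methods are hypothetical names made up for illustration, and encode() only imitates the string-to-bytes step.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not Nutch or Hadoop code.
public class NullKeySketch {

    // Stand-in for the string-to-bytes step behind Text.set(String).
    // Throws a NullPointerException when s is null, because
    // toCharArray() is invoked on a null reference.
    static ByteBuffer encode(String s) {
        return StandardCharsets.UTF_8.encode(CharBuffer.wrap(s.toCharArray()));
    }

    // Assumed guard: only values that are present may be encoded.
    static boolean safeToEncode(String value) {
        return value != null;
    }

    public static void main(String[] args) {
        // The second entry imitates a document whose field value is missing.
        String[] fieldValues = {"abc123", null};
        for (String v : fieldValues) {
            if (safeToEncode(v)) {
                System.out.println("encoded " + encode(v).remaining() + " bytes");
            } else {
                System.out.println("skipping document: field value is null");
            }
        }
    }
}
```

If this guess is right, the thing to look for would be documents in the Solr index that lack the value the dedup job uses as its key, but I have not confirmed that.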
Is there a way to get more information in the hadoop log file? My current log4j settings are:
------------------------------------------------------------
log4j.logger.org.apache.nutch.crawl.Crawl=ALL,cmdstdout
log4j.logger.org.apache.nutch.crawl.Injector=INFO,cmdstdout
log4j.logger.org.apache.nutch.crawl.Generator=INFO,cmdstdout
log4j.logger.org.apache.nutch.fetcher.Fetcher=INFO,cmdstdout
log4j.logger.org.apache.nutch.parse.ParseSegment=ALL,cmdstdout
log4j.logger.org.apache.nutch.crawl.CrawlDbReader=INFO,cmdstdout
log4j.logger.org.apache.nutch.crawl.CrawlDbMerger=INFO,cmdstdout
log4j.logger.org.apache.nutch.crawl.LinkDbReader=INFO,cmdstdout
log4j.logger.org.apache.nutch.segment.SegmentReader=INFO,cmdstdout
log4j.logger.org.apache.nutch.segment.SegmentMerger=INFO,cmdstdout
log4j.logger.org.apache.nutch.crawl.CrawlDb=ALL,cmdstdout
log4j.logger.org.apache.nutch.crawl.LinkDb=ALL,cmdstdout
log4j.logger.org.apache.nutch.crawl.LinkDbMerger=ALL,cmdstdout
log4j.logger.org.apache.nutch.indexer.solr.SolrIndexer=INFO,cmdstdout
log4j.logger.org.apache.nutch.indexer.solr.SolrWriter=INFO,cmdstdout
log4j.logger.org.apache.nutch.indexer.solr.SolrDeleteDuplicates=INFO,cmdstdout
log4j.logger.org.apache.nutch.indexer.solr.SolrClean=INFO,cmdstdout
log4j.logger.org.apache.nutch.scoring.webgraph.WebGraph=INFO,cmdstdout
log4j.logger.org.apache.nutch.scoring.webgraph.LinkRank=INFO,cmdstdout
log4j.logger.org.apache.nutch.scoring.webgraph.Loops=INFO,cmdstdout
log4j.logger.org.apache.nutch.scoring.webgraph.ScoreUpdater=INFO,cmdstdout
log4j.logger.org.apache.nutch.parse.ParserChecker=ALL,cmdstdout
log4j.logger.org.apache.nutch.indexer.IndexingFiltersChecker=INFO,cmdstdout
log4j.logger.org.apache.nutch.tools.FreeGenerator=INFO,cmdstdout
log4j.logger.org.apache.nutch.util.domain.DomainStatistics=INFO,cmdstdout
log4j.logger.org.apache.nutch.tools.CrawlDBScanner=INFO,cmdstdout
log4j.logger.org.apache.nutch.parse.ParserJob=ALL,cmdstdout
log4j.logger.org.apache.nutch.parse.ParserUtil=ALL,cmdstdout
log4j.logger.org.apache.hadoop.mapred.FileOutputCommitter=ALL,cmdstdout
log4j.logger.org.apache.hadoop.mapred.JobClient=ALL,cmdstdout
log4j.logger.org.apache.hadoop.mapred.LocalJobRunner=ALL,cmdstdout
log4j.logger.org.apache.nutch=ALL
log4j.logger.org.apache.hadoop=ALL
------------------------------------------------------------
I would appreciate an answer.
Thanks
Marcel.
--
View this message in context:
http://lucene.472066.n3.nabble.com/mapred-FileOutputCommitter-Output-path-is-null-in-cleanup-tp4045238.html
Sent from the Nutch - User mailing list archive at Nabble.com.