I still can't solve this.

jibjoice wrote:
> 
> I followed this link "http://wiki.apache.org/nutch/NutchHadoopTutorial", so I
> don't think the problem is in the conf/crawl-urlfilter.txt file. When I run the
> command "bin/nutch crawl urls -dir crawled -depth 3" again, it shows:
> Generator: Selecting best-scoring urls due for fetch.
> Generator: starting
> Generator: segment: crawled/segments/25501221110712
> Generator: filtering: false
> Generator: topN: 2147483647
> Generator: Partitioning selected urls by host, for politeness.
> Generator: done.
> Fetcher: starting
> Fetcher: segment: crawled/segments/25501221110712
> Fetcher: done
> CrawlDb update: starting
> CrawlDb update: db: crawled/crawldb
> CrawlDb update: segments: [crawled/segments/25501221110712]
> CrawlDb update: additions allowed: true
> CrawlDb update: URL normalizing: true
> CrawlDb update: URL filtering: true
> CrawlDb update: Merging segment data into db.
> CrawlDb update: done
> Generator: Selecting best-scoring urls due for fetch.
> Generator: starting
> Generator: segment: crawled/segments/25501221110908
> Generator: filtering: false
> Generator: topN: 2147483647
> Generator: Partitioning selected urls by host, for politeness.
> Generator: done.
> Fetcher: starting
> Fetcher: segment: crawled/segments/25501221110908
> Fetcher: done
> CrawlDb update: starting
> CrawlDb update: db: crawled/crawldb
> CrawlDb update: segments: [crawled/segments/25501221110908]
> CrawlDb update: additions allowed: true
> CrawlDb update: URL normalizing: true
> CrawlDb update: URL filtering: true
> CrawlDb update: Merging segment data into db.
> CrawlDb update: done
> LinkDb: starting
> LinkDb: linkdb: crawled/linkdb
> LinkDb: URL normalize: true
> LinkDb: URL filter: true
> LinkDb: adding segment: /user/nutch/crawled/segments/25501221110519
> LinkDb: adding segment: /user/nutch/crawled/segments/25501221110712
> LinkDb: adding segment: /user/nutch/crawled/segments/25501221110908
> LinkDb: done
> Indexer: starting
> Indexer: linkdb: crawled/linkdb
> Indexer: adding segment: /user/nutch/crawled/segments/25501221110519
> Indexer: adding segment: /user/nutch/crawled/segments/25501221110712
> Indexer: adding segment: /user/nutch/crawled/segments/25501221110908
> Indexer: done
> Dedup: starting
> Dedup: adding indexes in: crawled/indexes
> task_0017_m_000000_0: log4j:ERROR Either File or DatePattern options are
> not set for appender [DRFA].
> task_0017_m_000001_0: log4j:ERROR setFile(null,true) call failed.
> task_0017_m_000001_0: java.io.FileNotFoundException: /nutch/search/logs
> (Is a directory)
> task_0017_m_000001_0:   at java.io.FileOutputStream.openAppend(Native
> Method)
> task_0017_m_000001_0:   at
> java.io.FileOutputStream.<init>(FileOutputStream.java:177)
> task_0017_m_000001_0:   at
> java.io.FileOutputStream.<init>(FileOutputStream.java:102)
> task_0017_m_000001_0:   at
> org.apache.log4j.FileAppender.setFile(FileAppender.java:289)
> task_0017_m_000001_0:   at
> org.apache.log4j.FileAppender.activateOptions(FileAppender.java:163)
> task_0017_m_000001_0:   at
> org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:215)
> task_0017_m_000001_0:   at
> org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:256)
> task_0017_m_000001_0:   at
> org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:132)
> task_0017_m_000001_0:   at
> org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:96)
> task_0017_m_000001_0:   at
> org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:654)
> task_0017_m_000001_0:   at
> org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:612)
> task_0017_m_000001_0:   at
> org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:509)
> task_0017_m_000001_0:   at
> org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:415)
> task_0017_m_000001_0:   at
> org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:441)
> task_0017_m_000001_0:   at
> org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:468)
> task_0017_m_000001_0:   at
> org.apache.log4j.LogManager.<clinit>(LogManager.java:122)
> task_0017_m_000001_0:   at
> org.apache.log4j.Logger.getLogger(Logger.java:104)
> task_0017_m_000001_0:   at
> org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:229)
> task_0017_m_000001_0:   at
> org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:65)
> task_0017_m_000001_0:   at
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> task_0017_m_000001_0:   at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> task_0017_m_000001_0:   at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> task_0017_m_000001_0:   at
> java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> task_0017_m_000001_0:   at
> org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:529)
> task_0017_m_000001_0:   at
> org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235)
> task_0017_m_000001_0:   at
> org.apache.commons.logging.LogFactory.getLog(LogFactory.java:370)
> task_0017_m_000001_0:   at
> org.apache.hadoop.mapred.TaskTracker.<clinit>(TaskTracker.java:82)
> task_0017_m_000001_0:   at
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1423)
> task_0017_m_000001_0: log4j:ERROR Either File or DatePattern options are
> not set for appender [DRFA].
> task_0017_m_000000_1: log4j:ERROR setFile(null,true) call failed.
> task_0017_m_000000_1: java.io.FileNotFoundException: /nutch/search/logs
> (Is a directory)
> task_0017_m_000000_1:   at java.io.FileOutputStream.openAppend(Native
> Method)
> task_0017_m_000000_1:   at
> java.io.FileOutputStream.<init>(FileOutputStream.java:177)
> task_0017_m_000000_1:   at
> java.io.FileOutputStream.<init>(FileOutputStream.java:102)
> task_0017_m_000000_1:   at
> org.apache.log4j.FileAppender.setFile(FileAppender.java:289)
> task_0017_m_000000_1:   at
> org.apache.log4j.FileAppender.activateOptions(FileAppender.java:163)
> task_0017_m_000000_1:   at
> org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:215)
> task_0017_m_000000_1:   at
> org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:256)
> task_0017_m_000000_1:   at
> org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:132)
> task_0017_m_000000_1:   at
> org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:96)
> task_0017_m_000000_1:   at
> org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:654)
> task_0017_m_000000_1:   at
> org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:612)
> task_0017_m_000000_1:   at
> org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:509)
> task_0017_m_000000_1:   at
> org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:415)
> task_0017_m_000000_1:   at
> org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:441)
> task_0017_m_000000_1:   at
> org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:468)
> task_0017_m_000000_1:   at
> org.apache.log4j.LogManager.<clinit>(LogManager.java:122)
> task_0017_m_000000_1:   at
> org.apache.log4j.Logger.getLogger(Logger.java:104)
> task_0017_m_000000_1:   at
> org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:229)
> task_0017_m_000000_1:   at
> org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:65)
> task_0017_m_000000_1:   at
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> task_0017_m_000000_1:   at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> task_0017_m_000000_1:   at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> task_0017_m_000000_1:   at
> java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> task_0017_m_000000_1:   at
> org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:529)
> task_0017_m_000000_1:   at
> org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235)
> task_0017_m_000000_3:   at
> org.apache.commons.logging.LogFactory.getLog(LogFactory.java:370)
> task_0017_m_000000_3:   at
> org.apache.hadoop.mapred.TaskTracker.<clinit>(TaskTracker.java:82)
> task_0017_m_000000_3:   at
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1423)
> task_0017_m_000000_3: log4j:ERROR Either File or DatePattern options are
> not set for appender [DRFA].
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
>         at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
>         at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
> 
> I don't know what is happening here.
> 
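For reference, the [DRFA] appender named in those log4j errors is the DailyRollingFileAppender that Hadoop and Nutch set up in conf/log4j.properties. A sketch of the stock entries, whose File option is assembled from the hadoop.log.dir and hadoop.log.file system properties, looks roughly like this (exact values vary per install):

  log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
  log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
  log4j.appender.DRFA.DatePattern=.yyyy-MM-dd

If hadoop.log.file is not set in the child task's JVM, File resolves to the log directory itself, which would explain the "/nutch/search/logs (Is a directory)" FileNotFoundException above. That looks like a logging-setup symptom in the task children rather than the cause of the failed dedup job itself.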
> 
> pvvpr wrote:
>> 
>> I think you need to check the conf/crawl-urlfilter.txt file
>> 
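For reference, the stock conf/crawl-urlfilter.txt ships with only a MY.DOMAIN.NAME placeholder in its accept rule, so every URL is likely rejected until that line is edited. A minimal sketch for crawling a single site might look like this (apache.org is just a placeholder domain):

  # skip file:, ftp:, and mailto: urls
  -^(file|ftp|mailto):
  # accept anything within the seed domain (replace with your own domain)
  +^http://([a-z0-9]*\.)*apache.org/
  # reject everything else
  -.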
>> On Thursday 20 December 2007 04:55, jibjoice wrote:
>>> Please help me solve this.
>>>
>>> jibjoice wrote:
>>> > Where should I fix this? Why did it generate 0 records?
>>> >
>>> > pvvpr wrote:
>>> >> Basically your indexes are empty, since no URLs were generated and
>>> >> fetched. See this:
>>> >>
>>> >>> > - Generator: 0 records selected for fetching, exiting ...
>>> >>> > - Stopping at depth=0 - no more URLs to fetch.
>>> >>> > - No URLs to fetch - check your seed list and URL filters.
>>> >>> > - crawl finished: crawled
>>> >>
>>> >> When no pages are indexed, dedup throws an exception.
>>> >>
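For reference, one way to confirm this before dedup runs is to check the crawldb statistics and list the index directory; with the -dir crawled layout used above, the commands would be roughly:

  bin/nutch readdb crawled/crawldb -stats
  bin/hadoop dfs -ls crawled/indexes

If the stats show no fetched pages, or crawled/indexes is empty, the dedup step has nothing to read and fails as described here.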
>>> >> On Tuesday 18 December 2007 21:33, jibjoice wrote:
>>> >>> I still can't solve it; please help me.
>>> >>>
>>> >>> jibjoice wrote:
>>> >>> > I use Nutch 0.9 and Hadoop 0.12.2. When I run the command "bin/nutch
>>> >>> > crawl urls -dir crawled -depth 3", I get this error:
>>> >>> >
>>> >>> > - crawl started in: crawled
>>> >>> > - rootUrlDir = input
>>> >>> > - threads = 10
>>> >>> > - depth = 3
>>> >>> > - Injector: starting
>>> >>> > - Injector: crawlDb: crawled/crawldb
>>> >>> > - Injector: urlDir: input
>>> >>> > - Injector: Converting injected urls to crawl db entries.
>>> >>> > - Total input paths to process : 1
>>> >>> > - Running job: job_0001
>>> >>> > - map 0% reduce 0%
>>> >>> > - map 100% reduce 0%
>>> >>> > - map 100% reduce 100%
>>> >>> > - Job complete: job_0001
>>> >>> > - Counters: 6
>>> >>> > - Map-Reduce Framework
>>> >>> > - Map input records=3
>>> >>> > - Map output records=1
>>> >>> > - Map input bytes=22
>>> >>> > - Map output bytes=52
>>> >>> > - Reduce input records=1
>>> >>> > - Reduce output records=1
>>> >>> > - Injector: Merging injected urls into crawl db.
>>> >>> > - Total input paths to process : 2
>>> >>> > - Running job: job_0002
>>> >>> > - map 0% reduce 0%
>>> >>> > - map 100% reduce 0%
>>> >>> > - map 100% reduce 58%
>>> >>> > - map 100% reduce 100%
>>> >>> > - Job complete: job_0002
>>> >>> > - Counters: 6
>>> >>> > - Map-Reduce Framework
>>> >>> > - Map input records=3
>>> >>> > - Map output records=1
>>> >>> > - Map input bytes=60
>>> >>> > - Map output bytes=52
>>> >>> > - Reduce input records=1
>>> >>> > - Reduce output records=1
>>> >>> > - Injector: done
>>> >>> > - Generator: Selecting best-scoring urls due for fetch.
>>> >>> > - Generator: starting
>>> >>> > - Generator: segment: crawled/segments/25501213164325
>>> >>> > - Generator: filtering: false
>>> >>> > - Generator: topN: 2147483647
>>> >>> > - Total input paths to process : 2
>>> >>> > - Running job: job_0003
>>> >>> > - map 0% reduce 0%
>>> >>> > - map 100% reduce 0%
>>> >>> > - map 100% reduce 100%
>>> >>> > - Job complete: job_0003
>>> >>> > - Counters: 6
>>> >>> > - Map-Reduce Framework
>>> >>> > - Map input records=3
>>> >>> > - Map output records=1
>>> >>> > - Map input bytes=59
>>> >>> > - Map output bytes=77
>>> >>> > - Reduce input records=1
>>> >>> > - Reduce output records=1
>>> >>> > - Generator: 0 records selected for fetching, exiting ...
>>> >>> > - Stopping at depth=0 - no more URLs to fetch.
>>> >>> > - No URLs to fetch - check your seed list and URL filters.
>>> >>> > - crawl finished: crawled
>>> >>> >
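For reference, "check your seed list" refers to the directory passed to the crawl command (urls in the command above, although this log shows rootUrlDir = input); it should contain at least one plain-text file with one URL per line, and each URL has to pass the crawl-urlfilter.txt rules. A minimal sketch of such a file, e.g. urls/seed.txt (placeholder name and URL):

  http://lucene.apache.org/nutch/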
>>> >>> > But sometimes, when I crawl some URLs, it fails at indexing time with:
>>> >>> >
>>> >>> > - Indexer: done
>>> >>> > - Dedup: starting
>>> >>> > - Dedup: adding indexes in: crawled/indexes
>>> >>> > - Total input paths to process : 2
>>> >>> > - Running job: job_0025
>>> >>> > - map 0% reduce 0%
>>> >>> > - Task Id : task_0025_m_000001_0, Status : FAILED
>>> >>> > task_0025_m_000001_0: - Error running child
>>> >>> > task_0025_m_000001_0: java.lang.ArrayIndexOutOfBoundsException: -1
>>> >>> > task_0025_m_000001_0: at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
>>> >>> > task_0025_m_000001_0: at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
>>> >>> > task_0025_m_000001_0: at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
>>> >>> > task_0025_m_000001_0: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>>> >>> > task_0025_m_000001_0: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
>>> >>> > task_0025_m_000001_0: at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
>>> >>> > - Task Id : task_0025_m_000000_0, Status : FAILED
>>> >>> > task_0025_m_000000_0: - Error running child
>>> >>> > task_0025_m_000000_0: java.lang.ArrayIndexOutOfBoundsException: -1
>>> >>> > task_0025_m_000000_0: at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
>>> >>> > task_0025_m_000000_0: at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
>>> >>> > task_0025_m_000000_0: at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
>>> >>> > task_0025_m_000000_0: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>>> >>> > task_0025_m_000000_0: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
>>> >>> > task_0025_m_000000_0: at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
>>> >>> > - Task Id : task_0025_m_000000_1, Status : FAILED
>>> >>> > task_0025_m_000000_1: - Error running child
>>> >>> > task_0025_m_000000_1: java.lang.ArrayIndexOutOfBoundsException: -1
>>> >>> > task_0025_m_000000_1: at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
>>> >>> > task_0025_m_000000_1: at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
>>> >>> > task_0025_m_000000_1: at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
>>> >>> > task_0025_m_000000_1: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>>> >>> > task_0025_m_000000_1: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
>>> >>> > task_0025_m_000000_1: at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
>>> >>> > - Task Id : task_0025_m_000001_1, Status : FAILED
>>> >>> > task_0025_m_000001_1: - Error running child
>>> >>> > task_0025_m_000001_1: java.lang.ArrayIndexOutOfBoundsException: -1
>>> >>> > task_0025_m_000001_1: at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
>>> >>> > task_0025_m_000001_1: at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
>>> >>> > task_0025_m_000001_1: at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
>>> >>> > task_0025_m_000001_1: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>>> >>> > task_0025_m_000001_1: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
>>> >>> > task_0025_m_000001_1: at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
>>> >>> > - Task Id : task_0025_m_000001_2, Status : FAILED
>>> >>> > task_0025_m_000001_2: - Error running child
>>> >>> > task_0025_m_000001_2: java.lang.ArrayIndexOutOfBoundsException: -1
>>> >>> > task_0025_m_000001_2: at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
>>> >>> > task_0025_m_000001_2: at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
>>> >>> > task_0025_m_000001_2: at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
>>> >>> > task_0025_m_000001_2: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>>> >>> > task_0025_m_000001_2: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
>>> >>> > task_0025_m_000001_2: at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
>>> >>> > - Task Id : task_0025_m_000000_2, Status : FAILED
>>> >>> > task_0025_m_000000_2: - Error running child
>>> >>> > task_0025_m_000000_2: java.lang.ArrayIndexOutOfBoundsException: -1
>>> >>> > task_0025_m_000000_2: at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
>>> >>> > task_0025_m_000000_2: at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
>>> >>> > task_0025_m_000000_2: at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
>>> >>> > task_0025_m_000000_2: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>>> >>> > task_0025_m_000000_2: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
>>> >>> > task_0025_m_000000_2: at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
>>> >>> > - map 100% reduce 100%
>>> >>> > - Task Id : task_0025_m_000001_3, Status : FAILED
>>> >>> > task_0025_m_000001_3: - Error running child
>>> >>> > task_0025_m_000001_3: java.lang.ArrayIndexOutOfBoundsException: -1
>>> >>> > task_0025_m_000001_3: at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
>>> >>> > task_0025_m_000001_3: at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
>>> >>> > task_0025_m_000001_3: at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
>>> >>> > task_0025_m_000001_3: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>>> >>> > task_0025_m_000001_3: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
>>> >>> > task_0025_m_000001_3: at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
>>> >>> > - Task Id : task_0025_m_000000_3, Status : FAILED
>>> >>> > task_0025_m_000000_3: - Error running child
>>> >>> > task_0025_m_000000_3: java.lang.ArrayIndexOutOfBoundsException: -1
>>> >>> > task_0025_m_000000_3: at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
>>> >>> > task_0025_m_000000_3: at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
>>> >>> > task_0025_m_000000_3: at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
>>> >>> > task_0025_m_000000_3: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>>> >>> > task_0025_m_000000_3: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
>>> >>> > task_0025_m_000000_3: at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
>>> >>> > Exception in thread "main" java.io.IOException: Job failed!
>>> >>> > at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
>>> >>> > at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
>>> >>> > at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
>>> >>> >
>>> >>> > How do I solve this?
