Hi,

I have a new problem.

When I run the recrawl script, it works fine the first time.

When I run it again, it fails.

See the attached log file for the error:
http://www.nabble.com/file/p19493344/hadoop.log hadoop.log 

Please suggest a solution.

Thanks in advance.

Regards,
Chetan Patel



Chetan Patel wrote:
> 
> Hi Doğacan Güney,
> 
> Thanks for the solution.
> 
> Is it possible to recrawl without removing files?
> 
> Thank you again.
> 
> Regards,
> Chetan Patel
> 
> 
> Doğacan Güney-3 wrote:
>> 
>> Hi,
>> 
>> On Mon, Sep 15, 2008 at 2:43 PM, Chetan Patel <[EMAIL PROTECTED]>
>> wrote:
>>>
>>> Hi,
>>>
>>> I tried the recrawl script that is available at
>>> http://wiki.apache.org/nutch/IntranetRecrawl.
>>>
>>> I got the following error:
>>>
>>> 2008-09-15 17:04:32,238 INFO  fetcher.Fetcher - Fetcher: starting
>>> 2008-09-15 17:04:32,254 INFO  fetcher.Fetcher - Fetcher: segment: google/segments/20080915170335
>>> 2008-09-15 17:04:32,972 FATAL fetcher.Fetcher - Fetcher: java.io.IOException: Segment already fetched!
>>>        at org.apache.nutch.fetcher.FetcherOutputFormat.checkOutputSpecs(FetcherOutputFormat.java:46)
>>>        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:329)
>>>        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:543)
>>>        at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:470)
>>>        at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:505)
>>>        at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
>>>        at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:477)
>>>
>>> 2008-09-15 17:04:35,144 INFO  crawl.CrawlDb - CrawlDb update: starting
>>>
>>> Please help me solve this error.
>>>
>> 
>> The segment you are trying to fetch has already been fetched. Try
>> removing everything but crawl_generate under that segment.
>> 
>>> Thanks in advance
>>>
>>> Regards,
>>> Chetan Patel
>>>
>>>
>>>
>> 
>> 
>> 
>> 
>> -- 
>> Doğacan Güney
>> 
>> 
> 
> 

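The cleanup suggested in the quoted reply (remove everything but
crawl_generate under the segment) could be sketched roughly as below. This
is only an illustration under stated assumptions: the helper name
reset_segment is hypothetical and not part of Nutch, and the segment is
assumed to live on the local filesystem rather than in HDFS.

```python
import shutil
from pathlib import Path


def reset_segment(segment_dir, keep="crawl_generate"):
    """Delete every entry in a segment directory except the one named `keep`.

    Hypothetical helper (not a Nutch API). Assumes a local filesystem path.
    Returns the sorted names of the entries that were removed.
    """
    segment = Path(segment_dir)
    if not (segment / keep).exists():
        # Refuse to touch a directory that does not look like a generated segment.
        raise FileNotFoundError(f"{keep} not found under {segment}")

    removed = []
    for entry in segment.iterdir():
        if entry.name == keep:
            continue  # the generated fetch list is what the fetcher needs
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # e.g. crawl_fetch, content
        else:
            entry.unlink()
        removed.append(entry.name)
    return sorted(removed)
```

With the segment from the log above, the call would be
reset_segment("google/segments/20080915170335"). The helper refuses to run
when crawl_generate is missing, so a mistyped path does not get emptied.
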
-- 
View this message in context: 
http://www.nabble.com/hadoop-dfs--ls-and-nutch-generate-fetch-commands-tp16758617p19493344.html
Sent from the Nutch - User mailing list archive at Nabble.com.