The logs say this:
>> Generator: 0 records selected for fetching, exiting ...
This is because there are no URLs that the generator could select to form a
segment.
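
If the crawldb is empty (or every URL in it is not yet due for refetching),
the generator has nothing to select. A quick way to check, assuming your
crawldb sits next to the segments directory (adjust the path to your setup):

  bin/nutch readdb /opt/searchengine/nutch/BappenasCrawl/crawldb -stats

If "TOTAL urls" comes back as 0, the injection step is the real problem,
which brings us to the next log line.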

>> Injector: total number of urls injected after normalization and
filtering: 0
The injector did NOT add anything to the crawldb. Check whether you are
over-filtering the input URLs. It would also be worth verifying that the
URLs you are injecting are valid. From the logs it looks like there were
just 4 URLs in the seeds file.
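
If you want to see which seed URLs are being dropped, Nutch 1.x ships a
filter checker you can run against your configured URL filters (I am writing
the seed path from memory, so substitute your own):

  cat urls/seed.txt | bin/nutch org.apache.nutch.net.URLFilterChecker -allCombined

If I remember the output right, rejected URLs are printed with a leading '-'
and accepted ones with a '+'; the usual culprit is conf/regex-urlfilter.txt.
Once the seeds pass the filters, the normal cycle should start producing new
segments again, e.g.:

  bin/nutch inject crawldb urls
  bin/nutch generate crawldb segments -topN 1000
  bin/nutch fetch segments/<newly_created_segment>
  bin/nutch parse segments/<newly_created_segment>
  bin/nutch updatedb crawldb segments/<newly_created_segment>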

Thanks,
Tejas


On Fri, Feb 14, 2014 at 4:43 PM, Bayu Widyasanyata
<[email protected]> wrote:

> Hi,
>
> From what I know, "nutch generate" will create a new segment directory
> every round Nutch runs.
>
> I have a problem (it has never happened before) where Nutch won't create a
> new segment. It only ever fetches and parses the latest segment.
> From the logs:
> 2014-02-15 07:20:02,036 INFO  fetcher.Fetcher - Fetcher: segment:
> /opt/searchengine/nutch/BappenasCrawl/segments/20140205213835
>
> This happens even though I repeat the steps (generate > fetch > parse >
> update) many times.
>
> What should I check in the Nutch configuration? Any hints for solving this
> problem would be appreciated.
>
> I use Nutch 1.7.
>
> And here is the part of hadoop log file:
> http://pastebin.com/kpi48gK6
>
> Thank you.
>
> --
> wassalam,
> [bayu]
>
