From your logs:

INFO sitesearch.CrawlerUtil: rootUrlDir = /path/to/directory/


Looks like you didn't set the seed URLs directory.  If that's not enough
info for you to fix it, send the full command you're running.
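
For reference, Nutch's injector expects the seed directory to contain one or
more plain text files with one URL per line.  Below is a rough sketch of
creating such a directory and checking that it actually holds seeds.  The
helper names (write_seed_dir, list_seeds), the file name seed.txt, and the
temporary directory are placeholders I made up for illustration; I don't know
how your application builds its url_folder, so treat this as a sanity check
and not as what it does internally.

```python
import tempfile
from pathlib import Path


def write_seed_dir(seed_dir: Path, urls: list[str]) -> Path:
    """Create seed_dir if needed and write one URL per line to seed.txt.

    The file name is arbitrary (an assumption for this sketch); the
    injector reads the text files it finds in the directory.
    """
    seed_dir.mkdir(parents=True, exist_ok=True)
    seed_file = seed_dir / "seed.txt"
    seed_file.write_text("\n".join(urls) + "\n", encoding="utf-8")
    return seed_file


def list_seeds(seed_dir: Path) -> list[str]:
    """Return every non-blank line found in the files directly under seed_dir.

    An empty result means a crawl started from this directory has nothing
    to fetch.
    """
    seeds: list[str] = []
    if not seed_dir.is_dir():
        return seeds
    for path in sorted(seed_dir.iterdir()):
        if path.is_file():
            for line in path.read_text(encoding="utf-8").splitlines():
                line = line.strip()
                if line:
                    seeds.append(line)
    return seeds


if __name__ == "__main__":
    # Hypothetical location; substitute the rootUrlDir from your own logs.
    seed_dir = Path(tempfile.mkdtemp()) / "url_folder"
    print("before:", list_seeds(seed_dir))
    write_seed_dir(seed_dir, ["http://www.mydomain.com/"])
    print("after:", list_seeds(seed_dir))
```

If list_seeds comes back empty for the directory shown as rootUrlDir in your
logs, that would be consistent with the crawl stopping at depth 1.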

-Mark



On Thu, Apr 14, 2011 at 10:57 PM, Alex <[email protected]> wrote:

> Hi,
>
> I am new to Nutch.  I have an application that uses Nutch to search.
> I have configured the application so that Nutch can run.  However,
> after a lot of troubleshooting I have been told that there is
> something wrong with my hosts file.  My hostname is different from
> my domain name, and that "seems" to make Nutch stop at depth 1.
> Does anyone know the correct configuration of the hosts file so
> that Nutch runs properly?
>
> My domain name resolves fine.  Please help me!
>
> Here are the logs of the indexing:
>
> Stopping at depth=1 - no more URLs to fetch.
>
>  INFO sitesearch.CrawlerUtil: indexHost : Starting an Site Search
> index on host www.mydomain.com
> INFO sitesearch.CrawlerUtil: site search crawl started in: /opt/dotcms/
> dotCMS/assets/search_index/www.mydomain.com/1-XXX_temp/crawl-index
> INFO sitesearch.CrawlerUtil: rootUrlDir = /path/to/directory/
> search_index/www.mydomain.com/url_folder
> INFO sitesearch.CrawlerUtil: threads = 10
>  INFO sitesearch.CrawlerUtil: depth = 20
> INFO sitesearch.CrawlerUtil: indexer=lucene
>
> INFO sitesearch.CrawlerUtil: Stopping at depth=1 - no more URLs to
> fetch.
> INFO sitesearch.CrawlerUtil: site search crawl finished: /directorypath/
> search_index/www.mydomain.com/1xxx/crawl-index
> INFO sitesearch.CrawlerUtil: indexHost : Finished Site Search index on
> host www.mydomain.com
>
