Can anyone help me here? Or, am I asking in the wrong place?
On Apr 14, 2011, at 9:57 PM, Alex wrote:
Hi,
I am new to Nutch. I have an application that uses Nutch for
search, and I have configured it so that Nutch can run.
However, after a lot of troubleshooting I have been told that
something is wrong with my hosts file. My hostname
is different from my domain name, and that "seems" to make Nutch stop
at depth 1. Does anyone know the correct
configuration of the hosts file so that Nutch runs properly?
My domain name resolves fine. Please help me!
Here are the logs of the indexing:
Stopping at depth=1 - no more URLs to fetch.
INFO sitesearch.CrawlerUtil: indexHost : Starting an Site Search index on host www.mydomain.com
INFO sitesearch.CrawlerUtil: site search crawl started in: /path/to/search_index/www.mydomain.com/1-XXX_temp/crawl-index
INFO sitesearch.CrawlerUtil: rootUrlDir = /path/to/directory/search_index/www.mydomain.com/url_folder
INFO sitesearch.CrawlerUtil: threads = 10
INFO sitesearch.CrawlerUtil: depth = 20
INFO sitesearch.CrawlerUtil: indexer=lucene
INFO sitesearch.CrawlerUtil: Stopping at depth=1 - no more URLs to fetch.
INFO sitesearch.CrawlerUtil: site search crawl finished: /directorypath/search_index/www.mydomain.com/1xxx/crawl-index
INFO sitesearch.CrawlerUtil: indexHost : Finished Site Search index on host www.mydomain.com
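
To make the hostname versus domain name question above concrete, here is a minimal Python sketch that resolves both names and reports whether they map to the same address. It is only a diagnostic aid, not anything Nutch provides: the function name compare_names and its resolve parameter are made up for illustration, and www.mydomain.com is the placeholder from the logs. It does not show whether Nutch is affected by a mismatch; it only shows what each name resolves to on the machine where it runs.

```python
import socket


def compare_names(hostname, domain, resolve=socket.gethostbyname):
    """Resolve both names and report whether they share one IPv4 address.

    `resolve` is injectable (hypothetical parameter) so the check can be
    exercised without touching the real resolver or the hosts file.
    """
    results = {}
    for name in (hostname, domain):
        try:
            results[name] = resolve(name)
        except OSError:
            # socket.gaierror and socket.herror are subclasses of OSError;
            # a name that cannot be resolved is recorded as None.
            results[name] = None
    same = results[hostname] is not None and results[hostname] == results[domain]
    return results, same
```

Calling compare_names(socket.gethostname(), "www.mydomain.com") on the crawl machine returns the address found for each name (or None if the lookup failed) and a flag saying whether the two agree. On many systems an entry in the hosts file takes precedence over DNS, so a mismatch here can point at the hosts file entry for the machine's own hostname.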