From your logs:
INFO sitesearch.CrawlerUtil: rootUrlDir = /path/to/directory/

Looks like you didn't set the seed urls directory. If that's not enough info for you to fix it, send the full command you're running.

-Mark

On Thu, Apr 14, 2011 at 10:57 PM, Alex <[email protected]> wrote:
> Hi,
>
> I am new to Nutch. I have an application that uses Nutch to search.
> I have configured the application so that Nutch can run. However,
> after a lot of troubleshooting I have been pointed to the fact that
> there is something wrong with my hosts file. My hostname is different
> than my domain name and that "seems" to make Nutch stop in depth 1.
> Does anyone have any idea of what is the correct configuration of the
> hosts file so that nutch runs properly?
>
> My domain name resolves fine. Please help me!
>
> Here are the logs of the indexing:
>
> Stopping at depth=1 - no more URLs to fetch.
>
> INFO sitesearch.CrawlerUtil: indexHost : Starting an Site Search
> index on host www.mydomain.com
> INFO sitesearch.CrawlerUtil: site search crawl started in: /opt/dotcms/
> dotCMS/assets/search_index/www.mydomain.com/1-XXX_temp/crawl-index
> ] INFO sitesearch.CrawlerUtil: rootUrlDir = /path/to/directory/
> search_index/www.mydomain.com/url_folder
> INFO sitesearch.CrawlerUtil: threads = 10
> INFO sitesearch.CrawlerUtil: depth = 20
> INFO sitesearch.CrawlerUtil: indexer=lucene
>
> INFO sitesearch.CrawlerUtil: Stopping at depth=1 - no more URLs to
> fetch.
> NFO sitesearch.CrawlerUtil: site search crawl finished: /directorypath/
> search_index/www.mydomain.com/1xxx/crawl-index
> INFO sitesearch.CrawlerUtil: indexHost : Finished Site Search index on
> host www.mydomain.com
>
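One quick way to check the seed directory before re-running the crawl is a small script like the one below. This is only a sketch under assumptions: the function name `seed_urls` is made up for illustration, and it assumes the seed directory holds plain-text files with one URL per line, with blank lines and lines starting with `#` ignored. It does not call Nutch or dotCMS at all; it only reads files.

```python
from pathlib import Path


def seed_urls(seed_dir):
    """Return the URLs found in plain-text files directly inside seed_dir.

    Hypothetical helper: assumes one URL per line, and skips blank lines
    and lines beginning with '#'. Returns an empty list if the directory
    does not exist.
    """
    root = Path(seed_dir)
    if not root.is_dir():
        return []

    urls = []
    for path in sorted(root.iterdir()):
        if not path.is_file():
            continue  # only look at files at the top level of the directory
        text = path.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                urls.append(line)
    return urls


if __name__ == "__main__":
    import sys

    # Pass the value logged as rootUrlDir as the first argument.
    found = seed_urls(sys.argv[1]) if len(sys.argv) > 1 else []
    print(f"{len(found)} seed URL(s) found")
    for url in found:
        print(url)
```

If this reports zero seed URLs for the directory shown as rootUrlDir in the logs, that would be consistent with the crawl stopping at depth 1 with no more URLs to fetch.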

