Hi, all:
I am still getting this error. I have two instances of the application
installed, and the only difference between them is the amount of data.
The one with less data crawls perfectly; the one with more data does
not. Both show the same log output at the end, except that the one with
more data also shows "Stopping at depth=1 - no more URLs to fetch".
Do I need to change the Nutch settings for large sites?
Thanks,
Alex
------------
Here are the logs of the indexing:
Stopping at depth=1 - no more URLs to fetch.
INFO sitesearch.CrawlerUtil: indexHost : Starting an Site Search index on host www.mydomain.com
INFO sitesearch.CrawlerUtil: site search crawl started in: /opt/dotcms/dotCMS/assets/search_index/www.mydomain.com/1-XXX_temp/crawl-index
INFO sitesearch.CrawlerUtil: rootUrlDir = /path/to/directory/search_index/www.mydomain.com/url_folder
INFO sitesearch.CrawlerUtil: threads = 10
INFO sitesearch.CrawlerUtil: depth = 20
INFO sitesearch.CrawlerUtil: indexer=lucene
INFO sitesearch.CrawlerUtil: Stopping at depth=1 - no more URLs to fetch.
INFO sitesearch.CrawlerUtil: site search crawl finished: /directorypath/search_index/www.mydomain.com/1xxx/crawl-index
INFO sitesearch.CrawlerUtil: indexHost : Finished Site Search index on host www.mydomain.com
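------------
In case it helps to show what I mean by "settings": this is the kind of nutch-site.xml override I was wondering about. The property names come from Nutch's standard nutch-default.xml; the values below are only guesses on my part, not something I have tested:

```xml
<!-- Sketch of possible nutch-site.xml overrides (untested guesses). -->
<configuration>
  <property>
    <name>http.content.limit</name>
    <value>-1</value>
    <description>Default is 65536 bytes; larger pages are truncated,
    which can cut off outlinks before they are parsed. -1 removes
    the limit.</description>
  </property>
  <property>
    <name>db.max.outlinks.per.page</name>
    <value>-1</value>
    <description>Default is 100; outlinks beyond that on link-heavy
    pages are dropped. -1 keeps them all.</description>
  </property>
  <property>
    <name>db.ignore.external.links</name>
    <value>true</value>
    <description>Keep the crawl restricted to the start host instead
    of relying only on URL filters.</description>
  </property>
</configuration>
```

I am also aware the regex-urlfilter.txt rules could be rejecting URLs on the larger site, which would produce the same "no more URLs to fetch" symptom, but I have not confirmed either cause.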