Hello Group,
I already have a crawler, and I now want to move to Nutch. I only want to use
Nutch to crawl sites and download the pages to a local drive, as my current
crawler does. I do not want to use Nutch's indexing feature, because I am
already using Lucene for that. I have many different URL patterns to crawl, so
I cannot specify a custom pattern in a property file as Nutch requires. Can you
please tell me whether I can use Nutch programmatically, that is, call some API
methods to crawl pages? I have just gone through the Nutch API, and I can see
that the Crawl class has only one method, main, which picks up everything from
the property files. Maybe somebody can help me use the Nutch API. I would
appreciate any example, so that I can replace my own junk crawler with Nutch.
Thank you in advance for your help.
- BR