I don't recall the exact command, but you can use the 'inject' command
to inject a URL as a starting point.
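
Since I don't have the exact syntax to hand, here is only a rough sketch of the idea in Python: write your starting URLs to a plain text file, one per line, then hand that file to 'inject'. The helper names, the "db" directory, the file path and especially the "-urlfile" flag are my assumptions, not confirmed usage, so check what bin/nutch prints for 'inject' on your version before relying on it. The script only prints the command; it does not run Nutch.

```python
from pathlib import Path


def write_seed_file(urls, path):
    """Write one URL per line (blanks and duplicates dropped); return the path."""
    seen = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.append(url)
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(seen) + "\n", encoding="utf-8")
    return path


def build_inject_command(db_dir, seed_path, nutch="bin/nutch"):
    """Build (but do not run) an inject command line as an argument list."""
    # ASSUMPTION: the "-urlfile" flag and the argument order are unverified
    # guesses; compare with the usage message of your Nutch release.
    return [nutch, "inject", str(db_dir), "-urlfile", str(seed_path)]


if __name__ == "__main__":
    # Hypothetical seed URL and paths, for illustration only.
    seed = write_seed_file(["http://www.example.com/"], "urls/seeds.txt")
    print(" ".join(build_inject_command("db", seed)))
```

The point is just that the crawl can start from your own list of URLs, so the DMOZ data is not needed.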
Zhou LiBing wrote:
>Hi,
> I have a problem with the Nutch crawler. How can I crawl the web
>starting from one or several specified URLs? I don't want to use the
>DMOZ data.
>
>
> On 5/3/05, Jason Manfield <[EMAIL PROTECTED]> wrote:
>
>
>>We would like to use nutch just for crawling, and then index the crawled
>>database into our proprietary datastore/index. How do we go about this? I
>>see that nutch is a shell script, so it is possible to just crawl. Once it
>>crawls, I suppose the crawled data is dumped into webdb. Are there exposed
>>APIs to extract the data from webdb?
>>
>>One more catch -- our company is a .NET shop :((, so we would like to use
>>C# to read the data of the fetched/crawled pages for further indexing.
>>
>>Ideas/suggestions?
>>
>>Any plans to have nutch for .NET (like dotLucene)?