Hello,

Sorry if this is a double post (the message I sent yesterday never
appeared on the list).

I am trying to understand the concept of distributed Nutch operation.
As far as I understand from available documentation and the source
code, there are the following (high-level) components:
1. Distributed WebDB,
2. Distributed search servers.

As I understand it, populating the database from scratch works like this:
1. Create a distributed WebDB and make it accessible to all computers via NFS,
2. Inject URLs into the WebDB (though WebDBInjector does not support
distributed operation),
3. Start fetching.

So, should I run a fetcher on each search server so that each one
builds its document index locally? Or is it possible to run fewer
fetchers? Or have I misunderstood the whole concept?

Thanks in advance,
Grigory.


_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers