Hello,

I am trying to understand how distributed Nutch operation works. As far as I can tell from the available documentation and the source code, there are two high-level components:

1. Distributed WebDB
2. Distributed search servers

How do I populate the database from scratch? My understanding of the steps is:

1. Create a distributed WebDB and make it accessible to all machines via NFS.
2. Inject URLs into the WebDB (though WebDBInjector does not support distributed operation).
3. Start fetching.

So, should I run a fetcher on each search server so that each one builds its document index locally? Or is it possible to run fewer fetchers? Or have I misunderstood the whole concept?

Thanks in advance,
Grigory.

_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
