Re: Configuration Nutch in cluster mode
Hello Sebastian!

I have now installed Hadoop, but unfortunately I ran into problems. I will post about them separately.

Thanks,
Mike

On Tue, 17 Jan 2023 at 09:49, Sebastian Nagel wrote:
> Hi Mike,
>
> the Nutch configuration files are included in the job file found in
> runtime/deploy after the build. This means you need to compile Nutch
> yourself if you use it in "distributed" mode.
>
> For practice, you can first work in "pseudo-distributed" mode, i.e.
> on a single-node Hadoop cluster. All commands are the same as in fully
> distributed mode.
>
> If it helps, I prepared some setup scripts to run Nutch in
> pseudo-distributed mode:
> https://github.com/sebastian-nagel/nutch-test-single-node-cluster
>
> Best,
> Sebastian
>
> On 1/15/23 04:26, Mike wrote:
> > I will now try to configure the bot url etc. before the building,
> > but how and where do I configure between the crawls e.g. number of pages
> > per host?
> >
> > where do I configure nutch in cluster mode?
> >
> > thx, mike
Re: Configuration Nutch in cluster mode
Hi Mike,

the Nutch configuration files are included in the job file found in runtime/deploy after the build. This means you need to compile Nutch yourself if you use it in "distributed" mode.

For practice, you can first work in "pseudo-distributed" mode, i.e. on a single-node Hadoop cluster. All commands are the same as in fully distributed mode.

If it helps, I prepared some setup scripts to run Nutch in pseudo-distributed mode:
https://github.com/sebastian-nagel/nutch-test-single-node-cluster

Best,
Sebastian

On 1/15/23 04:26, Mike wrote:
> I will now try to configure the bot url etc. before the building,
> but how and where do I configure between the crawls e.g. number of pages
> per host?
>
> where do I configure nutch in cluster mode?
>
> thx, mike
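
Since the configuration is packed into the job file at build time, overrides such as the bot URL or a per-host page limit have to be in place before compiling. The following is only a minimal sketch of generating such an override file with Python's standard library. The property names (http.agent.name, http.agent.url, generate.count.mode, generate.max.count) are the ones I believe nutch-default.xml defines, so please check them against your Nutch version. The values and the helper name build_nutch_site are purely illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def build_nutch_site(overrides):
    """Return the text of a nutch-site.xml holding the given property overrides."""
    root = ET.Element("configuration")
    for name, value in overrides.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    # Pretty-print so the file stays readable when edited by hand later.
    return minidom.parseString(ET.tostring(root)).toprettyxml(indent="  ")


# Placeholder values, not recommendations.
overrides = {
    "http.agent.name": "ExampleBot",              # bot name sent in the User-Agent
    "http.agent.url": "https://example.org/bot",  # bot URL advertised to sites
    "generate.count.mode": "host",                # count URLs per host
    "generate.max.count": 100,                    # max URLs per host in a fetch list
}

site_xml = build_nutch_site(overrides)
print(site_xml)
```

The printed text would be saved as conf/nutch-site.xml in the Nutch source tree before building, so that it ends up in the job file under runtime/deploy. Changing a value later means rebuilding the job file.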
Configuration Nutch in cluster mode
I will now try to configure the bot URL etc. before building, but how and where do I configure settings between crawls, e.g. the number of pages per host?

Where do I configure Nutch in cluster mode?

Thanks,
Mike