For Q2: I'm already using HDFS, but in pseudo-distributed mode, i.e. I have Hadoop running on a single server. So if I add a server it should just run? That would save a lot of time.
-Raymond-

2010/10/8 CatOs Mandros <[email protected]>
> Hi MilleBii,
>
> On Fri, Oct 8, 2010 at 7:21 PM, MilleBii <[email protected]> wrote:
>
> > How easy is it to scale up Nutch following demand?
>
> It's kinda easy when you already have a distributed environment working.
>
> > Q1: Suppose I have 2 servers and I want to add a third one. Modulo the
> > adequate configuration, will Hadoop just add the server and manage to
> > distribute the data on HDFS, or do I need to do something special?
>
> You have to add the machine to the cluster and Hadoop will pick it up.
> Just configure the machine, then start the services:
>
> $ cd path/to/hadoop
> $ bin/hadoop-daemon.sh start datanode
> $ bin/hadoop-daemon.sh start tasktracker
>
> and you should be up and running.
>
> > Q2: Other question: going from a single-server/pseudo-distributed setup
> > to a 2-server distributed one. Is it just a matter of configuring
> > Hadoop, or is there more to it?
>
> Just configuring Hadoop and telling Nutch to use the distributed mode.
> You'll have to migrate your data to HDFS though.
>
> --
> -MilleBii-

-- 
-MilleBii-
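For anyone following along, the two answers above (join a node to the cluster, then migrate local data into HDFS) can be sketched roughly as below. This is a hedged sketch, not a definitive recipe: the hostnames (`slave1`, `slave2`) and paths are hypothetical placeholders, and it assumes the Hadoop 0.20.x-era layout that Nutch used at the time.

```shell
cd /path/to/hadoop   # hypothetical install location

# Q1: make the master aware of the worker hosts. The hostnames below are
# illustrative; on each worker, conf/core-site.xml (fs.default.name) and
# conf/mapred-site.xml (mapred.job.tracker) must also point at the master.
cat > conf/slaves <<'EOF'
slave1
slave2
EOF

# On the new worker itself, start the services as quoted above:
#   bin/hadoop-daemon.sh start datanode
#   bin/hadoop-daemon.sh start tasktracker

# Q2: copy the existing local crawl data into HDFS (paths are hypothetical):
bin/hadoop fs -mkdir crawl
bin/hadoop fs -put /local/path/to/crawl/* crawl/

# Sanity checks: list the migrated files and confirm the new DataNode
# appears in the cluster report.
bin/hadoop fs -ls crawl
bin/hadoop dfsadmin -report
```

Once the data is in HDFS, Nutch just needs to be run in deploy (distributed) mode so its jobs read and write HDFS paths instead of the local filesystem.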

