Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by AmosWaterland:
http://wiki.apache.org/nutch/NutchHadoopTutorial

The comment on the change is:
Explain that you need to format the distributed filesystem before its first use.

------------------------------------------------------------------------------
  The dfs.replication property states how many servers a single file should be replicated to before it becomes available. Because we are using only a single server for now, we have this set to 1. If you set this value higher than the number of data nodes that you have available, you will start seeing a lot of (Zero targets found, forbidden1.size=1) type errors in the logs. We will increase this value as we add more nodes.

+ Before you start the hadoop server, make sure you format the distributed filesystem for the name node:
+
+ {{{
+ bin/hadoop namenode -format
+ }}}
+
  Now that we have hadoop and our slaves file configured, it is time to start up hadoop on a single node and test that it is working properly. To start up all of the hadoop servers on the local machine (name node, data node, mapreduce tracker, job tracker), use the following command as the nutch user:

  {{{
@@ -560, +566 @@
  http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=144

+ Hadoop 0.1.2-dev API:
+
+ http://www.netlikon.de/docs/javadoc-hadoop-0.1/overview-summary.html
+
_______________________________________________
Nutch-cvs mailing list
Nutch-cvs@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-cvs