The tutorial at http://wiki.apache.org/nutch/NutchHadoopTutorial states the following:
"every node you wish to include within your cluster e.g. both Nutch and Hadoop packages should be installed in every machine."

I am curious why every node must contain a Nutch distribution. I understand that the conf files in <NUTCH_HOME>/conf must be copied to every node's $HADOOP_HOME/conf directory. But aren't those the only Nutch files that need to be copied to each node?