Hi all, I am trying to configure and start a Hadoop cluster on EC2 and have run into a few problems.
1. Can the Hadoop code and its configuration be shared across nodes? Say I have a distributed file system running in the cluster and all the nodes can see the Hadoop code and conf directory there, so every node runs from the same copy of the code and conf. Is that possible?

2. If all the nodes can share the Hadoop code and conf, does that mean I can launch Hadoop (bin/start-dfs.sh, bin/start-mapred.sh) from any node, even a slave node?

3. I think I specified the master and slaves correctly, but when I launch Hadoop from the master node, no TaskTracker or DataNode is started on the slave nodes. The log on the slave nodes says:

ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /mnt/hadoop/dfs/data: namenode namespaceID = 1048149291; datanode namespaceID = 313740560

What is the problem?

Thanks,
-Gang
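P.S. On question 3: I have seen it suggested that this mismatch shows up after the namenode is reformatted while the datanodes still hold old data, and that it can be cleared by wiping each datanode's storage directory so it re-registers under the new namespaceID. I have not tried it yet; the path below is taken from my log above, and the commands are my guess at the procedure:

    # on each affected slave: stop the DataNode, then clear its data dir
    # (note: this discards any blocks stored on that node)
    bin/hadoop-daemon.sh stop datanode
    rm -rf /mnt/hadoop/dfs/data/*

    # then, on the master, restart HDFS so the DataNodes re-register
    bin/start-dfs.sh

Is that the right approach, or is there a way to fix the namespaceID without throwing away the datanode data?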