[Hadoop Wiki] Update of "VirtualCluster" by SteveLoughran

Apache Wiki Wed, 24 Jun 2009 03:16:56 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The following page has been changed by SteveLoughran:
http://wiki.apache.org/hadoop/VirtualCluster

The comment on the change is:
more troublespots. 

------------------------------------------------------------------------------
   i. The public keys of all machines (both VMs and physical machines) are 
distributed to every "~/.ssh/authorized_keys" file.
   i. conf/hadoop-site.xml file is similar for all the machines.
   i. The /etc/hosts file must contain the IP address and hostname of every 
machine (VMs and physical machines).
-  i. The local hostname entry in /etc/hosts must not point to 127.0.0.1 or any 
other loopback address (some laptop-friendly Unix distributions do this). It 
should point to the assigned IP address.
+  i. The local hostname entry in /etc/hosts must not point to 127.0.0.1 or any 
other loopback address (some laptop-friendly Linux distributions do this). It 
should point to the assigned IP address.
   i. conf/slaves must contain the hostnames of all slaves, including VMs and 
physical machines.
   i. conf/masters must contain only the master's hostname.
   i. both conf/masters and conf/slaves files must be similar in all the 
participating machines.
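
The /etc/hosts rule above (the local hostname must not map to a loopback 
address) can be checked mechanically. The following is a minimal sketch, not 
part of the wiki page: the function name loopback_entries is hypothetical, and 
it assumes the usual "address name [alias...]" layout of /etc/hosts with "#" 
comments.

```python
import ipaddress


def loopback_entries(hosts_text, hostname):
    """Return the loopback addresses that hosts_text maps hostname to.

    hosts_text is the content of an /etc/hosts-style file; hostname is the
    name to look for. An empty result means the rule is satisfied.
    """
    bad = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if hostname not in names:
            continue
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            # Not a plain IPv4/IPv6 address; skip it in this sketch.
            continue
        if ip.is_loopback:
            bad.append(address)
    return bad


# Possible use on a real machine (assumes /etc/hosts is readable):
#   import socket
#   with open("/etc/hosts") as f:
#       print(loopback_entries(f.read(), socket.gethostname()))
```

Note that the whole 127.0.0.0/8 range counts as loopback here, so an entry 
such as 127.0.1.1 is reported as well as 127.0.0.1.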
@@ -28, +28 @@

  Here are things that can cause trouble.
   1. Multiple virtual network adapters. It is simpler with one network 
adapter per node.
   1. Machines changing hostname/IPAddress on a reboot. For a long-lived 
virtual cluster you need stable machine names.
+  1. Machines whose hostname doesn't match the hostname the network assigns 
it. It thinks it is "granton", the network thinks it is "dhcp-169-45", that 
being the name everything else talks to it by.
+  1. Machines that think they have the same hostname. You get this if you 
clone VMs and don't rename them.
   1. Pauses of an entire VM for 5-10s or longer. This happens when the virtual 
host is overloaded and your VM has been swapped out. Host fewer VMs, or have 
them ask for less memory.
+  1. Weird clock drift, where the clock can even run backwards. Again, don't 
overload your machines.
+  1. All redundant virtual servers (e.g. Namenode and secondary NN) being 
hosted on the same physical machine. At that point, you don't have redundancy 
or failover any more.
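
Two of the troublespots above (a machine whose own hostname differs from the 
name the network gives it, and cloned VMs sharing a hostname) can be detected 
from a simple inventory. This is a minimal sketch under assumptions: the 
function name find_hostname_problems and the (ip, self_name, network_name) 
tuple layout are hypothetical, and how the inventory is gathered is left open.

```python
from collections import defaultdict


def find_hostname_problems(machines):
    """Report hostname mismatches and duplicates.

    machines is a list of (ip, self_name, network_name) tuples, where
    self_name is what the machine calls itself and network_name is the
    name the network assigns to it.
    Returns (mismatched, duplicates):
      mismatched - the tuples whose two names differ
      duplicates - dict of self_name -> list of IPs claiming that name
    """
    mismatched = [m for m in machines if m[1] != m[2]]

    by_name = defaultdict(list)
    for ip, self_name, _network_name in machines:
        by_name[self_name].append(ip)
    duplicates = {name: ips for name, ips in by_name.items() if len(ips) > 1}

    return mismatched, duplicates


# Illustrative inventory; the addresses and the name "node1" are made up.
inventory = [
    ("10.0.0.5", "granton", "dhcp-169-45"),  # name mismatch
    ("10.0.0.6", "node1", "node1"),
    ("10.0.0.7", "node1", "node1"),          # cloned VM, never renamed
]
mismatched, duplicates = find_hostname_problems(inventory)
```

Any non-empty result is worth fixing before starting the cluster.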
