Should the installation paths be the same on all the nodes? Most documentation seems to suggest that it is _*recommended*_ to have the _*same*_ paths on all the nodes. But what is the workaround if, for some reason, one isn't able to use the same path?
That's the problem we are facing right now. After getting Hadoop to work perfectly in a 2-node cluster, we tried to add a 3rd machine and realised that it doesn't have an E: drive, which is where Hadoop is installed on the other 2 nodes. All our machines are Windows machines.

The possible solutions are:

1) Move the installations on M1 & M2 to a drive that is also present on M3. We will keep this as the last option.

2) Map a folder on M3's D: to E:. We used the "subst" command to do this. But when we tried to start DFS, it wasn't able to find the Hadoop installation. Just to verify, we tried an ssh to localhost and were unable to find the mapped drive; it's only visible as a folder of D:. In the basic Cygwin prompt, however, we are able to view E:.

3) Partition M3's D: drive and create an E:. This carries the risk of data loss.

So, what should we do? Is there any way to specify in the NameNode the Hadoop installation path on each of the remaining nodes? Or is there some environment variable that can be set to make the Hadoop installation path specific to each machine?

Thanks,
Sridhar