This may not answer why it is designed this way, but it should give you more insight into how it is done.
Here is how the network resolution
<http://books.google.com/books?id=TQqSwRScVhoC&pg=PA58&lpg=PA58&dq=network+hostname++hadoop+operations&source=bl&ots=81GJwGnQ-j&sig=Im6hgfT1E9HsouI1Ez1yMyL3pZQ&hl=en&sa=X&ei=oSmnUtPJNoavkAfj8oCgCg&ved=0CDoQ6AEwAA#v=onepage&q=network%20hostname%20%20hadoop%20operations&f=false>
happens. The complication could be improved, but this is how it works today.

On Mon, Dec 9, 2013 at 8:31 PM, Rural Hunter <[email protected]> wrote:

> Hi,
>
> I have configured a hadoop/hbase cluster recently and found it's really a
> mess with all those DNS, hostname and /etc/hosts configuration. There are
> many questions related to this all over the internet. So I'm wondering why
> hadoop/hbase is designed in such a strange way, which is very abnormal
> compared with other network/distributed applications. In normal
> applications, DNS is used to identify other servers (logical or physical),
> not the server itself. But I'm seeing this weird behavior in hadoop/hbase.
>
> Say we have server1 and server2 configured this way:
>
> server1(ip 192.168.1.2)
> hostname: server1
> /etc/hosts:
> 127.0.0.1 localhost,server1
> 192.168.1.3 server2
>
> server2(ip 192.168.1.3)
> hostname: server2
> /etc/hosts:
> 127.0.0.1 localhost,server2
> 192.168.1.2 server1
>
> With the configuration above, I'm seeing many cases of hadoop/hbase trying
> to connect to localhost while it actually should connect to another server.
> I believe this is because server1 reported its hostname as 'localhost' to
> server2 and server2 tries to use 'localhost' to connect to server1. But it
> shouldn't work that way. In normal network applications, server2 shouldn't
> try to connect to server1 with what server1 reported. If server2 inits the
> connection, it should use DNS or /etc/hosts to resolve server1. If
> server1 inits the connection, server2 should use the IP it gets from the
> already established connection from server1. There shouldn't be any
> confusion or mess.
>
> I don't see why hadoop/hbase cannot use the same logic to handle the
> DNS/hosts/hostname mess. Can anyone resolve my confusion?
>
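To make the failure described in the quoted message concrete, here is a minimal Python sketch. It is not Hadoop or HBase code. It assumes, as the quoted message suggests, that a node advertises whatever address its own hostname resolves to, and it models that with a small hosts-file lookup. The helpers parse_hosts, advertised_address and is_loopback are hypothetical names made up for this illustration. The hosts entries are written with whitespace-separated aliases, which is the separator hosts files use, and the FIXED layout is only an assumption about what a working setup for server1 could look like.

```python
import ipaddress


def parse_hosts(text):
    """Map each hostname or alias to the first IP address listed for it."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name, ip)  # first match wins
    return table


def advertised_address(own_hostname, hosts_text):
    """Address a node would report if it resolves its own hostname (assumed model)."""
    return parse_hosts(hosts_text).get(own_hostname)


def is_loopback(addr):
    """True if addr is a loopback address such as 127.0.0.1."""
    return ipaddress.ip_address(addr).is_loopback


# server1's hosts file as in the quoted message: its own name maps to loopback.
BROKEN = """\
127.0.0.1 localhost server1
192.168.1.3 server2
"""

# Hypothetical alternative: server1's name maps to its routable address.
FIXED = """\
127.0.0.1 localhost
192.168.1.2 server1
192.168.1.3 server2
"""

if __name__ == "__main__":
    for label, hosts in (("broken", BROKEN), ("fixed", FIXED)):
        addr = advertised_address("server1", hosts)
        kind = "loopback" if is_loopback(addr) else "routable"
        print(f"{label}: server1 advertises {addr} ({kind})")
```

With the first layout, server1 resolves its own name to 127.0.0.1, so any peer that connects to the advertised address ends up talking to itself. With the second layout the lookup returns 192.168.1.2, which server2 can actually reach.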
