[ https://issues.apache.org/jira/browse/HDFS-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tsz Wo Nicholas Sze resolved HDFS-1379.
---------------------------------------
    Resolution: Not a Problem

I believe this is no longer a problem after other JIRAs such as HDFS-4963. Please feel free to reopen this if that is not the case. Resolving ...

> Multihoming brokenness in HDFS
> ------------------------------
>
>                 Key: HDFS-1379
>                 URL: https://issues.apache.org/jira/browse/HDFS-1379
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client, namenode
>    Affects Versions: 0.20.1
>         Environment: Multi-homed namenode and datanodes. hadoop-0.20.1 (Cloudera distribution on Linux)
>            Reporter: Matthew Byng-Maddick
>
> We have a setup where, because we have only a very few machines (4 x 16 core), we are looking at co-locating namenodes and datanodes. We also have front-end and back-end networks. The set-up is something like:
> * machine1
> ** front-end 10.18.80.80
> ** back-end 192.168.24.40
> * machine2
> ** front-end 10.18.80.82
> ** back-end 192.168.24.41
> * machine3
> ** front-end 10.18.80.84
> ** back-end 192.168.24.42
> * machine4
> ** front-end 10.18.80.86
> ** back-end 192.168.24.43
> On each machine the property *slave.host.name* is configured with the 192.168.24.x address (the *.dns.interface settings don't actually seem to help, but that's a separate problem), *dfs.datanode.address* is bound to the 192.168.24.x address on :50010, and *dfs.datanode.ipc.address* is bound there as well.
> To make efficient use of our machines, we bring up a namenode on one of them (which then rsyncs the latest namenode fsimage etc.) by bringing up a VIP on each side (we use the 10.18.80.x side for monitoring rather than for actual Hadoop comms) and binding the namenode to it; on the inside this VIP is 192.168.24.19.
> The namenode now knows about 4 datanodes: 192.168.24.40/41/42/43. These datanodes know how they are bound. However, when the namenode tells an external HDFS client where to store blocks, it gives out 192.168.24.19:50010 as one of the datanode addresses (despite the datanode not being bound there), because that is the source address of the datanode->namenode RPC.
> This is wrong: if you have bound the datanode explicitly (using *dfs.datanode.address*), then that should be the only address the namenode can give out (given your comms model, it is reasonable not to support NAT between clients and data slaves). If you bind it to *, then the normal rules for slave.host.name, dfs.datanode.dns.interface etc. should take precedence.
> This may already be fixed in releases later than 0.20.1, but if it isn't it probably should be: you explicitly allow binding of the datanode addresses, so it is unreasonable to expect that comms to the datanode will always come from those addresses, especially in multi-homed environments. Separating traffic out by network is useful, especially when dealing with large volumes of data.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
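For concreteness, here is a minimal sketch of the per-node bindings described in the quoted report, shown for machine1 (back-end 192.168.24.40). The report only gives :50010 for *dfs.datanode.address*; the :50020 IPC port and the single hdfs-site.xml layout are assumptions here, not taken from the report, and in a 0.20-era deployment some of these properties may live in core-site.xml instead.

{code:xml}
<!-- Hypothetical hdfs-site.xml fragment for machine1; values are illustrative. -->
<configuration>
  <!-- Advertise the back-end address rather than whatever the default hostname resolves to. -->
  <property>
    <name>slave.host.name</name>
    <value>192.168.24.40</value>
  </property>
  <!-- Data-transfer endpoint bound to the back-end NIC, as described in the report. -->
  <property>
    <name>dfs.datanode.address</name>
    <value>192.168.24.40:50010</value>
  </property>
  <!-- Datanode IPC endpoint on the same NIC; port 50020 is the usual default (assumed). -->
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>192.168.24.40:50020</value>
  </property>
</configuration>
{code}

On the resolution side, later Hadoop releases expose hostname-oriented multihoming settings such as *dfs.client.use.datanode.hostname*, *dfs.datanode.use.datanode.hostname* and *dfs.namenode.datanode.registration.ip-hostname-check*, which let clients reach datanodes by name rather than by whatever source address the namenode saw at registration; whether HDFS-4963 itself is the change that covers this exact case is not spelled out above.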