[
https://issues.apache.org/jira/browse/HDFS-9083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037361#comment-15037361
]
Brahma Reddy Battula commented on HDFS-9083:
--------------------------------------------
[~xyao] thanks for pointing that out.
{code}
cluster = new MiniDFSCluster.Builder(conf).numDataNodes(capacities.length)
    .hosts(new String[]{"localhost", "localhost"})
    .racks(new String[]{"rack0", "rack1"}).simulatedCapacities(capacities).build();
{code}
Both DNs are started with the same hostname "localhost"; ideally we should not
create two DNs with the same hostname. Pinning depends on favoredNodes, and
DFSClient#create(..) only uses host:port: if a favored node is created with
new InetSocketAddress(ip, port), DFSClient will attempt a reverse lookup
locally to get host:port, instead of sending ip:port directly to the NameNode.
MiniDFSCluster uses fake hostnames such as "host1.foo.com" to start DataNodes,
and DFSClient does not use StaticMapping. So if DFSClient does a reverse
lookup, "127.0.0.1:8020" becomes "localhost:8020".
The fix can be like the following (I applied the same change in branch-2 and trunk):
{code}
+    String[] hosts = {"host0", "host1"};
     String[] racks = { RACK0, RACK1 };
     int numOfDatanodes = capacities.length;
     cluster = new MiniDFSCluster.Builder(conf).numDataNodes(capacities.length)
-        .hosts(new String[]{"localhost", "localhost"})
-        .racks(racks).simulatedCapacities(capacities).build();
+        .hosts(hosts).racks(racks).simulatedCapacities(capacities).build();
     try {
       cluster.waitActive();
@@ -377,7 +377,10 @@ public void testBalancerWithPinnedBlocks() throws Exception {
       long totalUsedSpace = totalCapacity * 8 / 10;
       InetSocketAddress[] favoredNodes = new InetSocketAddress[numOfDatanodes];
       for (int i = 0; i < favoredNodes.length; i++) {
-        favoredNodes[i] = cluster.getDataNodes().get(i).getXferAddress();
+        // DFSClient will attempt a reverse lookup. In case it resolves
+        // "127.0.0.1" to "localhost", we manually specify the hostname.
+        int port = cluster.getDataNodes().get(i).getXferAddress().getPort();
+        favoredNodes[i] = new InetSocketAddress(hosts[i], port);
{code}
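Put together, the favored-node setup after the fix boils down to roughly this
(a condensed sketch; conf, cluster, RACK0, RACK1 and capacities are the
existing fields/locals of the test, and its surrounding try/finally is
omitted):
{code}
String[] hosts = {"host0", "host1"};
String[] racks = {RACK0, RACK1};
cluster = new MiniDFSCluster.Builder(conf).numDataNodes(capacities.length)
    .hosts(hosts).racks(racks).simulatedCapacities(capacities).build();
cluster.waitActive();

InetSocketAddress[] favoredNodes = new InetSocketAddress[hosts.length];
for (int i = 0; i < favoredNodes.length; i++) {
  // Pair each DN's known hostname with its actual transfer port so that the
  // favored-node string DFSClient resolves identifies a unique DataNode.
  int port = cluster.getDataNodes().get(i).getXferAddress().getPort();
  favoredNodes[i] = new InetSocketAddress(hosts[i], port);
}
{code}
With distinct hostnames, the NameNode receives host0:port and host1:port even
after the reverse lookup, so the favored nodes map to two different DNs.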
> Replication violates block placement policy.
> --------------------------------------------
>
> Key: HDFS-9083
> URL: https://issues.apache.org/jira/browse/HDFS-9083
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.6.0
> Reporter: Rushabh S Shah
> Assignee: Rushabh S Shah
> Priority: Blocker
> Fix For: 2.7.2, 2.6.3
>
> Attachments: HDFS-9083-branch-2.6.patch, HDFS-9083-branch-2.7.patch
>
>
> Recently we have been noticing many cases in which all the replicas of a block
> reside on the same rack.
> During block creation, the block placement policy was honored.
> But after node failure events in a specific sequence, the block ends up in
> such a state.
> On investigating further, I found that BlockManager#blockHasEnoughRacks
> depends on the config net.topology.script.file.name:
> {noformat}
> if (!this.shouldCheckForEnoughRacks) {
>   return true;
> }
> {noformat}
> We specify a DNSToSwitchMapping implementation (our own custom implementation)
> via net.topology.node.switch.mapping.impl and no longer use the
> net.topology.script.file.name config.
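> For illustration, that combination amounts to something like the following
> (a sketch only; the custom mapping class name is hypothetical):
> {code}
> Configuration conf = new Configuration();
> // Rack awareness comes from a custom DNSToSwitchMapping implementation...
> conf.set("net.topology.node.switch.mapping.impl",
>     "com.example.CustomRackResolver");  // hypothetical class
> // ...while net.topology.script.file.name is left unset, so
> // shouldCheckForEnoughRacks stays false and blockHasEnoughRacks()
> // always returns true.
> {code}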
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)