[jira] [Comment Edited] (HDDS-699) Detect Ozone Network topology

2019-03-14 Thread Sammi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793279#comment-16793279
 ] 

Sammi Chen edited comment on HDDS-699 at 3/15/19 3:29 AM:
--

Hi [~szetszwo], thanks very much for all the time you spent reviewing the patches.

The output of TestNetworkTopologyImpl falls into two categories.
 1. The testAncestor test is purposely skipped when the topology has only 2 layers.
{quote}org.junit.internal.AssumptionViolatedException: got: , expected: is

at org.junit.Assume.assumeThat(Assume.java:95)
 at org.junit.Assume.assumeTrue(Assume.java:41)
 at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testAncestor(TestNetworkTopologyImpl.java:238)
{quote}
2. All the remaining error messages are reported by 
TestNetworkTopologyImpl#testConcurrentAccess. In testConcurrentAccess, all the 
individual tests in the class are scheduled to run concurrently in different 
threads to test the robustness of NetworkTopologyImpl. Operations include 
add, remove, re-add and query, so some queries will randomly fail when the 
involved node is removed by another thread. When running these individual tests 
one by one, there are no such errors.
{quote}Exception in thread "Thread-18" java.lang.IllegalArgumentException: 
affinityNode /1.1.1.1 doesn't have ancestor on generation 1
 at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.chooseNodeInternal(NetworkTopologyImpl.java:498)
 at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.getNode(NetworkTopologyImpl.java:481)
 at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.pickNodes(TestNetworkTopologyImpl.java:972)
 at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testChooseRandomWithAffinityNode(TestNetworkTopologyImpl.java:596)
 at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.lambda$testConcurrentAccess$8(TestNetworkTopologyImpl.java:849)
 at java.lang.Thread.run(Thread.java:748)
{quote}
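The race described above can be sketched independently of NetworkTopologyImpl (the class and method names below are hypothetical stand-ins, not the real API): a lookup that validates membership will intermittently throw when another thread removes the node between a query's lookup and its check, while the same lookup never throws single-threaded.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the topology's node registry; not the real
// NetworkTopologyImpl API.
public class TopologyRaceSketch {
    private final ConcurrentHashMap<String, String> nodes = new ConcurrentHashMap<>();

    public void add(String path) { nodes.put(path, path); }
    public void remove(String path) { nodes.remove(path); }

    // Mirrors checkAffinityNode: querying a node that another thread has
    // just removed throws IllegalArgumentException.
    public String getNode(String path) {
        String n = nodes.get(path);
        if (n == null) {
            throw new IllegalArgumentException(
                "Affinity node " + path + " is not a member of topology");
        }
        return n;
    }

    public static void main(String[] args) throws InterruptedException {
        TopologyRaceSketch topo = new TopologyRaceSketch();
        topo.add("/d1/r1/1.1.1.1");

        // Writer thread repeatedly removes and re-adds the node, like the
        // concurrent add/remove/re-add operations in testConcurrentAccess.
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                topo.remove("/d1/r1/1.1.1.1");
                topo.add("/d1/r1/1.1.1.1");
            }
        });
        writer.start();

        int intermittent = 0;
        for (int i = 0; i < 100_000; i++) {
            try {
                topo.getNode("/d1/r1/1.1.1.1");
            } catch (IllegalArgumentException e) {
                intermittent++;  // expected occasionally under concurrency
            }
        }
        writer.join();
        System.out.println("intermittent failures seen: " + intermittent);
    }
}
```

Run the query loop without the writer thread and getNode never throws, matching the observation that the individual tests pass when run one by one.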


was (Author: sammi):
Hi [~szetszwo], thanks very much for all your time to review the patches. 

The output of TestNetworkTopologyImpl falls into two categories.
 1. The testAncestor test is purposely skipped when the topology has only 2 layers.
{quote}org.junit.internal.AssumptionViolatedException: got: , expected: is

at org.junit.Assume.assumeThat(Assume.java:95)
 at org.junit.Assume.assumeTrue(Assume.java:41)
 at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testAncestor(TestNetworkTopologyImpl.java:238)
{quote}
2. All the remaining error messages are reported by 
TestNetworkTopologyImpl#testConcurrentAccess. In testConcurrentAccess, all the 
individual tests in the class are scheduled to run concurrently in different 
threads to test the robustness of NetworkTopologyImpl. Operations include 
add, remove, re-add and query, so some queries will randomly fail when the 
involved node is removed by another thread. When running these individual tests 
one by one, there are no such errors. 
{quote}Exception in thread "Thread-18" java.lang.IllegalArgumentException: 
affinityNode /1.1.1.1 doesn't have ancestor on generation 1
 at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.chooseNodeInternal(NetworkTopologyImpl.java:498)
 at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.getNode(NetworkTopologyImpl.java:481)
 at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.pickNodes(TestNetworkTopologyImpl.java:972)
 at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testChooseRandomWithAffinityNode(TestNetworkTopologyImpl.java:596)
 at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.lambda$testConcurrentAccess$8(TestNetworkTopologyImpl.java:849)
 at java.lang.Thread.run(Thread.java:748)
{quote}

> Detect Ozone Network topology
> -
>
> Key: HDDS-699
> URL: https://issues.apache.org/jira/browse/HDDS-699
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Sammi Chen
>Priority: Major
> Attachments: HDDS-699.00.patch, HDDS-699.01.patch, HDDS-699.02.patch, 
> HDDS-699.03.patch, HDDS-699.04.patch, HDDS-699.05.patch, HDDS-699.06.patch, 
> HDDS-699.07.patch, HDDS-699.08.patch
>
>
> Traditionally this has been implemented in Hadoop via script or customizable 
> java class. One thing we want to add here is the flexible multi-level support 
> instead of fixed levels like DC/Rack/NG/Node.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-699) Detect Ozone Network topology

2019-03-14 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793053#comment-16793053
 ] 

Tsz Wo Nicholas Sze edited comment on HDDS-699 at 3/14/19 8:43 PM:
---

Tried to run TestNetworkTopologyImpl locally. There are a lot of exceptions and 
errors although the tests do not fail.
{code:java}
org.junit.internal.AssumptionViolatedException: got: , expected: is 



at org.junit.Assume.assumeThat(Assume.java:95)
at org.junit.Assume.assumeTrue(Assume.java:41)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testAncestor(TestNetworkTopologyImpl.java:238)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)

Exception in thread "Thread-19" org.junit.internal.AssumptionViolatedException: 
got: , expected: is 
at org.junit.Assume.assumeThat(Assume.java:95)
at org.junit.Assume.assumeTrue(Assume.java:41)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testAncestor(TestNetworkTopologyImpl.java:238)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.lambda$testConcurrentAccess$9(TestNetworkTopologyImpl.java:853)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "Thread-18" java.lang.IllegalArgumentException: 
affinityNode /1.1.1.1 doesn't have ancestor on generation  1
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.chooseNodeInternal(NetworkTopologyImpl.java:498)
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.getNode(NetworkTopologyImpl.java:481)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.pickNodes(TestNetworkTopologyImpl.java:972)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testChooseRandomWithAffinityNode(TestNetworkTopologyImpl.java:596)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.lambda$testConcurrentAccess$8(TestNetworkTopologyImpl.java:849)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "Thread-45" java.lang.IllegalArgumentException: Affinity 
node /r1/1.1.1.1 is not a member of topology
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.checkAffinityNode(NetworkTopologyImpl.java:767)
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.getNode(NetworkTopologyImpl.java:476)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.pickNodes(TestNetworkTopologyImpl.java:972)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testChooseRandomWithAffinityNode(TestNetworkTopologyImpl.java:596)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.lambda$testConcurrentAccess$8(TestNetworkTopologyImpl.java:849)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "Thread-41" java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testChooseRandomExcludedNode(TestNetworkTopologyImpl.java:454)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.lambda$testConcurrentAccess$4(TestNetworkTopologyImpl.java:833)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "Thread-72" java.lang.IllegalArgumentException: Affinity 
node /d1/r1/1.1.1.1 is not a member of topology
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.checkAffinityNode(NetworkTopologyImpl.java:767)
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.getNode(NetworkTopologyImpl.java:476)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.pickNodes(TestNetworkTopologyImpl.java:972)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testChooseRandomWithAffinityNode(TestNetworkTopologyImpl.java:596)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.lambda$testConcurrentAccess$8(TestNetworkTopologyImpl.java:849)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "Thread-76" java.lang.AssertionError: 
reader:/d1/r1/1.1.1.1,node1:/d2/r3/6.6.6.6,node2:/d1/r1/2.2.2.2,cost1:6,cost2:2
   

[jira] [Comment Edited] (HDDS-699) Detect Ozone Network topology

2019-03-14 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793053#comment-16793053
 ] 

Tsz Wo Nicholas Sze edited comment on HDDS-699 at 3/14/19 8:42 PM:
---

Tried to run TestNetworkTopologyImpl locally.  There are a lot of exceptions 
and errors although the tests do not fail.  
{code}

org.junit.internal.AssumptionViolatedException: got: , expected: is 



at org.junit.Assume.assumeThat(Assume.java:95)
at org.junit.Assume.assumeTrue(Assume.java:41)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testAncestor(TestNetworkTopologyImpl.java:238)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)

Exception in thread "Thread-19" org.junit.internal.AssumptionViolatedException: 
got: , expected: is 
at org.junit.Assume.assumeThat(Assume.java:95)
at org.junit.Assume.assumeTrue(Assume.java:41)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testAncestor(TestNetworkTopologyImpl.java:238)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.lambda$testConcurrentAccess$9(TestNetworkTopologyImpl.java:853)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "Thread-18" java.lang.IllegalArgumentException: 
affinityNode /1.1.1.1 doesn't have ancestor on generation  1
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.chooseNodeInternal(NetworkTopologyImpl.java:498)
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.getNode(NetworkTopologyImpl.java:481)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.pickNodes(TestNetworkTopologyImpl.java:972)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testChooseRandomWithAffinityNode(TestNetworkTopologyImpl.java:596)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.lambda$testConcurrentAccess$8(TestNetworkTopologyImpl.java:849)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "Thread-45" java.lang.IllegalArgumentException: Affinity 
node /r1/1.1.1.1 is not a member of topology
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.checkAffinityNode(NetworkTopologyImpl.java:767)
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.getNode(NetworkTopologyImpl.java:476)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.pickNodes(TestNetworkTopologyImpl.java:972)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testChooseRandomWithAffinityNode(TestNetworkTopologyImpl.java:596)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.lambda$testConcurrentAccess$8(TestNetworkTopologyImpl.java:849)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "Thread-41" java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testChooseRandomExcludedNode(TestNetworkTopologyImpl.java:454)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.lambda$testConcurrentAccess$4(TestNetworkTopologyImpl.java:833)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "Thread-72" java.lang.IllegalArgumentException: Affinity 
node /d1/r1/1.1.1.1 is not a member of topology
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.checkAffinityNode(NetworkTopologyImpl.java:767)
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.getNode(NetworkTopologyImpl.java:476)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.pickNodes(TestNetworkTopologyImpl.java:972)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.testChooseRandomWithAffinityNode(TestNetworkTopologyImpl.java:596)
at 
org.apache.hadoop.hdds.scm.net.TestNetworkTopologyImpl.lambda$testConcurrentAccess$8(TestNetworkTopologyImpl.java:849)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "Thread-76" java.lang.AssertionError: 
reader:/d1/r1/1.1.1.1,node1:/d2/r3/6.6.6.6,node2:/d1/r1/2.2.2.2,cost1:6,cost2:2
  

[jira] [Comment Edited] (HDDS-699) Detect Ozone Network topology

2019-03-06 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786248#comment-16786248
 ] 

Tsz Wo Nicholas Sze edited comment on HDDS-699 at 3/7/19 12:48 AM:
---

Thanks [~Sammi].  Just found that the logic for getLeaf(int leafIndex, String 
excludedScope, Collection<Node> excludedNodes, int ancestorGen) is quite 
complicated.
- First of all, let's consistently use "level" instead of "generation" in the 
code.  In the getLeaf methods, let's rename ancestorGen to numLevelToExclude.
- excludedScope and excludedNodes may overlap, so we should filter out the 
overlapping nodes.
{code}
  public Node getLeaf(int leafIndex, String excludedScope,
      Collection<Node> excludedNodes, int numLevelToExclude) {
Preconditions.checkArgument(leafIndex >= 0 && numLevelToExclude >= 0);
if (excludedScope != null) {
  excludedNodes = excludedNodes.stream()
  .filter(n -> !n.getNetworkFullPath().startsWith(excludedScope))
  .collect(Collectors.toList());
}
return getLeafRecursively(leafIndex, excludedScope, excludedNodes, 
numLevelToExclude);
// see below for getLeafRecursively
  }
{code}
- Let's change getAncestor(0) to return this. It will simplify the code.
{code}
  public Node getAncestor(int generation) {
Preconditions.checkArgument(generation >= 0);
Node current = this;
while (generation > 0 && current != null) {
  current = current.getParent();
  generation--;
}
return current;
  }
{code}
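The proposed getAncestor contract can be illustrated with a toy class (ToyNode is made up for this sketch and only models the parent pointer; the real Node interface has more methods): getAncestor(0) returns the node itself, and walking past the root yields null.

```java
// ToyNode is hypothetical; it exists only to illustrate the getAncestor walk.
class ToyNode {
    final String name;
    final ToyNode parent;

    ToyNode(String name, ToyNode parent) {
        this.name = name;
        this.parent = parent;
    }

    // getAncestor(0) returns this; getAncestor(k) walks k levels up and
    // returns null once the walk passes the root.
    ToyNode getAncestor(int generation) {
        if (generation < 0) {
            throw new IllegalArgumentException("generation must be >= 0");
        }
        ToyNode current = this;
        while (generation > 0 && current != null) {
            current = current.parent;
            generation--;
        }
        return current;
    }
}
```

For a chain root -> rack -> leaf: leaf.getAncestor(0) is leaf, leaf.getAncestor(1) is rack, leaf.getAncestor(2) is root, and leaf.getAncestor(3) is null. Returning this for generation 0 removes the special-casing the caller would otherwise need.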
- Then, we need to take care of the excluded node counting with numLevelToExclude. 
 getAncestorNodeMap seems incorrect since it does not consider 
numLevelToExclude.  When considering numLevelToExclude, two excluded nodes under 
the same ancestor may or may not overlap.  We should filter out the overlap 
first, as below.
{code}
  /**
   * @return a map: ancestor-node  -> node-count, where
   * the ancestor-node corresponds to the levelToReturn, and
   * the node-count corresponds to the levelToCount.
   */
  private Map<Node, Integer> getAncestorCounts(Collection<Node> nodes,
      int levelToReturn, int levelToCount) {
Preconditions.checkState(levelToReturn >= levelToCount);
if (nodes == null || nodes.size() == 0) {
  return Collections.emptyMap();
}

// map: levelToCount -> levelToReturn
    final Map<Node, Node> map = new HashMap<>();
    for (Node node: nodes) {
      final Node toCount = node.getAncestor(levelToCount);
      final Node toReturn = toCount.getAncestor(levelToReturn - levelToCount);
  map.putIfAbsent(toCount, toReturn);
}

// map: levelToReturn -> counts
    final Map<Node, Integer> counts = new HashMap<>();
    for (Map.Entry<Node, Node> entry : map.entrySet()) {
  final Node toCount = entry.getKey();
  final Node toReturn = entry.getValue();
  counts.compute(toReturn, (key, n) -> (n == null? 0: n) + 
toCount.getNumOfLeaves());
}

return counts;
  }
{code}
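The dedup step above can be walked through with plain strings (the paths, leaf counts, and helper names below are invented for illustration): two excluded leaves under the same levelToCount ancestor contribute that ancestor's leaf count once, not twice.

```java
import java.util.*;

public class AncestorCountSketch {
    // Strip the last k path components, e.g. ancestorAt("/d1/r1/n1", 1) -> "/d1/r1".
    static String ancestorAt(String path, int k) {
        for (int i = 0; i < k; i++) {
            path = path.substring(0, path.lastIndexOf('/'));
        }
        return path;
    }

    // leavesUnder maps a levelToCount ancestor to its total leaf count.
    static Map<String, Integer> ancestorCounts(List<String> excluded,
            int levelToReturn, int levelToCount, Map<String, Integer> leavesUnder) {
        // Dedup: keep one entry per levelToCount ancestor via putIfAbsent,
        // so overlapping excluded nodes are counted once.
        Map<String, String> map = new HashMap<>();
        for (String node : excluded) {
            String toCount = ancestorAt(node, levelToCount);
            String toReturn = ancestorAt(toCount, levelToReturn - levelToCount);
            map.putIfAbsent(toCount, toReturn);
        }
        // Sum each deduplicated ancestor's leaf count per levelToReturn ancestor.
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, String> e : map.entrySet()) {
            counts.merge(e.getValue(), leavesUnder.get(e.getKey()), Integer::sum);
        }
        return counts;
    }
}
```

With excluded nodes /d1/r1/n1 and /d1/r1/n2 (both under /d1/r1, which has 4 leaves) plus /d1/r2/n3 (under /d1/r2, 2 leaves), levelToCount=1 and levelToReturn=2 yield {/d1=6}: /d1/r1 is counted once, not twice.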

- Finally, here is the getLeafRecursively(..).  The other getLeaf methods can 
be removed.
{code}
  private Node getLeafRecursively(int leafIndex, String excludedScope,
      Collection<Node> excludedNodes, int numLevelToExclude) {
if (isLeafParent()) {
  return getLeafOnLeafParent(leafIndex, excludedScope, excludedNodes);
}

final int levelToReturn = NodeSchemaManager.getInstance().getMaxLevel() - 
getLevel() - 1;
    final Map<Node, Integer> excludedAncestors = getAncestorCounts(
excludedNodes, levelToReturn, numLevelToExclude);
final int excludedScopeCount = getScopeNodeCount(excludedScope);


for(Node node : childrenMap.values()) {
  int leafCount = node.getNumOfLeaves();
  if (excludedScope != null && 
excludedScope.startsWith(node.getNetworkFullPath())) {
leafCount -= excludedScopeCount;
  }
      leafCount -= excludedAncestors.getOrDefault(node, 0);
  if (leafCount > 0) {
if (leafIndex < leafCount) {
  return ((InnerNodeImpl)node).getLeafRecursively(
  leafIndex, excludedScope, excludedNodes, numLevelToExclude);
}
leafIndex -= leafCount;
  }
}
return null;
  }
{code}
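The index walk in getLeafRecursively can be illustrated with plain arrays (the counts are made up): each child owns a contiguous range of post-exclusion leaf indexes, and the recursion descends into the child whose range contains leafIndex.

```java
public class LeafIndexSketch {
    // Leaves per child and excluded leaves per child; values are invented.
    static final int[] LEAF_COUNTS = {4, 3, 5};
    static final int[] EXCLUDED = {1, 0, 2};

    // Returns the index of the child owning the given post-exclusion
    // leafIndex, or -1 when leafIndex is out of range (mirrors the null
    // return in getLeafRecursively).
    static int findChild(int leafIndex) {
        for (int child = 0; child < LEAF_COUNTS.length; child++) {
            int leafCount = LEAF_COUNTS[child] - EXCLUDED[child];
            if (leafCount > 0) {
                if (leafIndex < leafCount) {
                    return child;
                }
                // Skip past this child's range of leaf indexes.
                leafIndex -= leafCount;
            }
        }
        return -1;
    }
}
```

Here the post-exclusion counts are 3, 3, 3, so indexes 0-2 map to child 0, 3-5 to child 1, 6-8 to child 2, and 9 is out of range.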



was (Author: szetszwo):
Thanks [~Sammi].  Just found that the logic for getLeaf(int leafIndex, String 
excludedScope, Collection excludedNodes, int ancestorGen) is quite 
complicated.
- First of all, let's consistently use "level" instead of "generation" in the 
code.  In the getLeaf methods, let's rename ancestorGen to numLevelToExclude.
- excludedScope and excludedNodes may overlap so that we should filter out the 
overlapped nodes.
{code}
  public Node getLeaf(int leafIndex, String excludedScope,
  Collection excludedNodes, int numLevelToExclude) {
Preconditions.checkArgument(leafIndex >= 0 && numLevelToExclude >= 0);
if (excludedScope != null) {
  excludedNodes = excludedNodes.stream()
  .filter(n -> !n.getNetworkFullPath().startsWith(excludedScope))
  .collect(Collectors.toList());
}

[jira] [Comment Edited] (HDDS-699) Detect Ozone Network topology

2019-03-06 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786248#comment-16786248
 ] 

Tsz Wo Nicholas Sze edited comment on HDDS-699 at 3/7/19 12:50 AM:
---

Thanks [~Sammi].  Just found that the logic for getLeaf(int leafIndex, String 
excludedScope, Collection<Node> excludedNodes, int ancestorGen) is quite 
complicated.
- First of all, let's consistently use "level" instead of "generation" in the 
code.  In the getLeaf methods, let's rename ancestorGen to numLevelToExclude.
- excludedScope and excludedNodes may overlap, so we should filter out the 
overlapping nodes.
{code}
  public Node getLeaf(int leafIndex, String excludedScope,
      Collection<Node> excludedNodes, int numLevelToExclude) {
Preconditions.checkArgument(leafIndex >= 0 && numLevelToExclude >= 0);
if (excludedScope != null) {
  excludedNodes = excludedNodes.stream()
  .filter(n -> !n.getNetworkFullPath().startsWith(excludedScope))
  .collect(Collectors.toList());
}
return getLeafRecursively(leafIndex, excludedScope, excludedNodes, 
numLevelToExclude);
// see below for getLeafRecursively
  }
{code}
- Let's change getAncestor(0) to return this. It will simplify the code.
{code}
  public Node getAncestor(int generation) {
Preconditions.checkArgument(generation >= 0);
Node current = this;
while (generation > 0 && current != null) {
  current = current.getParent();
  generation--;
}
return current;
  }
{code}
- Then, we need to take care of the excluded node counting with numLevelToExclude. 
 getAncestorNodeMap seems incorrect since it does not consider 
numLevelToExclude.  When considering numLevelToExclude, two excluded nodes under 
the same ancestor may or may not overlap.  We should filter out the overlap 
first, as below.
{code}
  /**
   * @return a map: ancestor-node  -> node-count, where
   * the ancestor-node corresponds to the levelToReturn, and
   * the node-count corresponds to the levelToCount.
   */
  private Map<Node, Integer> getAncestorCounts(Collection<Node> nodes,
      int levelToReturn, int levelToCount) {
Preconditions.checkState(levelToReturn >= levelToCount);
if (nodes == null || nodes.size() == 0) {
  return Collections.emptyMap();
}

// map: levelToCount -> levelToReturn
    final Map<Node, Node> map = new HashMap<>();
for (Node node: nodes) {
  final Node toCount = node.getAncestor(levelToCount);
  final Node toReturn = toCount.getAncestor(levelToReturn - levelToCount);
  map.putIfAbsent(toCount, toReturn);
}

// map: levelToReturn -> counts
    final Map<Node, Integer> counts = new HashMap<>();
    for (Map.Entry<Node, Node> entry : map.entrySet()) {
  final Node toCount = entry.getKey();
  final Node toReturn = entry.getValue();
  counts.compute(toReturn, (key, n) -> (n == null? 0: n) + 
toCount.getNumOfLeaves());
}

return counts;
  }
{code}

- Finally, here is the getLeafRecursively(..).  The other getLeaf methods can 
be removed.
{code}
  private Node getLeafRecursively(int leafIndex, String excludedScope,
      Collection<Node> excludedNodes, int numLevelToExclude) {
if (isLeafParent()) {
  return getLeafOnLeafParent(leafIndex, excludedScope, excludedNodes);
}

final int levelToReturn = NodeSchemaManager.getInstance().getMaxLevel() - 
getLevel() - 1;
    final Map<Node, Integer> excludedAncestors = getAncestorCounts(
excludedNodes, levelToReturn, numLevelToExclude);
final int excludedScopeCount = getScopeNodeCount(excludedScope);


for(Node node : childrenMap.values()) {
  int leafCount = node.getNumOfLeaves();
  if (excludedScope != null && 
excludedScope.startsWith(node.getNetworkFullPath())) {
leafCount -= excludedScopeCount;
  }
      leafCount -= excludedAncestors.getOrDefault(node, 0);
  if (leafCount > 0) {
if (leafIndex < leafCount) {
  return ((InnerNodeImpl)node).getLeafRecursively(
  leafIndex, excludedScope, excludedNodes, numLevelToExclude);
}
leafIndex -= leafCount;
  }
}
return null;
  }
{code}



was (Author: szetszwo):
Thanks [~Sammi].  Just found that the logic for getLeaf(int leafIndex, String 
excludedScope, Collection excludedNodes, int ancestorGen) is quite 
complicated.
- First of all, let's consistently use "level" instead of "generation" in the 
code.  In the getLeaf methods, let's rename ancestorGen to numLevelToExclude.
- excludedScope and excludedNodes may overlap so that we should filter out the 
overlapped nodes.
{code}
  public Node getLeaf(int leafIndex, String excludedScope,
  Collection excludedNodes, int numLevelToExclude) {
Preconditions.checkArgument(leafIndex >= 0 && numLevelToExclude >= 0);
if (excludedScope != null) {
  excludedNodes = excludedNodes.stream()
  .filter(n -> !n.getNetworkFullPath().startsWith(excludedScope))
  .collect(Collectors.toList());
}
   

[jira] [Comment Edited] (HDDS-699) Detect Ozone Network topology

2019-03-04 Thread Sammi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784040#comment-16784040
 ] 

Sammi Chen edited comment on HDDS-699 at 3/5/19 4:14 AM:
-

Hi [~szetszwo], thanks for your time.  The 04.patch is uploaded to address the 
issues reported by checkstyle and findbugs. I'm still investigating the build 
compile failure caused by 
{quote}cp: cannot stat 
'/testptch/hadoop/hadoop-ozone/objectstore-service/target/hadoop-ozone-objectstore-service-0.4.0-SNAPSHOT-plugin.jar':
 No such file or directory{quote}
The issue doesn't happen locally.



was (Author: sammi):
Hi [~szetszwo], thanks for the time.  04.patch is uploaded to address the 
issues reported by checkstyle and findbugs. I'm still investigating the build 
compile failure issue caused by 
{quote}cp: cannot stat 
'/testptch/hadoop/hadoop-ozone/objectstore-service/target/hadoop-ozone-objectstore-service-0.4.0-SNAPSHOT-plugin.jar':
 No such file or directory{quote}
The issue doesn't happen locally.


> Detect Ozone Network topology
> -
>
> Key: HDDS-699
> URL: https://issues.apache.org/jira/browse/HDDS-699
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Sammi Chen
>Priority: Major
> Attachments: HDDS-699.00.patch, HDDS-699.01.patch, HDDS-699.02.patch, 
> HDDS-699.03.patch, HDDS-699.04.patch
>
>
> Traditionally this has been implemented in Hadoop via script or customizable 
> java class. One thing we want to add here is the flexible multi-level support 
> instead of fixed levels like DC/Rack/NG/Node.






[jira] [Comment Edited] (HDDS-699) Detect Ozone Network topology

2019-01-29 Thread Sammi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753763#comment-16753763
 ] 

Sammi Chen edited comment on HDDS-699 at 1/30/19 3:09 AM:
--

Hi [~junjie], thanks for your review and feedback.
{quote}For choosing a node (randomly or not), do we really need ancestor 
parameter? The scope should already contain the branch level info. Isn't it?
{quote}
The current NetworkTopology implementation allows users to define a topology 
with any number of levels, so node-selection flexibility is provided through the 
"ancestorGen" parameter. For a specific network topology, you are right that 
the level info is fixed in the container/pipeline placement policy implementation. 
 The rest will be addressed in the coming patch.


was (Author: sammi):
Hi [~junjie],  thanks for your review and feedback. 

{code:java}
+// Remove any trailing NetConf.PATH_SEPARATOR
+int len = path.length();
+while (len > 0 && path.charAt(len-1) == NetConf.PATH_SEPARATOR) {
+  path = path.substring(0, len-1);
+  len = path.length();
+}
+return path;
+  }
{code}
It's to remove the trailing "/". A "/" in the middle cannot be removed. 
{quote}
For choosing a node (randomly or not), do we really need ancestor parameter? 
The scope should already contain the branch level info. Isn't it?
{quote}
The current NetworkTopology implementation allows users to define a topology 
with any number of levels, so node-selection flexibility is provided through the 
"ancestorGen" parameter. For a specific network topology, you are right that 
the level info is fixed in the container/pipeline placement policy implementation. 
The rest will be addressed in the coming patch. 



> Detect Ozone Network topology
> -
>
> Key: HDDS-699
> URL: https://issues.apache.org/jira/browse/HDDS-699
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Sammi Chen
>Priority: Major
> Attachments: HDDS-699.00.patch, HDDS-699.01.patch
>
>
> Traditionally this has been implemented in Hadoop via script or customizable 
> java class. One thing we want to add here is the flexible multi-level support 
> instead of fixed levels like DC/Rack/NG/Node.






[jira] [Comment Edited] (HDDS-699) Detect Ozone Network topology

2019-01-25 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752183#comment-16752183
 ] 

Yiqun Lin edited comment on HDDS-699 at 1/25/19 11:52 AM:
--

For the unit test, some comments from me:

 
 *TestNetworkTopology*
 Line 51: The way the log instance is obtained can be simplified to 
{{LoggerFactory.getLogger(TestNetworkTopology.class);}}

*TestNetworkTopology#testAncestor*
 line 209: This is the same as line 200. Should this instead be 
{{isSameAncestor(dataNodes[1], dataNodes[2], maxLevel - 1)}}?

*TestNetworkTopology#testAddRemove*
 We should add some corner cases as I mentioned above; 
https://issues.apache.org/jira/browse/HDDS-699?focusedCommentId=16747999=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16747999

*TestNetworkTopology#testCreateInvalidTopology*
 Line 189: I prefer to verify this with a more detailed message: 
{{assertTrue(e.getMessage().startsWith("Failed to add...Its path depth is not 
..."));}}. It will be easier to understand.

One additional comment:
 *NetworkTopology#sortByDistanceCost*
 Line 599: This can be simplified to {{Collections.shuffle(list)}}. It will 
reuse the random instance inside {{Collections}}.
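
The suggestion amounts to the following (a minimal sketch; the list contents are arbitrary):

```java
import java.util.*;

public class ShuffleSketch {
    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        // Collections.shuffle(list) reuses the Random instance held inside
        // Collections instead of constructing a new Random on every call.
        Collections.shuffle(list);
        // Shuffling permutes the elements but never adds or drops any.
        List<Integer> sorted = new ArrayList<>(list);
        Collections.sort(sorted);
        System.out.println(sorted);  // [1, 2, 3, 4, 5]
    }
}
```

Passing an explicit Random (Collections.shuffle(list, random)) is only needed when the caller wants a seeded, reproducible order.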


was (Author: linyiqun):
For the unit test, some comments from me:

*TestNetworkTopology#testAncestor*
 line 209: This is the same as line 200. Should this instead be 
{{isSameAncestor(dataNodes[1], dataNodes[2], maxLevel - 1)}}?

*TestNetworkTopology#testAddRemove*
 We should add some corner cases as I mentioned above; 
https://issues.apache.org/jira/browse/HDDS-699?focusedCommentId=16747999=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16747999

*TestNetworkTopology#testCreateInvalidTopology*
 Line 189: I prefer to verify this with a more detailed message: 
{{assertTrue(e.getMessage().startsWith("Failed to add...Its path depth is not 
..."));}}. It will be easier to understand.

Additional one comment for the place:
 *NetworkTopology#sortByDistanceCost*
 Line 599: This can be simplified to {{Collections.shuffle(list)}}. It will 
reuse the random instance inside {{Collections}}.

> Detect Ozone Network topology
> -
>
> Key: HDDS-699
> URL: https://issues.apache.org/jira/browse/HDDS-699
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Sammi Chen
>Priority: Major
> Attachments: HDDS-699.00.patch, HDDS-699.01.patch
>
>
> Traditionally this has been implemented in Hadoop via script or customizable 
> java class. One thing we want to add here is the flexible multi-level support 
> instead of fixed levels like DC/Rack/NG/Node.






[jira] [Comment Edited] (HDDS-699) Detect Ozone Network topology

2019-01-25 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752183#comment-16752183
 ] 

Yiqun Lin edited comment on HDDS-699 at 1/25/19 11:39 AM:
--

For the unit test, some comments from me:

*TestNetworkTopology#testAncestor*
 line 209: This is the same as line 200. Should this instead be 
{{isSameAncestor(dataNodes[1], dataNodes[2], maxLevel - 1)}}?

*TestNetworkTopology#testAddRemove*
 We should add some corner cases as I mentioned above; 
https://issues.apache.org/jira/browse/HDDS-699?focusedCommentId=16747999=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16747999

*TestNetworkTopology#testCreateInvalidTopology*
 Line 189: I prefer to verify this with a more detailed message: 
{{assertTrue(e.getMessage().startsWith("Failed to add...Its path depth is not 
..."));}}. It will be easier to understand.

Additional one comment for the place:
 *NetworkTopology#sortByDistanceCost*
 Line 599: This can be simplified to {{Collections.shuffle(list)}}. It will 
reuse the random instance inside {{Collections}}.






[jira] [Comment Edited] (HDDS-699) Detect Ozone Network topology

2019-01-24 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751901#comment-16751901
 ] 

Xiaoyu Yao edited comment on HDDS-699 at 1/25/19 6:07 AM:
--

Thanks [~Sammi] for working on this. Patch v2 LGTM overall. Here are some 
comments. I will add more comments on the unit tests later.

 

DatanodeDetails.java

Line 220: is there a reason to remove the compareTo @Override annotation?

 

Node.java

Line 45: NIT: typo "contamination" -> "concatenation"

 

Line 72: do we really want to allow setLevel() to be called explicitly, or is it 
for testing only? It may conflict with the actual level that is set up via 
setParent().

 

InnerNode.java

Line 89: does the leafIndex count the excludedNodes?

 

InnerNodeImpl.java

Line 295-301: can we document this in the getLeaf() API?

 

network-topology-nodegroup.xml

Lines 56 and 68: the nodegroup in line 68 does not seem to match the prefix ng. 

NodeSchema.java

Line 80-90: maybe we can add a builder class to help reduce the number of 
constructors.
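One possible shape for such a builder is sketched below. The field names (type, cost, prefix) are assumptions drawn from the review context, not the actual NodeSchema fields.

```java
public class NodeSchemaSketch {
    private final String type;
    private final int cost;
    private final String prefix;

    private NodeSchemaSketch(Builder b) {
        this.type = b.type;
        this.cost = b.cost;
        this.prefix = b.prefix;
    }

    public String getType() { return type; }
    public int getCost() { return cost; }
    public String getPrefix() { return prefix; }

    // A single builder with defaults replaces a family of overloaded
    // constructors, one per optional-field combination.
    public static class Builder {
        private String type = "Leaf";
        private int cost = 1;
        private String prefix = "";

        public Builder setType(String type) { this.type = type; return this; }
        public Builder setCost(int cost) { this.cost = cost; return this; }
        public Builder setPrefix(String prefix) { this.prefix = prefix; return this; }

        public NodeSchemaSketch build() { return new NodeSchemaSketch(this); }
    }

    public static void main(String[] args) {
        NodeSchemaSketch rack = new NodeSchemaSketch.Builder()
            .setType("Rack").setPrefix("rack").build();
        System.out.println(rack.getPrefix() + ":" + rack.getCost());  // prints "rack:1"
    }
}
```

Callers then set only the fields they care about, and adding a new optional field later does not multiply the constructor overloads.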

 

Line 102: do we assume case sensitivity for the network path and prefix here?

 

NodeSchemaLoader.java

Line 126/128/130/186/187: NIT: can we have these predefined tag names defined 
as static constants?
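For example (the tag names below are guesses at the schema XML vocabulary, not the actual strings in NodeSchemaLoader):

```java
public final class SchemaTags {
    // Hoisting repeated literal tag names into constants avoids silent
    // typos when the same string appears in several parse branches.
    public static final String LAYOUT_VERSION_TAG = "layoutversion";
    public static final String LAYERS_TAG = "layers";
    public static final String LAYER_TAG = "layer";
    public static final String PREFIX_TAG = "prefix";

    private SchemaTags() { }  // no instances; constants only
}
```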

 

NodeSchemaManager.java

Line 98: can we add javadoc for completePath()?






[jira] [Comment Edited] (HDDS-699) Detect Ozone Network topology

2019-01-17 Thread Sammi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744911#comment-16744911
 ] 

Sammi Chen edited comment on HDDS-699 at 1/17/19 11:23 AM:
---

Fixed checkstyle issues and the failed test.


