[
https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994491#comment-14994491
]
Mingliang Liu commented on HDFS-9379:
-------------------------------------
Thanks for your review [~arpitagarwal].
{quote}
Did you get a chance to test it manually?
{quote}
Yes, I did test this manually. I ran the benchmark with different combinations
of arguments, including {{-namenode}}, {{-datanodes}} and {{-blocksPerReport}}.
If {{-datanodes}} is greater than 9, the trunk code runs the benchmark
successfully with this patch and fails without it. The failing code is the
assertion that checks the lexicographical order of the datanodes.
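To make the failure concrete, here is a minimal standalone sketch (not code
from the patch or the benchmark) showing that the {{host:port}} strings stop
sorting in numeric order once the port reaches two digits, which is exactly
what trips the ordering assertion and breaks the binary search:
{code}
// Minimal standalone sketch: plain String comparison of the xferAddr-style
// values used by the simulated datanodes. Once the port has two digits, the
// lexicographical order no longer matches the numeric order.
public class XferAddrOrderDemo {
  public static void main(String[] args) {
    String dn9  = "192.168.54.40:9";
    String dn10 = "192.168.54.40:10";
    // '9' > '1' at the first differing character, so dn9 sorts after dn10.
    System.out.println(dn9.compareTo(dn10) > 0);  // prints "true"
  }
}
{code}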
{quote}
The unit test TestNNThroughputBenchmark looks inadequate. It passed even when
I replaced the dnIdx computation with zero.
{quote}
The {{TestNNThroughputBenchmark}} seems to be a driver that runs the benchmark
rather than a unit test of the benchmark itself, so I did not change it. If
{{dnIdx}} is hard-coded to zero when looking up a datanode's index in the
{{datanodes}} array from its datanode info, the test can still pass because
every generated block is simply added to the first datanode. The benchmark
itself allows this, though the test results would be dubious.
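For reference, the idea behind the fix is roughly the following sketch (not
the exact patch; it assumes each tiny datanode's xfer port is its index in the
{{datanodes}} array plus one, as described in the issue):
{code}
// Rough sketch of the approach, not the exact patch: derive the index from
// the reported datanode's xfer port (port == array index + 1 in this
// benchmark), so no lexicographically sorted array is required.
private int getNodeIndex(DatanodeInfo dnInfo) {
  return dnInfo.getXferPort() - 1;
}
{code}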
{quote}
I looked through the remaining usages of datanodes for any dependencies on
lexical ordering and didn't find any.
{quote}
That's true. {{BlockReportStats}} is the only use case I found that depends on
the lexicographical ordering of the {{datanodes}} array. I ran the other tests
and they look good when {{-datanodes}} or {{-threads}} is greater than 10.
> Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes
> --------------------------------------------------------------------------
>
> Key: HDFS-9379
> URL: https://issues.apache.org/jira/browse/HDFS-9379
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Mingliang Liu
> Assignee: Mingliang Liu
> Attachments: HDFS-9379.000.patch
>
>
> Currently, the {{NNThroughputBenchmark}} test {{BlockReportStats}} relies on
> the {{datanodes}} array being sorted in the lexicographical order of each
> datanode's {{xferAddr}}.
> * There is an assertion on the lexicographical order of the datanodes'
> {{xferAddr}} when filling the {{datanodes}} array; see [the
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152].
> * When searching for a datanode by {{DatanodeInfo}}, it uses binary search
> against the {{datanodes}} array; see [the
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187].
> In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In
> {{NNThroughputBenchmark}}, the port is simply _the index of the tiny
> datanode_ plus one.
> The problem here is that, when there are more than 9 tiny datanodes
> ({{numThreads}}), the datanodes' {{xferAddr}} values are no longer in
> lexicographical order, because the string form of the datanode index does not
> sort the same way as its numeric value. For example,
> {code}
> ...
> 192.168.54.40:8
> 192.168.54.40:9
> 192.168.54.40:10
> 192.168.54.40:11
> ...
> {code}
> Lexicographically, {{192.168.54.40:9}} is greater than {{192.168.54.40:10}},
> so the assertion fails and the binary search does not work.
> The simple fix is to calculate the datanode index by port directly, instead
> of using binary search.