[
https://issues.apache.org/jira/browse/HDFS-14268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767913#comment-16767913
]
Íñigo Goiri commented on HDFS-14268:
------------------------------------
Thanks [~ayushtkn] and [~tasanuma0829] for the comments.
I've never gone too deep into the EC setup, so I may be missing some things.
Before the patch, we used to have 12 DNs and all of them joined both
subclusters.
After [^HDFS-14268-HDFS-13891.002.patch], we have 8 DNs in one subcluster and 8
in the other.
There are a couple of problems here:
* As [~tasanuma0829] mentioned, 8 DNs are not enough, so we will hit the low
redundancy issue.
* The {{getECBlockGroupStats()}} in the Router merges the two results properly,
but the unit test then compares against just one of the subclusters.
I think reporting 2 low redundancy blocks is actually good as it tests
something that is not just 0s.
So what we should do in the unit test is properly sum the stats from the 2
Namenodes and compare the Router result against that sum.
I think I can do this in this JIRA.
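As a rough sketch of the comparison I have in mind: sum the per-Namenode counters first, then compare the Router result against the total. The {{Stats}} class, its two fields, and the numbers below are stand-ins made up for illustration; they are not the actual Hadoop classes or the values from the failing test.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical stand-in types, not the Hadoop API. */
class EcStatsSum {

  /** Hypothetical holder for two of the counters a Namenode reports. */
  static final class Stats {
    final long lowRedundancyBlockGroups;
    final long corruptBlockGroups;

    Stats(long lowRedundancyBlockGroups, long corruptBlockGroups) {
      this.lowRedundancyBlockGroups = lowRedundancyBlockGroups;
      this.corruptBlockGroups = corruptBlockGroups;
    }
  }

  /** Adds up each counter across the stats of all subclusters. */
  static Stats sum(List<Stats> perNamenode) {
    long low = 0;
    long corrupt = 0;
    for (Stats s : perNamenode) {
      low += s.lowRedundancyBlockGroups;
      corrupt += s.corruptBlockGroups;
    }
    return new Stats(low, corrupt);
  }

  public static void main(String[] args) {
    // Made-up per-subcluster values, one entry per Namenode.
    Stats ns0 = new Stats(2, 0);
    Stats ns1 = new Stats(0, 0);

    // Expected value for the Router: the sum over both Namenodes,
    // rather than the stats of a single subcluster.
    Stats expected = sum(Arrays.asList(ns0, ns1));
    System.out.println("low redundancy: " + expected.lowRedundancyBlockGroups);
    System.out.println("corrupt: " + expected.corruptBlockGroups);
  }
}
```

In the real test the per-Namenode values would come from each subcluster's Namenode, and the expected total would be asserted against what the Router returns.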
Thoughts?
> RBF: Fix the location of the DNs in getDatanodeReport()
> -------------------------------------------------------
>
> Key: HDFS-14268
> URL: https://issues.apache.org/jira/browse/HDFS-14268
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Íñigo Goiri
> Assignee: Íñigo Goiri
> Priority: Major
> Attachments: HDFS-14268-HDFS-13891.000.patch,
> HDFS-14268-HDFS-13891.001.patch, HDFS-14268-HDFS-13891.002.patch
>
>
> When getting all the DNs in the federation, the Router queries each of the
> subclusters and aggregates them assigning the subcluster id to the location.
> This query uses a {{HashSet}} which provides a "random" order for the results.