[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302092#comment-14302092 ]
Andrew Wang commented on HDFS-7411:
-----------------------------------
As discussed above, the old limiting scheme is seriously flawed. The amount of
time spent per wake-up is highly variable, since the limit is on # nodes rather
than # blocks, and the # of blocks on each node varies. It also counts both
decommissioning and non-decommissioning nodes towards the limit.
That nodes can vary in # of blocks is really an argument for *not* using #
nodes as a limit. # of blocks is superior. The 100k was chosen as a
conservative number that will not lead to overly long wake-up times, which is
the point of this limit. In fact, with this patch we should see far more
predictable pause times for decommission work even with the old config. In
addition, it'll also result in an improvement in overall decommission speed
because of the incremental scan logic.
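To make the # blocks vs. # nodes point concrete, here is a minimal Python sketch of a per-wake-up scan bounded by blocks rather than nodes. It is an illustration only, not the code in the patch: `Node`, `scan_tick`, `cursor`, and `block_limit` are hypothetical names, and the actual replication check on each block is replaced by simply advancing a counter.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a decommissioning datanode."""
    name: str
    blocks: list     # block IDs that still need to be checked
    cursor: int = 0  # resume point, so a scan can continue on the next wake-up


def scan_tick(pending, block_limit):
    """Process at most `block_limit` blocks in one wake-up.

    Returns (blocks_checked, names_of_nodes_finished_this_tick).
    Work per tick is bounded by blocks, so one very large node cannot
    stretch a single wake-up the way a per-node limit can.
    """
    checked = 0
    finished = []
    while pending and checked < block_limit:
        node = pending[0]
        remaining = len(node.blocks) - node.cursor
        step = min(remaining, block_limit - checked)
        # Assumption: a real implementation would check replication of
        # node.blocks[node.cursor : node.cursor + step] here.
        node.cursor += step
        checked += step
        if node.cursor == len(node.blocks):
            finished.append(pending.popleft().name)
    return checked, finished
```

With a limit of 100, a 250-block node followed by a 30-block node takes three wake-ups, and none of them checks more than 100 blocks. A per-node limit would instead spend the whole 250-block scan inside a single wake-up.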
Because of this, I do not see any advantage to keeping this old code around.
The old code is worse in terms of predictable pause times and overall
decommissioning speed. It also has other flaws that are corrected by this
patch. The new code is compatible with the old configuration. Splitting the
refactoring out into a separate patch would also require a lot of work.
I still plan to commit tomorrow.
> Refactor and improve decommissioning logic into DecommissionManager
> -------------------------------------------------------------------
>
> Key: HDFS-7411
> URL: https://issues.apache.org/jira/browse/HDFS-7411
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.5.1
> Reporter: Andrew Wang
> Assignee: Andrew Wang
> Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch,
> hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch,
> hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch,
> hdfs-7411.009.patch, hdfs-7411.010.patch
>
>
> Would be nice to split out decommission logic from DatanodeManager to
> DecommissionManager.