[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277694#comment-14277694
]
Colin Patrick McCabe commented on HDFS-7411:
--------------------------------------------
bq. This is actually a feature, not a bug. Having our own data structure lets
us speed up decom by only checking blocks that are still insufficiently
replicated. We prune out the sufficient ones each iteration. The memory
overhead here should be pretty small since it's just an 8B reference per block,
so with 1 million blocks this will be 8MB for a single node, or maybe 160MB for
a full rack. Nodes are typically smaller than this, so these are conservative
estimates, and large decoms aren't that common.
That's a fair point. It's too bad we can't use the existing list for this, but
it's already being re-ordered in the block report processing code, for a
different purpose. I agree that it's fine as-is.
bq. On thinking about it I agree that just using a new config option is fine,
but I'd prefer to define the DecomManager in terms of both an interval and an
amount of work, rather than a rate. This is more powerful, and more in-line
with the existing config. Are you okay with a new blocks.per.interval config?
Why is blocks.per.interval "more powerful" than blocks per minute? It just
seems annoying to have to do the mental math to figure out what to configure
to get a given blocks-per-minute rate. Also, the fact that "intervals" even
exist is an implementation detail... you could easily imagine an
event-triggered version that didn't do periodic polling. I guess I don't feel
strongly about this, but I'd like to understand the rationale more.
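The "mental math" in question is just rate times interval length. A minimal
sketch of the conversion, assuming a per-minute target rate and an interval
expressed in seconds (the function name and parameters are illustrative, not
actual HDFS config keys):

```python
def blocks_per_interval(target_blocks_per_minute: float,
                        interval_seconds: float) -> float:
    """Convert a desired blocks-per-minute rate into the per-interval
    work amount a blocks.per.interval-style config would need."""
    return target_blocks_per_minute * interval_seconds / 60
```

For example, to scan 1000 blocks per minute with a 30-second interval, the
config would need to be set to blocks_per_interval(1000, 30), i.e. 500.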
bq. I agree that it can lead to hangs. At a minimum, I'll add a "0 means no
limit" config, and maybe we can set that by default. I think that NNs should
really have enough heap headroom to handle the 10s to 100s of MBs of memory
for this; it's peanuts compared to the 10s of GBs that are quite typical.
Makes sense.
> Refactor and improve decommissioning logic into DecommissionManager
> -------------------------------------------------------------------
>
> Key: HDFS-7411
> URL: https://issues.apache.org/jira/browse/HDFS-7411
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.5.1
> Reporter: Andrew Wang
> Assignee: Andrew Wang
> Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch,
> hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch,
> hdfs-7411.006.patch
>
>
> Would be nice to split out decommission logic from DatanodeManager to
> DecommissionManager.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)