[
https://issues.apache.org/jira/browse/HDFS-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949978#comment-15949978
]
Vinitha Reddy Gankidi edited comment on HDFS-11384 at 3/30/17 10:29 PM:
------------------------------------------------------------------------
[~shv] I'm leaning towards (4) instead of (3).
{{isGoodBlockCandidate}} needs a global view of the block replicas. Also, there
is some additional logic to deal with erasure-coded (EC) blocks, and this may be
a blocker for reading from DNs. [~zhz], you probably have more context regarding
the EC blocks.
{code}
/**
 * Decide if the block/blockGroup is a good candidate to be moved from source
 * to target. A block is a good candidate if
 * 1. the block is not in the process of being moved/has not been moved;
 * 2. the block does not have a replica/internalBlock on the target;
 * 3. doing the move does not reduce the number of racks that the block has
 */
private boolean isGoodBlockCandidate(StorageGroup source, StorageGroup target,
    StorageType targetStorageType, DBlock block) {
{code}
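To illustrate why these checks need every replica location, here is a minimal sketch of the three conditions. It is not the balancer's actual code: {{Location}}, {{rackCount}}, and {{movedOrPending}} are hypothetical stand-ins for {{StorageGroup}}, {{DBlock}}, and the dispatcher's moved-block tracking, and the rack check is simplified to a distinct-rack count comparison. EC block groups are not handled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class GoodCandidateSketch {
    /** Hypothetical replica location: a storage identifier and the rack it sits on. */
    static final class Location {
        final String storage;
        final String rack;
        Location(String storage, String rack) {
            this.storage = storage;
            this.rack = rack;
        }
    }

    /**
     * Count distinct racks across all replicas. If source is non-null, the
     * replica on source is counted as if it already lived on target's rack.
     */
    static int rackCount(List<Location> replicas, Location source, Location target) {
        Set<String> racks = new HashSet<>();
        for (Location r : replicas) {
            boolean isSource = source != null && r.storage.equals(source.storage);
            racks.add(isSource ? target.rack : r.rack);
        }
        return racks.size();
    }

    static boolean isGoodCandidate(String blockId, List<Location> replicas,
            Location source, Location target, Set<String> movedOrPending) {
        // 1. The block is not being moved and has not been moved already.
        if (movedOrPending.contains(blockId)) {
            return false;
        }
        // 2. The target does not already hold a replica of the block.
        for (Location r : replicas) {
            if (r.storage.equals(target.storage)) {
                return false;
            }
        }
        // 3. The move must not reduce the number of racks (simplified check).
        return rackCount(replicas, source, target) >= rackCount(replicas, null, null);
    }

    public static void main(String[] args) {
        Location dn1 = new Location("dn1", "rackA");
        Location dn2 = new Location("dn2", "rackB");
        Location dn3 = new Location("dn3", "rackB");
        List<Location> replicas = Arrays.asList(dn1, dn2, dn3);
        Set<String> moved = new HashSet<>();

        // Moving the only rackA replica onto rackB drops the rack count from 2 to 1.
        System.out.println(isGoodCandidate("blk_1", replicas, dn1,
            new Location("dn4", "rackB"), moved)); // false
        // Moving a rackB replica to a new rack keeps (here, increases) rack diversity.
        System.out.println(isGoodCandidate("blk_1", replicas, dn2,
            new Location("dn5", "rackC"), moved)); // true
    }
}
```

Checks 2 and 3 both iterate over the full replica list, which is the information a single DN does not have on its own.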
I agree that (2) and (4) are complementary.
> Add option for balancer to disperse getBlocks calls to avoid NameNode's
> rpc.CallQueueLength spike
> -------------------------------------------------------------------------------------------------
>
> Key: HDFS-11384
> URL: https://issues.apache.org/jira/browse/HDFS-11384
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: balancer & mover
> Affects Versions: 2.7.3
> Reporter: yunjiong zhao
> Assignee: yunjiong zhao
> Attachments: balancer.day.png, balancer.week.png,
> HDFS-11384.001.patch, HDFS-11384.002.patch
>
>
> Running the balancer on a Hadoop cluster with more than 3000 DataNodes
> causes the NameNode's rpc.CallQueueLength to spike. We observed that this
> can cause HBase cluster failures due to RegionServer WAL timeouts.