[jira] [Commented] (HDFS-13183) Standby NameNode process getBlocks request to reduce Active load

2018-07-15 Thread xiaoli (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544463#comment-16544463
 ] 

xiaoli commented on HDFS-13183:
---

(y)

> Standby NameNode process getBlocks request to reduce Active load
> 
>
> Key: HDFS-13183
> URL: https://issues.apache.org/jira/browse/HDFS-13183
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: balancer & mover, namenode
> Affects Versions: 2.7.5, 3.1.0, 2.9.1, 2.8.4, 3.0.2
> Reporter: He Xiaoqiao
> Assignee: He Xiaoqiao
> Priority: Major
> Attachments: HDFS-13183-trunk.001.patch, HDFS-13183-trunk.002.patch, 
> HDFS-13183-trunk.003.patch
>
>
> The performance of the Active NameNode can suffer when {{Balancer}} calls
> #getBlocks, because querying the blocks of overly full DNs is currently
> extremely inefficient. The main reason is that
> {{NameNodeRpcServer#getBlocks}} holds the read lock for a long time. In the
> extreme case, all handlers of the Active NameNode RPC server are occupied by
> one reader, {{NameNodeRpcServer#getBlocks}}, and other write operation calls,
> so the Active NameNode appears dead for several seconds or even minutes.
> Similar Balancer performance concerns have been reported in HDFS-9412,
> HDFS-7967, etc.
> If the Standby NameNode can shoulder the heavy #getBlocks burden, it could
> speed up balancing and reduce the performance impact on the Active NameNode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10290) Move getBlocks calls to DataNode in Balancer

2018-05-06 Thread xiaoli (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465002#comment-16465002
 ] 

xiaoli commented on HDFS-10290:
---

Try: https://issues.apache.org/jira/browse/HDFS-13183

> Move getBlocks calls to DataNode in Balancer
> 
>
> Key: HDFS-10290
> URL: https://issues.apache.org/jira/browse/HDFS-10290
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: balancer & mover
> Affects Versions: 2.6.0
> Reporter: He Tianyi
> Priority: Major
>
> In the current implementation, Balancer asks the NameNode for a list of
> blocks on a specific DataNode. This makes the NameNode's workload heavier,
> and it actually made the NameNode flap when the average number of blocks on
> each DataNode reached 1,000,000 (NameNode heap size is 192GB, cpu: Xeon
> E5-2630 * 2).
> Recently I investigated whether the {{getBlocks}} invocation from Balancer
> can be handled by DataNodes, and it turned out to be practical.
> The only pitfall is that, since a DataNode has no information about the
> other locations of each block it holds, some block moves may fail (the
> target node may already have a replica of that particular block).
> I think this may be beneficial for large clusters.
> Any suggestions or comments?
> Thanks.
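
The pitfall described above can be illustrated with a minimal sketch. It assumes a full replica-location set per block, which is the information the description says a DataNode lacks. The class `MoveCandidateCheck`, the method `isSafeMove`, and the block and node names are hypothetical and are not taken from Hadoop or from any patch on this issue.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not Hadoop code. */
final class MoveCandidateCheck {

    /**
     * A move is safe only if the source holds the block and the target does
     * not already hold a replica. Answering this needs the full location
     * set, which a single DataNode does not know.
     */
    static boolean isSafeMove(Set<String> replicaLocations,
                              String source, String target) {
        return replicaLocations.contains(source)
            && !replicaLocations.contains(target);
    }

    public static void main(String[] args) {
        // Made-up block IDs and DataNode names.
        Map<String, Set<String>> locations = Map.of(
            "blk_1", Set.of("dn1", "dn2", "dn3"),
            "blk_2", Set.of("dn1", "dn4", "dn5"));

        // dn2 already has blk_1, so moving it from dn1 to dn2 would fail.
        System.out.println(isSafeMove(locations.get("blk_1"), "dn1", "dn2")); // false
        // dn2 has no replica of blk_2, so this move could proceed.
        System.out.println(isSafeMove(locations.get("blk_2"), "dn1", "dn2")); // true
    }
}
```

Without the location set, a scheduler working from DataNode-local data can only discover the first case when the move is rejected.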






[jira] [Commented] (HDFS-11384) Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike

2017-10-12 Thread xiaoli (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201578#comment-16201578
 ] 

xiaoli commented on HDFS-11384:
---

The patch1 looks good!(/)(/)

> Add option for balancer to disperse getBlocks calls to avoid NameNode's 
> rpc.CallQueueLength spike
> -
>
> Key: HDFS-11384
> URL: https://issues.apache.org/jira/browse/HDFS-11384
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: balancer & mover
> Affects Versions: 2.7.3
> Reporter: yunjiong zhao
> Assignee: Konstantin Shvachko
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
> Attachments: HDFS-11384-007.patch, HDFS-11384-branch-2.7.011.patch, 
> HDFS-11384-branch-2.8.011.patch, HDFS-11384.001.patch, HDFS-11384.002.patch, 
> HDFS-11384.003.patch, HDFS-11384.004.patch, HDFS-11384.005.patch, 
> HDFS-11384.006.patch, HDFS-11384.008.patch, HDFS-11384.009.patch, 
> HDFS-11384.010.patch, HDFS-11384.011.patch, balancer.day.png, 
> balancer.week.png
>
>
> Running the balancer on a Hadoop cluster with more than 3000 DataNodes
> causes the NameNode's rpc.CallQueueLength to spike. We observed that this
> can cause HBase cluster failures due to RegionServer WAL timeouts.
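
One way to picture "dispersing" calls is a client-side throttle that spaces requests out instead of sending them in a burst. The sketch below is only an illustration under that assumption. The class `GetBlocksThrottle`, its `reserve` method, and the rate of 20 calls per second are hypothetical and are not taken from the attached patches.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not Hadoop code: spaces calls at a fixed rate. */
final class GetBlocksThrottle {
    private final long minIntervalNanos;
    // Earliest time the next call may start; MIN_VALUE means "no call yet".
    private long nextAllowedNanos = Long.MIN_VALUE;

    GetBlocksThrottle(int maxCallsPerSecond) {
        if (maxCallsPerSecond <= 0) {
            throw new IllegalArgumentException("maxCallsPerSecond must be > 0");
        }
        this.minIntervalNanos = 1_000_000_000L / maxCallsPerSecond;
    }

    /**
     * Reserves the next call slot and returns how many nanoseconds the
     * caller should wait, measured from nowNanos, before issuing the call.
     */
    synchronized long reserve(long nowNanos) {
        long start = Math.max(nowNanos, nextAllowedNanos);
        nextAllowedNanos = start + minIntervalNanos;
        return start - nowNanos;
    }

    public static void main(String[] args) throws InterruptedException {
        GetBlocksThrottle throttle = new GetBlocksThrottle(20); // assumed rate
        for (int i = 0; i < 5; i++) {
            long waitNanos = throttle.reserve(System.nanoTime());
            TimeUnit.NANOSECONDS.sleep(waitNanos);
            // A real client would issue its RPC here; this only simulates it.
            System.out.println("issuing simulated getBlocks call " + i);
        }
    }
}
```

Passing the clock value into `reserve` keeps the spacing logic deterministic and easy to check without sleeping.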



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org