[jira] [Created] (HDFS-7137) HDFS Federation -- Adding a new Namenode to an existing HDFS cluster Document Has an Error
zhangyubiao created HDFS-7137:
---------------------------------

Summary: HDFS Federation -- Adding a new Namenode to an existing HDFS cluster Document Has an Error
Key: HDFS-7137
URL: https://issues.apache.org/jira/browse/HDFS-7137
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Affects Versions: 2.5.1
Reporter: zhangyubiao
Priority: Minor
Fix For: 2.5.1

In the document "HDFS Federation -- Adding a new Namenode to an existing HDFS cluster", the command

$HADOOP_PREFIX_HOME/bin/hdfs dfadmin -refreshNameNode datanode_host_name:datanode_rpc_port

should be

$HADOOP_PREFIX_HOME/bin/hdfs dfsadmin -refreshNameNode datanode_host_name:datanode_rpc_port

The "s" is missing from "dfsadmin".

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7137) HDFS Federation -- Adding a new Namenode to an existing HDFS cluster Document Has an Error
[ https://issues.apache.org/jira/browse/HDFS-7137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangyubiao updated HDFS-7137:
---------------------------------
Status: Patch Available (was: Open)

HDFS Federation -- Adding a new Namenode to an existing HDFS cluster Document Has an Error

Key: HDFS-7137
URL: https://issues.apache.org/jira/browse/HDFS-7137
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Affects Versions: 2.5.1
Reporter: zhangyubiao
Priority: Minor
Labels: documentation
Fix For: 2.5.1

In the document "HDFS Federation -- Adding a new Namenode to an existing HDFS cluster", the command

$HADOOP_PREFIX_HOME/bin/hdfs dfadmin -refreshNameNode datanode_host_name:datanode_rpc_port

should be

$HADOOP_PREFIX_HOME/bin/hdfs dfsadmin -refreshNameNode datanode_host_name:datanode_rpc_port

The "s" is missing from "dfsadmin".
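As a rough illustration of the command this report is about, the Python sketch below only builds the per-DataNode command lines as a dry run; it does not execute anything. The helper name, the install prefix, the host names and the port 50020 are assumptions made for the example. Note also that, to my knowledge, the dfsadmin usage text spells the option -refreshNamenodes rather than -refreshNameNode, so the sketch uses that spelling; please confirm it with "hdfs dfsadmin -help" on your version.

```python
def refresh_namenode_commands(datanodes, hadoop_prefix="/opt/hadoop"):
    """Build one dfsadmin command line per DataNode (dry run only).

    datanodes     -- iterable of (host, rpc_port) pairs (hypothetical values)
    hadoop_prefix -- install prefix; "/opt/hadoop" is an assumed default
    """
    hdfs = hadoop_prefix.rstrip("/") + "/bin/hdfs"
    commands = []
    for host, port in datanodes:
        # Option spelling assumed to be -refreshNamenodes; verify locally.
        commands.append([hdfs, "dfsadmin", "-refreshNamenodes", f"{host}:{port}"])
    return commands


if __name__ == "__main__":
    # Hypothetical DataNodes; 50020 is assumed to be the DataNode IPC port.
    for cmd in refresh_namenode_commands([("dn1.example.com", 50020),
                                          ("dn2.example.com", 50020)]):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to review them before running them by hand or passing each list to subprocess.run.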
[jira] [Commented] (HDFS-10373) HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
[ https://issues.apache.org/jira/browse/HDFS-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15318664#comment-15318664 ]

zhangyubiao commented on HDFS-10373:
------------------------------------

Thank you very much, [~cnauroth]. I will send an email to u...@hadoop.apache.org. I would like to know whether I have set the network configuration correctly for the NameNode; I have no experience with operating system networking. By the way, can we parse hdfs-audit.log in real time to detect abnormal jobs? Thank you once again.

> HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
> -------------------------------------------------------------
>
> Key: HDFS-10373
> URL: https://issues.apache.org/jira/browse/HDFS-10373
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: auto-failover
> Affects Versions: 2.2.0
> Environment: CentOS6.5 Hadoop-2.2.0
> Reporter: zhangyubiao
> Attachments: screenshot-1.png, 屏幕快照_2016-05-06_上午10.17.22.png
>
> HDFS ZKFC HealthMonitor Throw a Exception
> 2016-05-05 02:00:59,475 WARN org.apache.hadoop.ha.HealthMonitor: Transport-level exception trying to monitor health of NameNode at XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX:8021: Failed on local exception: java.io.IOException: Connection reset by peer; Host Details : local host is: "XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX"; destination host is: XXX-XXX-XXX-hadoop.jd.local":8021;
> Cause HA AutoFailOver

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
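On the question of parsing hdfs-audit.log to spot abnormal jobs, here is a minimal Python sketch of one possible approach: count operations per user and report the users at or above a threshold. It assumes the common audit line layout of tab-separated key=value fields (allowed=, ugi=, ip=, cmd=, src=, dst=, perm=); the function names, the sample lines and the threshold are hypothetical, and a real deployment would feed lines from a tail of the log and reset the counts per time window.

```python
import re
from collections import Counter

# One key=value field; values run up to the next tab (assumed layout).
FIELD_RE = re.compile(r"(\w+)=([^\t\n]*)")


def parse_audit_line(line):
    """Return the key=value fields of one audit log line as a dict."""
    return dict(FIELD_RE.findall(line))


def busy_users(lines, threshold):
    """Count operations per user and keep users with >= threshold ops."""
    counts = Counter()
    for line in lines:
        ugi = parse_audit_line(line).get("ugi", "")
        parts = ugi.split()
        if parts:
            # "alice (auth:SIMPLE)" -> "alice"
            counts[parts[0]] += 1
    return {user: n for user, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    prefix = "2016-05-05 02:00:59,475 INFO FSNamesystem.audit: "
    sample = [
        prefix + "allowed=true\tugi=alice (auth:SIMPLE)\tip=/10.0.0.1\tcmd=open\tsrc=/a\tdst=null\tperm=null",
        prefix + "allowed=true\tugi=alice (auth:SIMPLE)\tip=/10.0.0.1\tcmd=listStatus\tsrc=/b\tdst=null\tperm=null",
        prefix + "allowed=true\tugi=bob (auth:SIMPLE)\tip=/10.0.0.2\tcmd=open\tsrc=/c\tdst=null\tperm=null",
    ]
    print(busy_users(sample, threshold=2))
```

Grouping by the cmd or src field instead of ugi follows the same pattern if the goal is to find hot paths rather than heavy users.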
[jira] [Commented] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230045#comment-15230045 ]

zhangyubiao commented on HDFS-10246:
------------------------------------

[~mingma], thank you. I disabled the retry cache and set dfs.namenode.checkpoint.txns to 1000, and the standby became normal. However, performance spikes still occur every three minutes. We have dfs.ha.tail-edits.period set to 60s and dfs.ha.log-roll.period set to 120s. Should we reduce these values? I have uploaded an image of the standby's RPC CallQueueLength.

> Standby NameNode dfshealth.jsp Response very slow
> -------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Updated] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangyubiao updated HDFS-10246:
------------------------------
Attachment: 屏幕快照 2016-04-07 下午6.04.36.png

> Standby NameNode dfshealth.jsp Response very slow
> -------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt, 屏幕快照 2016-04-07 下午6.04.36.png
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Commented] (HDFS-10246) HDFS Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221798#comment-15221798 ]

zhangyubiao commented on HDFS-10246:
------------------------------------

[~mingma], is this the same as https://issues.apache.org/jira/browse/HDFS-7609? Would you take a look?

> HDFS Standby NameNode dfshealth.jsp Response very slow
> ------------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Commented] (HDFS-10246) HDFS Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221814#comment-15221814 ]

zhangyubiao commented on HDFS-10246:
------------------------------------

[~vinayrpet], I know that the latest versions of Hadoop replaced this page with the new HTML5 web UI, dfshealth.html, but we still use Hadoop-2.2.0. We opened the JMX metrics and found them very slow too. Sometimes the Active NameNode also loses a node, which recovers after a short time. I have uploaded the stacks for the Standby NameNode.

> HDFS Standby NameNode dfshealth.jsp Response very slow
> ------------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Created] (HDFS-10246) HDFS Standby NameNode dfshealth.jsp Response very slow
zhangyubiao created HDFS-10246:
----------------------------------

Summary: HDFS Standby NameNode dfshealth.jsp Response very slow
Key: HDFS-10246
URL: https://issues.apache.org/jira/browse/HDFS-10246
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0
Environment: CentOS6.3 Hadoop-2.2.0
Reporter: zhangyubiao

HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Updated] (HDFS-10246) HDFS Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangyubiao updated HDFS-10246:
------------------------------
Attachment: stacks.txt

> HDFS Standby NameNode dfshealth.jsp Response very slow
> ------------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Updated] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangyubiao updated HDFS-10246:
------------------------------
Summary: Standby NameNode dfshealth.jsp Response very slow (was: HDFS Standby NameNode dfshealth.jsp Response very slow)

> Standby NameNode dfshealth.jsp Response very slow
> -------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Updated] (HDFS-10249) ActiveNameNode downloadImageToStorage Cause Rpc Respone Slowly
[ https://issues.apache.org/jira/browse/HDFS-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangyubiao updated HDFS-10249:
------------------------------
Summary: ActiveNameNode downloadImageToStorage Cause Rpc Respone Slowly (was: HDFS ActiveNameNode downloadImageToStorage Cause Rpc Respone Slowly)

> ActiveNameNode downloadImageToStorage Cause Rpc Respone Slowly
> ---------------------------------------------------------------
>
> Key: HDFS-10249
> URL: https://issues.apache.org/jira/browse/HDFS-10249
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
>
> On the HDFS Active NameNode, downloadImageToStorage causes RPC responses to become slow, which makes DataNodes appear dead. The DataNodes recover once the fsimage download finishes.
[jira] [Created] (HDFS-10249) HDFS ActiveNameNode downloadImageToStorage Cause Rpc Respone Slowly
zhangyubiao created HDFS-10249:
----------------------------------

Summary: HDFS ActiveNameNode downloadImageToStorage Cause Rpc Respone Slowly
Key: HDFS-10249
URL: https://issues.apache.org/jira/browse/HDFS-10249
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Environment: CentOS6.3 Hadoop-2.2.0
Reporter: zhangyubiao

On the HDFS Active NameNode, downloadImageToStorage causes RPC responses to become slow, which makes DataNodes appear dead. The DataNodes recover once the fsimage download finishes.
[jira] [Commented] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1537#comment-1537 ]

zhangyubiao commented on HDFS-10246:
------------------------------------

[~mingma], thank you.

> Standby NameNode dfshealth.jsp Response very slow
> -------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Commented] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222629#comment-15222629 ]

zhangyubiao commented on HDFS-10246:
------------------------------------

Or could I reduce dfs.namenode.retrycache.expirytime.millis and increase dfs.namenode.retrycache.heap.percent to ease this problem?

> Standby NameNode dfshealth.jsp Response very slow
> -------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Commented] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222592#comment-15222592 ]

zhangyubiao commented on HDFS-10246:
------------------------------------

[~mingma], is it OK to disable the retry cache on the standby?

> Standby NameNode dfshealth.jsp Response very slow
> -------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Commented] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292964#comment-15292964 ]

zhangyubiao commented on HDFS-10246:
------------------------------------

[~kihwal], thanks. It is really helpful for us.

> Standby NameNode dfshealth.jsp Response very slow
> -------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt, 屏幕快照 2016-04-07 下午6.04.36.png
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Updated] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangyubiao updated HDFS-10246:
------------------------------
External issue ID: (was: HDFS-7609)

> Standby NameNode dfshealth.jsp Response very slow
> -------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt, 屏幕快照 2016-04-07 下午6.04.36.png
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Updated] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangyubiao updated HDFS-10246:
------------------------------
External issue ID: HDFS-7609

> Standby NameNode dfshealth.jsp Response very slow
> -------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt, 屏幕快照 2016-04-07 下午6.04.36.png
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Resolved] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow
[ https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangyubiao resolved HDFS-10246.
-------------------------------
Resolution: Duplicate

> Standby NameNode dfshealth.jsp Response very slow
> -------------------------------------------------
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.3 Hadoop-2.2.0
> Reporter: zhangyubiao
> Labels: bug
> Attachments: stacks.txt, 屏幕快照 2016-04-07 下午6.04.36.png
>
> HDFS Standby NameNode dfshealth.jsp Response very slow
[jira] [Created] (HDFS-10373) HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
zhangyubiao created HDFS-10373:
----------------------------------

Summary: HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
Key: HDFS-10373
URL: https://issues.apache.org/jira/browse/HDFS-10373
Project: Hadoop HDFS
Issue Type: Bug
Components: auto-failover
Affects Versions: 2.2.0
Environment: CentOS
Reporter: zhangyubiao

HDFS ZKFC HealthMonitor Throw a Exception
2016-05-05 02:00:59,475 WARN org.apache.hadoop.ha.HealthMonitor: Transport-level exception trying to monitor health of NameNode at XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX:8021: Failed on local exception: java.io.IOException: Connection reset by peer; Host Details : local host is: "XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX"; destination host is: XXX-XXX-XXX-hadoop.jd.local":8021;
Cause HA AutoFailOver
[jira] [Updated] (HDFS-10373) HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
[ https://issues.apache.org/jira/browse/HDFS-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangyubiao updated HDFS-10373:
------------------------------
Attachment: screenshot-1.png

> HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
> -------------------------------------------------------------
>
> Key: HDFS-10373
> URL: https://issues.apache.org/jira/browse/HDFS-10373
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: auto-failover
> Affects Versions: 2.2.0
> Environment: CentOS
> Reporter: zhangyubiao
> Attachments: screenshot-1.png
>
> HDFS ZKFC HealthMonitor Throw a Exception
> 2016-05-05 02:00:59,475 WARN org.apache.hadoop.ha.HealthMonitor: Transport-level exception trying to monitor health of NameNode at XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX:8021: Failed on local exception: java.io.IOException: Connection reset by peer; Host Details : local host is: "XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX"; destination host is: XXX-XXX-XXX-hadoop.jd.local":8021;
> Cause HA AutoFailOver
[jira] [Updated] (HDFS-10373) HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
[ https://issues.apache.org/jira/browse/HDFS-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangyubiao updated HDFS-10373:
------------------------------
Attachment: 屏幕快照_2016-05-06_上午10.17.22.png

> HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
> -------------------------------------------------------------
>
> Key: HDFS-10373
> URL: https://issues.apache.org/jira/browse/HDFS-10373
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: auto-failover
> Affects Versions: 2.2.0
> Environment: CentOS
> Reporter: zhangyubiao
> Attachments: screenshot-1.png, 屏幕快照_2016-05-06_上午10.17.22.png
>
> HDFS ZKFC HealthMonitor Throw a Exception
> 2016-05-05 02:00:59,475 WARN org.apache.hadoop.ha.HealthMonitor: Transport-level exception trying to monitor health of NameNode at XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX:8021: Failed on local exception: java.io.IOException: Connection reset by peer; Host Details : local host is: "XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX"; destination host is: XXX-XXX-XXX-hadoop.jd.local":8021;
> Cause HA AutoFailOver
[jira] [Updated] (HDFS-10373) HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
[ https://issues.apache.org/jira/browse/HDFS-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangyubiao updated HDFS-10373:
------------------------------
Environment: CentOS6.5 Hadoop-2.2.0 (was: CentOS)

> HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
> -------------------------------------------------------------
>
> Key: HDFS-10373
> URL: https://issues.apache.org/jira/browse/HDFS-10373
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: auto-failover
> Affects Versions: 2.2.0
> Environment: CentOS6.5 Hadoop-2.2.0
> Reporter: zhangyubiao
> Attachments: screenshot-1.png, 屏幕快照_2016-05-06_上午10.17.22.png
>
> HDFS ZKFC HealthMonitor Throw a Exception
> 2016-05-05 02:00:59,475 WARN org.apache.hadoop.ha.HealthMonitor: Transport-level exception trying to monitor health of NameNode at XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX:8021: Failed on local exception: java.io.IOException: Connection reset by peer; Host Details : local host is: "XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX"; destination host is: XXX-XXX-XXX-hadoop.jd.local":8021;
> Cause HA AutoFailOver
[jira] [Updated] (HDFS-11319) HDFS FSNamesystem LeaseManager.findPath BLOCK ALL FSNamesystem Ops
[ https://issues.apache.org/jira/browse/HDFS-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangyubiao updated HDFS-11319:
------------------------------
Issue Type: Sub-task (was: Bug)
Parent: HDFS-11318

> HDFS FSNamesystem LeaseManager.findPath BLOCK ALL FSNamesystem Ops
> -------------------------------------------------------------------
>
> Key: HDFS-11319
> URL: https://issues.apache.org/jira/browse/HDFS-11319
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS6.5 Hadoop-2.2.0
> Reporter: zhangyubiao
> Priority: Critical
>
> "IPC Server handler 69 on 8021" daemon prio=10 tid=0x7f0714c59000 nid=0x17a23 runnable [0x7eee3ec2f000]
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.hadoop.hdfs.server.namenode.INode.compareTo(INode.java:641)
>     at org.apache.hadoop.hdfs.server.namenode.INode.compareTo(INode.java:52)
>     at org.apache.hadoop.hdfs.util.ReadOnlyList$Util.binarySearch(ReadOnlyList.java:73)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.getChild(INodeDirectory.java:323)
>     at org.apache.hadoop.hdfs.server.namenode.INodesInPath.resolve(INodesInPath.java:216)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.getLastINodeInPath(INodeDirectory.java:330)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getLastINodeInPath(FSDirectory.java:1655)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getINode(FSDirectory.java:1645)
>     at org.apache.hadoop.hdfs.server.namenode.LeaseManager$Lease.findPath(LeaseManager.java:259)
>     at org.apache.hadoop.hdfs.server.namenode.LeaseManager$Lease.access$300(LeaseManager.java:228)
>     at org.apache.hadoop.hdfs.server.namenode.LeaseManager.findPath(LeaseManager.java:189)
>     - locked <0x7ef67f8fe698> (a org.apache.hadoop.hdfs.server.namenode.LeaseManager)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.closeFileCommitBlocks(FSNamesystem.java:4020)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:3989)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:647)
>     at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolServerSideTranslatorPB.java:241)
>     at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:24093)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
[jira] [Commented] (HDFS-11326) FSNamesystem closeFileCommitBlocks block FSNamesystem Ops
[ https://issues.apache.org/jira/browse/HDFS-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134650#comment-16134650 ]

zhangyubiao commented on HDFS-11326:
------------------------------------

Thanks, [~arpitagarwal]. This happened because of a network problem: I created the issue but could not see it, so it ended up being filed more than once.

> FSNamesystem closeFileCommitBlocks block FSNamesystem Ops
> ---------------------------------------------------------
>
> Key: HDFS-11326
> URL: https://issues.apache.org/jira/browse/HDFS-11326
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.2.0
> Environment: CentOS 6.5 Hadoop-2.2.0
> Reporter: zhangyubiao
> Priority: Critical
>
> It seems that the call String src = leaseManager.findPath(pendingFile); causes the write lock to be held for too long.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
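To make the suspicion about findPath a little more concrete, here is a toy Python model, not Hadoop code. It assumes, as the stack trace in HDFS-11319 suggests, that the lookup walks the paths held by a lease one by one and resolves each against the namespace until it finds the pending file. Under that assumption the work done while the lock is held grows with the number of files open under the lease. The class name, the dict-based namespace and the path names are all hypothetical.

```python
class ToyLease:
    """Toy stand-in for a lease that tracks the paths one client has open."""

    def __init__(self, paths):
        self.paths = list(paths)

    def find_path(self, namespace, pending_file, stats):
        """Linear scan: resolve each leased path until one matches."""
        for src in self.paths:
            stats["lookups"] += 1  # one namespace resolution per leased path
            if namespace.get(src) is pending_file:
                return src
        return None


def demo(num_paths):
    """Return (found path, number of lookups) for the worst case."""
    namespace = {f"/tmp/job/part-{i:05d}": object() for i in range(num_paths)}
    lease = ToyLease(namespace)  # iterating a dict yields keys in insertion order
    last_path = lease.paths[-1]
    stats = {"lookups": 0}
    found = lease.find_path(namespace, namespace[last_path], stats)
    return found, stats["lookups"]


if __name__ == "__main__":
    print(demo(1000))
```

In this model, a client holding many open files pays one resolution per leased path in the worst case, which is consistent with the handler thread being observed inside getINode while the LeaseManager monitor is locked.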
[jira] [Commented] (HDFS-5970) callers of NetworkTopology's chooseRandom method to expect null return value
[ https://issues.apache.org/jira/browse/HDFS-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021654#comment-16021654 ]

zhangyubiao commented on HDFS-5970:
-----------------------------------

[~olegd], what steps did you take to reproduce this?

> callers of NetworkTopology's chooseRandom method to expect null return value
> -----------------------------------------------------------------------------
>
> Key: HDFS-5970
> URL: https://issues.apache.org/jira/browse/HDFS-5970
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.0.0-alpha1
> Reporter: Yongjun Zhang
> Priority: Minor
>
> Class NetworkTopology's method
>    public Node chooseRandom(String scope)
> calls
>    private Node chooseRandom(String scope, String excludedScope)
> which may return a null value.
> Callers of this method, such as BlockPlacementPolicyDefault, need to be aware of that.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
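The point of HDFS-5970 is that a "choose random" call can legitimately come back empty and callers have to handle that. The Python sketch below is only a stand-in for the Java API: both function names, the flat list of node names and the choice of raising LookupError are assumptions for illustration, not how NetworkTopology or BlockPlacementPolicyDefault are actually written.

```python
import random
from typing import Optional, Sequence


def choose_random(nodes: Sequence[str], excluded: Sequence[str] = ()) -> Optional[str]:
    """Pick a random node outside the excluded set, or None if none is left."""
    candidates = [n for n in nodes if n not in excluded]
    if not candidates:
        return None  # the case callers must be prepared for
    return random.choice(candidates)


def pick_target(nodes: Sequence[str], excluded: Sequence[str] = ()) -> str:
    """Caller that checks for None instead of using the result blindly."""
    node = choose_random(nodes, excluded)
    if node is None:
        raise LookupError("no eligible node in scope")
    return node


if __name__ == "__main__":
    print(pick_target(["dn1", "dn2"], excluded=["dn1"]))
    try:
        pick_target(["dn1"], excluded=["dn1"])
    except LookupError as err:
        print("handled:", err)
```

Whether a caller should raise, retry with a wider scope, or fall back to another node is a policy decision; the sketch only shows that the empty result is checked explicitly.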