[jira] [Created] (HDFS-7137) HDFS Federation -- Adding a new Namenode to an existing HDFS cluster Document Has an Error

2014-09-23 Thread zhangyubiao (JIRA)
zhangyubiao created HDFS-7137:
-

 Summary: HDFS Federation  -- Adding a new Namenode to an existing 
HDFS cluster Document Has an Error 
 Key: HDFS-7137
 URL: https://issues.apache.org/jira/browse/HDFS-7137
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.5.1
Reporter: zhangyubiao
Priority: Minor
 Fix For: 2.5.1


In the document
"HDFS Federation -- Adding a new Namenode to an existing HDFS cluster",
the command
 $HADOOP_PREFIX_HOME/bin/hdfs dfadmin -refreshNameNode datanode_host_name:datanode_rpc_port
should be
 $HADOOP_PREFIX_HOME/bin/hdfs dfsadmin -refreshNameNode datanode_host_name:datanode_rpc_port

The "s" is missing from "dfadmin".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7137) HDFS Federation -- Adding a new Namenode to an existing HDFS cluster Document Has an Error

2014-09-23 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao updated HDFS-7137:
--
Status: Patch Available  (was: Open)

 HDFS Federation  -- Adding a new Namenode to an existing HDFS cluster 
 Document Has an Error 
 

 Key: HDFS-7137
 URL: https://issues.apache.org/jira/browse/HDFS-7137
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.5.1
Reporter: zhangyubiao
Priority: Minor
  Labels: documentation
 Fix For: 2.5.1


 In the document
 "HDFS Federation -- Adding a new Namenode to an existing HDFS cluster",
 the command
  $HADOOP_PREFIX_HOME/bin/hdfs dfadmin -refreshNameNode datanode_host_name:datanode_rpc_port
 should be
  $HADOOP_PREFIX_HOME/bin/hdfs dfsadmin -refreshNameNode datanode_host_name:datanode_rpc_port
 The "s" is missing from "dfadmin".





[jira] [Commented] (HDFS-10373) HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver

2016-06-07 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15318664#comment-15318664
 ] 

zhangyubiao commented on HDFS-10373:


Thank you very much, [~cnauroth]. I will send an email to 
u...@hadoop.apache.org. I would like to know whether I have set the network 
configuration correctly for the NameNode; I have no experience with 
operating-system networking. 
By the way, can we parse hdfs-audit.log in real time to detect abnormal 
jobs? Thank you once again.
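
On the audit-log question, here is a minimal Python sketch of what such parsing could look like. It assumes each audit record contains the marker "FSNamesystem.audit:" followed by tab-separated key=value fields (allowed, ugi, ip, cmd, src, dst, perm); check this against the log format of your own Hadoop version. The function names and the threshold are hypothetical, chosen only for illustration.

```python
import time
from collections import Counter

# Assumed marker that precedes the key=value payload of an audit record.
AUDIT_MARKER = "FSNamesystem.audit:"


def parse_audit_line(line):
    """Return a dict of audit fields, or None if the line is not an audit record."""
    _, marker, payload = line.partition(AUDIT_MARKER)
    if not marker:
        return None
    fields = {}
    # Assumption: fields are separated by tabs and written as key=value.
    for part in payload.strip().split("\t"):
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def count_ops_by_user(lines):
    """Count audit records per (user, cmd) pair."""
    counts = Counter()
    for line in lines:
        fields = parse_audit_line(line)
        if fields is None or "ugi" not in fields:
            continue
        # "alice (auth:SIMPLE)" -> "alice"
        user = fields["ugi"].split(" ")[0]
        counts[(user, fields.get("cmd", "?"))] += 1
    return counts


def heavy_users(counts, threshold):
    """Return the sorted users whose total operation count reaches the threshold."""
    totals = Counter()
    for (user, _cmd), n in counts.items():
        totals[user] += n
    return sorted(user for user, n in totals.items() if n >= threshold)


def follow(handle, poll_seconds=1.0):
    """Yield lines appended to an open file, similar to `tail -f` (runs until interrupted)."""
    handle.seek(0, 2)  # start at the current end of the file
    while True:
        line = handle.readline()
        if line:
            yield line
        else:
            time.sleep(poll_seconds)
```

For near-real-time use, the lines produced by follow() on an open hdfs-audit.log could be collected in short time windows and each window passed to count_ops_by_user(); a user whose count in a window is far above the usual level would be a candidate for an abnormal job. Log rotation is not handled in this sketch.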

> HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
> 
>
> Key: HDFS-10373
> URL: https://issues.apache.org/jira/browse/HDFS-10373
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.2.0
> Environment: CentOS6.5 Hadoop-2.2.0  
>Reporter: zhangyubiao
> Attachments: screenshot-1.png, 屏幕快照_2016-05-06_上午10.17.22.png
>
>
> The HDFS ZKFC HealthMonitor throws an exception:
> 2016-05-05 02:00:59,475 WARN org.apache.hadoop.ha.HealthMonitor: 
> Transport-level exception trying to monitor health of NameNode at 
> XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX:8021: Failed on local exception: 
> java.io.IOException: Connection reset by peer; Host Details : local host is: 
> "XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX"; destination host is: 
> "XXX-XXX-XXX-hadoop.jd.local":8021;
> This causes an HA automatic failover.




-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow

2016-04-07 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230045#comment-15230045
 ] 

zhangyubiao commented on HDFS-10246:


[~mingma], thank you. I disabled the retry cache and set 
dfs.namenode.checkpoint.txns to 1000, and the standby returned to normal.
However, performance spikes still occur every three minutes.
We have dfs.ha.tail-edits.period set to 60s and 
dfs.ha.log-roll.period set to 120s. Should we reduce these values?

I have attached the RPC CallQueueLength image for the standby.


> Standby NameNode  dfshealth.jsp   Response very slow 
> -
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow





[jira] [Updated] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow

2016-04-07 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao updated HDFS-10246:
---
Attachment: 屏幕快照 2016-04-07 下午6.04.36.png

> Standby NameNode  dfshealth.jsp   Response very slow 
> -
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt, 屏幕快照 2016-04-07 下午6.04.36.png
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow





[jira] [Commented] (HDFS-10246) HDFS Standby NameNode dfshealth.jsp Response very slow

2016-04-01 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221798#comment-15221798
 ] 

zhangyubiao commented on HDFS-10246:


[~mingma], is this the same as https://issues.apache.org/jira/browse/HDFS-7609 ? 
Would you take a look?

> HDFS Standby NameNode  dfshealth.jsp   Response very slow 
> --
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow





[jira] [Commented] (HDFS-10246) HDFS Standby NameNode dfshealth.jsp Response very slow

2016-04-01 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221814#comment-15221814
 ] 

zhangyubiao commented on HDFS-10246:


[~vinayrpet], I know that the latest versions of Hadoop replaced dfshealth.jsp 
with the new HTML5 web UI, dfshealth.html, but we still use Hadoop-2.2.0. 
We opened the JMX metrics page and found it very slow too. 
Sometimes the Active NameNode also loses a node, which recovers after a short 
time. 
I have attached the stack traces for the Standby NameNode.

> HDFS Standby NameNode  dfshealth.jsp   Response very slow 
> --
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow





[jira] [Created] (HDFS-10246) HDFS Standby NameNode dfshealth.jsp Response very slow

2016-04-01 Thread zhangyubiao (JIRA)
zhangyubiao created HDFS-10246:
--

 Summary: HDFS Standby NameNode  dfshealth.jsp   Response very slow 
 Key: HDFS-10246
 URL: https://issues.apache.org/jira/browse/HDFS-10246
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.2.0
 Environment: CentOS6.3  Hadoop-2.2.0 
Reporter: zhangyubiao


HDFS Standby NameNode  dfshealth.jsp   Response very slow





[jira] [Updated] (HDFS-10246) HDFS Standby NameNode dfshealth.jsp Response very slow

2016-04-01 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao updated HDFS-10246:
---
Attachment: stacks.txt

> HDFS Standby NameNode  dfshealth.jsp   Response very slow 
> --
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow





[jira] [Updated] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow

2016-04-01 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao updated HDFS-10246:
---
Summary: Standby NameNode  dfshealth.jsp   Response very slow   (was: HDFS 
Standby NameNode  dfshealth.jsp   Response very slow )

> Standby NameNode  dfshealth.jsp   Response very slow 
> -
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow





[jira] [Updated] (HDFS-10249) ActiveNameNode downloadImageToStorage Cause Rpc Respone Slowly

2016-04-01 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao updated HDFS-10249:
---
Summary: ActiveNameNode downloadImageToStorage Cause Rpc  Respone  Slowly  
(was: HDFS ActiveNameNode downloadImageToStorage Cause Rpc  Respone  Slowly)

> ActiveNameNode downloadImageToStorage Cause Rpc  Respone  Slowly
> 
>
> Key: HDFS-10249
> URL: https://issues.apache.org/jira/browse/HDFS-10249
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
> Environment: CentOS6.3 Hadoop-2.2.0 
>Reporter: zhangyubiao
>
> On the HDFS Active NameNode, downloadImageToStorage causes slow RPC responses, 
> which makes DataNodes appear dead. The DataNodes recover when the fsimage 
> download finishes.





[jira] [Created] (HDFS-10249) HDFS ActiveNameNode downloadImageToStorage Cause Rpc Respone Slowly

2016-04-01 Thread zhangyubiao (JIRA)
zhangyubiao created HDFS-10249:
--

 Summary: HDFS ActiveNameNode downloadImageToStorage Cause Rpc  
Respone  Slowly
 Key: HDFS-10249
 URL: https://issues.apache.org/jira/browse/HDFS-10249
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
 Environment: CentOS6.3 Hadoop-2.2.0 
Reporter: zhangyubiao


On the HDFS Active NameNode, downloadImageToStorage causes slow RPC responses, 
which makes DataNodes appear dead. The DataNodes recover when the fsimage 
download finishes.





[jira] [Commented] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow

2016-04-01 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1537#comment-1537
 ] 

zhangyubiao commented on HDFS-10246:


[~mingma], thank you.

> Standby NameNode  dfshealth.jsp   Response very slow 
> -
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow





[jira] [Commented] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow

2016-04-01 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222629#comment-15222629
 ] 

zhangyubiao commented on HDFS-10246:


Or could I reduce the dfs.namenode.retrycache.expirytime.millis value 
and increase the dfs.namenode.retrycache.heap.percent value 
to ease this problem?

> Standby NameNode  dfshealth.jsp   Response very slow 
> -
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow





[jira] [Commented] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow

2016-04-01 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222592#comment-15222592
 ] 

zhangyubiao commented on HDFS-10246:


[~mingma], is it OK to disable the retry cache on the standby?

> Standby NameNode  dfshealth.jsp   Response very slow 
> -
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow





[jira] [Commented] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow

2016-05-20 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292964#comment-15292964
 ] 

zhangyubiao commented on HDFS-10246:


[~kihwal], thanks. It is really helpful for us.

> Standby NameNode  dfshealth.jsp   Response very slow 
> -
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt, 屏幕快照 2016-04-07 下午6.04.36.png
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow






[jira] [Updated] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow

2016-05-19 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao updated HDFS-10246:
---
External issue ID:   (was: HDFS-7609)

> Standby NameNode  dfshealth.jsp   Response very slow 
> -
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt, 屏幕快照 2016-04-07 下午6.04.36.png
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow






[jira] [Updated] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow

2016-05-19 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao updated HDFS-10246:
---
External issue ID: HDFS-7609

> Standby NameNode  dfshealth.jsp   Response very slow 
> -
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt, 屏幕快照 2016-04-07 下午6.04.36.png
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow






[jira] [Resolved] (HDFS-10246) Standby NameNode dfshealth.jsp Response very slow

2016-05-19 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao resolved HDFS-10246.

Resolution: Duplicate

> Standby NameNode  dfshealth.jsp   Response very slow 
> -
>
> Key: HDFS-10246
> URL: https://issues.apache.org/jira/browse/HDFS-10246
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.3  Hadoop-2.2.0 
>Reporter: zhangyubiao
>  Labels: bug
> Attachments: stacks.txt, 屏幕快照 2016-04-07 下午6.04.36.png
>
>
> HDFS Standby NameNode  dfshealth.jsp   Response very slow






[jira] [Created] (HDFS-10373) HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver

2016-05-06 Thread zhangyubiao (JIRA)
zhangyubiao created HDFS-10373:
--

 Summary: HDFS ZKFC HealthMonitor Throw a Exception Cause 
AutoFailOver
 Key: HDFS-10373
 URL: https://issues.apache.org/jira/browse/HDFS-10373
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: auto-failover
Affects Versions: 2.2.0
 Environment: CentOS
Reporter: zhangyubiao


The HDFS ZKFC HealthMonitor throws an exception:
2016-05-05 02:00:59,475 WARN org.apache.hadoop.ha.HealthMonitor: 
Transport-level exception trying to monitor health of NameNode at 
XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX:8021: Failed on local exception: 
java.io.IOException: Connection reset by peer; Host Details : local host is: 
"XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX"; destination host is: 
"XXX-XXX-XXX-hadoop.jd.local":8021;

This causes an HA automatic failover.






[jira] [Updated] (HDFS-10373) HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver

2016-05-06 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao updated HDFS-10373:
---
Attachment: screenshot-1.png

> HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
> 
>
> Key: HDFS-10373
> URL: https://issues.apache.org/jira/browse/HDFS-10373
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.2.0
> Environment: CentOS
>Reporter: zhangyubiao
> Attachments: screenshot-1.png
>
>
> The HDFS ZKFC HealthMonitor throws an exception:
> 2016-05-05 02:00:59,475 WARN org.apache.hadoop.ha.HealthMonitor: 
> Transport-level exception trying to monitor health of NameNode at 
> XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX:8021: Failed on local exception: 
> java.io.IOException: Connection reset by peer; Host Details : local host is: 
> "XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX"; destination host is: 
> "XXX-XXX-XXX-hadoop.jd.local":8021;
> This causes an HA automatic failover.






[jira] [Updated] (HDFS-10373) HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver

2016-05-06 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao updated HDFS-10373:
---
Attachment: 屏幕快照_2016-05-06_上午10.17.22.png

> HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
> 
>
> Key: HDFS-10373
> URL: https://issues.apache.org/jira/browse/HDFS-10373
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.2.0
> Environment: CentOS
>Reporter: zhangyubiao
> Attachments: screenshot-1.png, 屏幕快照_2016-05-06_上午10.17.22.png
>
>
> The HDFS ZKFC HealthMonitor throws an exception:
> 2016-05-05 02:00:59,475 WARN org.apache.hadoop.ha.HealthMonitor: 
> Transport-level exception trying to monitor health of NameNode at 
> XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX:8021: Failed on local exception: 
> java.io.IOException: Connection reset by peer; Host Details : local host is: 
> "XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX"; destination host is: 
> "XXX-XXX-XXX-hadoop.jd.local":8021;
> This causes an HA automatic failover.






[jira] [Updated] (HDFS-10373) HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver

2016-05-06 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao updated HDFS-10373:
---
Environment: CentOS6.5 Hadoop-2.2.0(was: CentOS)

> HDFS ZKFC HealthMonitor Throw a Exception Cause AutoFailOver
> 
>
> Key: HDFS-10373
> URL: https://issues.apache.org/jira/browse/HDFS-10373
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.2.0
> Environment: CentOS6.5 Hadoop-2.2.0  
>Reporter: zhangyubiao
> Attachments: screenshot-1.png, 屏幕快照_2016-05-06_上午10.17.22.png
>
>
> The HDFS ZKFC HealthMonitor throws an exception:
> 2016-05-05 02:00:59,475 WARN org.apache.hadoop.ha.HealthMonitor: 
> Transport-level exception trying to monitor health of NameNode at 
> XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX:8021: Failed on local exception: 
> java.io.IOException: Connection reset by peer; Host Details : local host is: 
> "XXX-XXX-XXX-hadoop.jd.local/172.22.171.XX"; destination host is: 
> "XXX-XXX-XXX-hadoop.jd.local":8021;
> This causes an HA automatic failover.






[jira] [Updated] (HDFS-11319) HDFS FSNamesystem LeaseManager.findPath BLOCK ALL FSNamesystem Ops

2017-01-13 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao updated HDFS-11319:
---
Issue Type: Sub-task  (was: Bug)
Parent: HDFS-11318

> HDFS FSNamesystem LeaseManager.findPath BLOCK ALL FSNamesystem Ops
> --
>
> Key: HDFS-11319
> URL: https://issues.apache.org/jira/browse/HDFS-11319
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS6.5 Hadoop-2.2.0 
>Reporter: zhangyubiao
>Priority: Critical
>
> "IPC Server handler 69 on 8021" daemon prio=10 tid=0x7f0714c59000 nid=0x17a23 runnable [0x7eee3ec2f000]
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.hadoop.hdfs.server.namenode.INode.compareTo(INode.java:641)
>     at org.apache.hadoop.hdfs.server.namenode.INode.compareTo(INode.java:52)
>     at org.apache.hadoop.hdfs.util.ReadOnlyList$Util.binarySearch(ReadOnlyList.java:73)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.getChild(INodeDirectory.java:323)
>     at org.apache.hadoop.hdfs.server.namenode.INodesInPath.resolve(INodesInPath.java:216)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.getLastINodeInPath(INodeDirectory.java:330)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getLastINodeInPath(FSDirectory.java:1655)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getINode(FSDirectory.java:1645)
>     at org.apache.hadoop.hdfs.server.namenode.LeaseManager$Lease.findPath(LeaseManager.java:259)
>     at org.apache.hadoop.hdfs.server.namenode.LeaseManager$Lease.access$300(LeaseManager.java:228)
>     at org.apache.hadoop.hdfs.server.namenode.LeaseManager.findPath(LeaseManager.java:189)
>     - locked <0x7ef67f8fe698> (a org.apache.hadoop.hdfs.server.namenode.LeaseManager)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.closeFileCommitBlocks(FSNamesystem.java:4020)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:3989)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:647)
>     at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolServerSideTranslatorPB.java:241)
>     at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:24093)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)






[jira] [Commented] (HDFS-11326) FSNamesystem closeFileCommitBlocks block FSNamesystem Ops

2017-08-20 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134650#comment-16134650
 ] 

zhangyubiao commented on HDFS-11326:


Thanks, [~arpitagarwal]. This happened because of a network problem: I created 
the issue but could not see it, so I filed it again and it ended up duplicated.

> FSNamesystem closeFileCommitBlocks block FSNamesystem Ops
> -
>
> Key: HDFS-11326
> URL: https://issues.apache.org/jira/browse/HDFS-11326
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
> Environment: CentOS 6.5 Hadoop-2.2.0 
>Reporter: zhangyubiao
>Priority: Critical
>
> It seems that String src = leaseManager.findPath(pendingFile); holds the 
> write lock for too long.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Commented] (HDFS-5970) callers of NetworkTopology's chooseRandom method to expect null return value

2017-05-23 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021654#comment-16021654
 ] 

zhangyubiao commented on HDFS-5970:
---

[~olegd], what steps did you take to reproduce this?

> callers of NetworkTopology's chooseRandom method to expect null return value
> 
>
> Key: HDFS-5970
> URL: https://issues.apache.org/jira/browse/HDFS-5970
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Yongjun Zhang
>Priority: Minor
>
> Class NetworkTopology's method
>public Node chooseRandom(String scope) 
> calls 
>private Node chooseRandom(String scope, String excludedScope)
> which may return null value.
> Callers of this method, such as BlockPlacementPolicyDefault, need to be 
> aware of that.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
