[
https://issues.apache.org/jira/browse/HDFS-15414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
YCozy updated HDFS-15414:
-
Description:
We observed this exception in a DataNode's log while we are not shutting down
any nodes in the
[
https://issues.apache.org/jira/browse/HDFS-15414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
YCozy updated HDFS-15414:
-
Description:
We observed this exception in a DataNode's log while we are not shutting down
any nodes in the
YCozy created HDFS-15414:
Summary: java.net.SocketException: Original Exception :
java.io.IOException: Broken pipe
Key: HDFS-15414
URL: https://issues.apache.org/jira/browse/HDFS-15414
Project: Hadoop HDFS
YCozy created HDFS-15367:
Summary: Fail to get file checksum even if there's an available
replica.
Key: HDFS-15367
URL: https://issues.apache.org/jira/browse/HDFS-15367
Project: Hadoop HDFS
Issue
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072805#comment-17072805
]
YCozy commented on HDFS-15235:
--
Hello [~weichiu], thanks for looking into this!
For triggering this bug,
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071727#comment-17071727
]
YCozy commented on HDFS-15235:
--
Hello [~weichiu], would you please help review the patch? Thanks!
>
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071006#comment-17071006
]
YCozy commented on HDFS-15235:
--
Hello [~ayushtkn], I've made the title more accurate. Could you please help
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17069462#comment-17069462
]
YCozy commented on HDFS-15235:
--
Also, NN2 shouldn't be killed because the fencing should be invoked only
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17069461#comment-17069461
]
YCozy commented on HDFS-15235:
--
Thank you [~ayushtkn]! Upon further analysis we found that NN1 did become
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
YCozy updated HDFS-15235:
-
Summary: Transient network failure during NameNode failover kills the
NameNode (was: Transient network failure
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17069442#comment-17069442
]
YCozy commented on HDFS-15235:
--
[~hemanthboyina], [~elgoiri], would you be so kind as to help review the patch?
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067649#comment-17067649
]
YCozy commented on HDFS-15235:
--
[~ayushtkn], could you please take a look at the patch? Thanks!
> Transient
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
YCozy updated HDFS-15235:
-
Attachment: HDFS-15235.001.patch
Status: Patch Available (was: Open)
Attaching a patch with both the UT
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066043#comment-17066043
]
YCozy commented on HDFS-15235:
--
[~ayushtkn] Thanks for looking at this! I'll try to upload a UT and a fix.
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066042#comment-17066042
]
YCozy commented on HDFS-15235:
--
A bit more info:
After NN2 fails to send back a response, haadmin first
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
YCozy updated HDFS-15235:
-
Description:
We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At
some point, NN1
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
YCozy updated HDFS-15235:
-
Description:
We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At
some point, NN1
[
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
YCozy updated HDFS-15235:
-
Description:
We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At
some point, NN1
YCozy created HDFS-15235:
Summary: Transient network failure during NameNode failover makes
cluster unavailable
Key: HDFS-15235
URL: https://issues.apache.org/jira/browse/HDFS-15235
Project: Hadoop HDFS
19 matches