[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2018-05-31 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6440:

Hadoop Flags: Incompatible change,Reviewed  (was: Reviewed)

> Support more than 2 NameNodes
> -
>
> Key: HDFS-6440
> URL: https://issues.apache.org/jira/browse/HDFS-6440
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: auto-failover, ha, namenode
>Affects Versions: 2.4.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>Priority: Major
> Fix For: 3.0.0-alpha1
>
> Attachments: Multiple-Standby-NameNodes_V1.pdf, 
> hdfs-6440-cdh-4.5-full.patch, hdfs-6440-trunk-v1.patch, 
> hdfs-6440-trunk-v1.patch, hdfs-6440-trunk-v3.patch, hdfs-6440-trunk-v4.patch, 
> hdfs-6440-trunk-v5.patch, hdfs-6440-trunk-v6.patch, hdfs-6440-trunk-v7.patch, 
> hdfs-6440-trunk-v8.patch, hdfs-multiple-snn-trunk-v0.patch
>
>
> Most of the work is already done to support more than 2 NameNodes (one 
> active, one standby). This would be the last bit to support running multiple 
> _standby_ NameNodes; one of the standbys should be available for fail-over.
> Mostly, this is a matter of updating how we parse configurations, some 
> complexity around managing the checkpointing, and updating a whole lot of 
> tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2016-03-01 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6440:
--
Release Note: This feature adds support for running additional standby 
NameNodes, which provides additional fault-tolerance. It is designed for a 
total of 3-5 NameNodes.



Auto-Re: [jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2015-06-23 Thread wsb
Your email has been received! Thank you!

[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2015-06-23 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-6440:
-
  Resolution: Fixed
Target Version/s: 3.0.0  (was: 2.6.0)
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

I've just committed this change to trunk.

Thanks a lot for the monster contribution, Jesse. Thanks also very much to Eddy 
for doing a bunch of initial reviews, and to Lars for keeping on me to review 
this patch. :)

[~jesse_yates] - mind filing a follow-up JIRA to amend the docs appropriately?



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2015-06-18 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--
Status: Open  (was: Patch Available)



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2015-06-18 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--
Status: Patch Available  (was: Open)



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2015-06-18 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--
Attachment: hdfs-6440-trunk-v8.patch

Attaching updated patch w/ whitespace fix. Let's see what QA thinks of the 
upgrade test.



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2015-05-28 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--
Attachment: hdfs-6440-trunk-v6.patch

New version, hopefully fixing the findbugs/checkstyle issues and increasing the 
TestPipelinesFailover timeout to get it to pass.



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2015-05-28 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--
Attachment: hdfs-6440-trunk-v7.patch

OK, looks like I didn't fix the whitespace like I thought :-/ However, I manually 
fixed up the checkstyle/whitespace issues. Also, a slight improvement in 
TestPipelinesFailover to abstract cluster creation, b/c the rebase failed to 
update all relevant tests to run 3 NNs, causing periodic test failures. Now 
passing every time locally.

Hopefully, this should get the green light from QA :)



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2015-05-27 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--
Attachment: hdfs-6440-trunk-v5.patch

Attaching updated patch, rebased on latest trunk. My usual covering suite of 
mNN tests* passed locally a few times.

Notable changes:
- Moving the checkpoint lock so that it is taken only when a checkpoint 
actually needs to be taken (not a functional change, just a locking improvement)
- Cleaning up how we determine when to send checkpoints, so we only calculate 
whether to send one when we know that the checkpoint will actually be created.

(*) Command for the mNN test suite:
{code}
mvn clean test -Dtest=TestPipelinesFailover,TestRollingUpgrade,TestZKFailoverController,TestBookKeeperHACheckpoints,TestBlockToken,TestBackupNode,TestCheckpoint,TestDFSUpgradeFromImage,TestBootstrapStandby,TestBootstrapStandbyWithQJM,TestEditLogTailer,TestFailoverWithBlockTokens,TestHAConfiguration,TestRemoteNameNodeInfo,TestSeveralNameNodes,TestStandbyCheckpoints,TestDNFencingWithReplication
{code}



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2015-05-21 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--
Attachment: hdfs-6440-trunk-v4.patch

Attaching updated patch. Working through some local test failures - they seem 
like they might just be due to rebase changes? Looking into it.

Changes of note:
* Fixing concurrent checkpoint management - was breaking TestRollingUpgrade - 
to not keep around completed checkpoints
* Adding tests to TestRollingUpgrade
* Removing random seed setting in TestPipelinesFailover
* Fixing startup option setting in MiniDFSCluster#restartNode
* Fixing block manager to use correct nnid lookup

FYI, on vacation through Memorial Day, so not going to be doing much for the 
next few days. Back on Tuesday.



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2015-05-06 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--
Attachment: hdfs-6440-trunk-v3.patch

Attaching patch updated on trunk + [~atm]'s comments (less ones that didn't 
seem to apply). Haven't run local tests since changes seemed innocuous... 
hoping that HadoopQA bot can handle this on its own.



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2015-05-06 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--
Fix Version/s: 3.0.0
   Status: Patch Available  (was: Open)



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2015-01-17 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--
Attachment: hdfs-6440-trunk-v1.patch

Attaching patch addressing round 2 of comments. Thanks for the feedback - it's 
getting better every round!



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2014-12-16 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--
Attachment: hdfs-6440-trunk-v1.patch

Updated version of the patch as per excellent review comments (thanks 
[~eddyxu]!). It will probably need another rebase before it goes in as well, 
but for the moment I wanted to minimize the deltas until everyone is happy.



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2014-10-27 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--
Attachment: Multiple-Standby-NameNodes_V1.pdf
hdfs-multiple-snn-trunk-v0.patch

Attaching a patch on top of trunk (at least as of a couple weeks ago).

Also, attaching a design doc as a guide for anyone who wants to take on 
reviewing this one :)

FWIW, we are running this patch in production at Salesforce(1), added 
additional unit tests that pass alongside the original unit tests, and did 
extensive load testing under adverse conditions via m/r (see design doc).

(1) well, on top of the latest CDH release :)



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2014-07-02 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated HDFS-6440:
---

Target Version/s: 2.6.0  (was: 2.5.0)



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2014-05-20 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HDFS-6440:
--

Attachment: hdfs-6440-cdh-4.5-full.patch

Attaching patch for CDH 4.5.0, since this is what we run on at Salesforce. I'll 
update to the proper open source branches once I've got some consensus that 
this is the 'right' way to go about doing these changes.

For what it's worth, all the unit tests have passed (at one point.. they are a 
bit flaky :)) and we've been doing some m/r based load tests with a chaos 
monkey(1) and have been successful (2).

As mentioned in the issue description, most of the complexity is in the 
checkpointing. For this, I went with a 'first writer wins' approach. From 
the standpoint of the standby node, if your checkpoint isn't accepted (the 
other NN got one there first) then you back off for 2x the usual wait time 
before trying to send it again. I had to add another response code to the 
GetImageServlet to support the 'someone else won' logic - it's not the cleanest 
solution, as other HTTP response codes fit better, but they are already being 
used to indicate other failure cases.
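
The 'first writer wins' rule described above can be sketched roughly as follows. This is an illustrative model only, not the code in the patch: FirstWriterWinsSketch, ActiveNameNode, offerCheckpoint, UploadResult and nextWaitMillis are hypothetical names, the enum merely stands in for the extra HTTP response code, the "newer transaction id wins" acceptance rule is an assumption, and the 60-second usual wait is an arbitrary value.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative model of "first writer wins" checkpoint uploads; all names are hypothetical. */
final class FirstWriterWinsSketch {

    /** Stand-in for the HTTP response codes; SOMEONE_ELSE_WON models the extra code. */
    enum UploadResult { ACCEPTED, SOMEONE_ELSE_WON }

    /** Stand-in for the NameNode that receives checkpoint images. */
    static final class ActiveNameNode {
        private final AtomicLong latestAcceptedTxId = new AtomicLong(-1L);

        /** Accepts a checkpoint only if it is newer than the newest one already accepted (assumed rule). */
        UploadResult offerCheckpoint(long txId) {
            while (true) {
                long current = latestAcceptedTxId.get();
                if (txId <= current) {
                    return UploadResult.SOMEONE_ELSE_WON;
                }
                if (latestAcceptedTxId.compareAndSet(current, txId)) {
                    return UploadResult.ACCEPTED;
                }
                // Lost a race with a concurrent upload; re-read and decide again.
            }
        }
    }

    /** A standby whose upload lost waits twice the usual interval before retrying. */
    static long nextWaitMillis(UploadResult result, long usualWaitMillis) {
        return result == UploadResult.SOMEONE_ELSE_WON ? 2 * usualWaitMillis : usualWaitMillis;
    }

    public static void main(String[] args) {
        ActiveNameNode active = new ActiveNameNode();
        long usualWaitMillis = 60_000L; // arbitrary value for the example

        // Two standbys try to upload a checkpoint for the same transaction id.
        UploadResult standbyA = active.offerCheckpoint(100L);
        UploadResult standbyB = active.offerCheckpoint(100L);

        System.out.println("standby A: " + standbyA + ", next wait "
                + nextWaitMillis(standbyA, usualWaitMillis) + " ms");
        System.out.println("standby B: " + standbyB + ", next wait "
                + nextWaitMillis(standbyB, usualWaitMillis) + " ms");
    }
}
```

The compare-and-set loop is just one way to make "whoever gets there first" well defined when uploads arrive concurrently; the patch itself may well do this differently.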

Other notable changes:
 - EditLogTailer checks all NNs when rolling logs
 - BootstrapStandby uses all NameNodes when attempting bootstrap
 - update block token creation to segment integer space by NN id
 - updating NN dir creation to include ns index (3)
 - updated a lot of the tests to support testing across all the NNs, including 
HAStressTestHarness, and a circular linked list writing test
 - moved to using a multi-map of NNs in MiniDFSCluster as they are no longer 
limited to two NNs.
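
The first two items in the list above share one idea: instead of talking to a single "other" NN, try each configured NN in turn until one accepts the request. A minimal sketch of that loop, assuming a hypothetical helper firstSuccessful and made-up NN ids (nn1, nn2, nn3), none of which come from the patch:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Illustrative "try every NameNode" loop; names and ids are hypothetical. */
final class TryAllNameNodesSketch {

    /** Returns the id of the first NN for which the action succeeds, trying each in order. */
    static Optional<String> firstSuccessful(List<String> nnIds, Predicate<String> action) {
        for (String nnId : nnIds) {
            try {
                if (action.test(nnId)) {
                    return Optional.of(nnId);
                }
            } catch (RuntimeException e) {
                // Treat an unreachable NN like a refusal and move on to the next one.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> nnIds = Arrays.asList("nn1", "nn2", "nn3");

        // Pretend nn1 is down, nn2 is a standby that refuses, and nn3 is active.
        Predicate<String> rollEditLog = nnId -> {
            if (nnId.equals("nn1")) {
                throw new IllegalStateException("nn1 is unreachable");
            }
            return nnId.equals("nn3");
        };

        System.out.println(firstSuccessful(nnIds, rollEditLog).orElse("none"));
    }
}
```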

(1) each mapper writes a linked list of files, then ensures it can read it back
(2) required a bit of tuning to ride over reconnections once we started killing 
NNs more than every 60 seconds
(3) Not sure of the best way to update the tests for this. Right now I made 
some changes to TestDFSUpgradeFromImage, but that might need a little rework.
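
For the "segment integer space by NN id" item in the list of changes above, one way to picture it is to give each NN an equal, non-overlapping slice of the non-negative int range for its serial numbers. The sketch below assumes exactly that scheme; SerialNumberRangeSketch, next and owns are hypothetical names, and the real patch may divide the space differently.

```java
/** Illustrative per-NameNode serial number ranges; names and scheme are assumptions. */
final class SerialNumberRangeSketch {
    private final int rangeSize;
    private final int rangeStart;
    private int serialNo;

    SerialNumberRangeSketch(int nnIndex, int numNNs) {
        if (numNNs <= 0 || nnIndex < 0 || nnIndex >= numNNs) {
            throw new IllegalArgumentException("bad nnIndex/numNNs: " + nnIndex + "/" + numNNs);
        }
        // Split [0, Integer.MAX_VALUE) into numNNs equal slices; this NN owns slice nnIndex.
        this.rangeSize = Integer.MAX_VALUE / numNNs;
        this.rangeStart = rangeSize * nnIndex;
        this.serialNo = rangeStart;
    }

    /** Returns the next serial number, wrapping around inside this NN's own slice. */
    int next() {
        int current = serialNo;
        int offset = (serialNo - rangeStart + 1) % rangeSize;
        serialNo = rangeStart + offset;
        return current;
    }

    /** True if the serial number falls inside this NN's slice. */
    boolean owns(int serial) {
        return serial >= rangeStart && serial < rangeStart + rangeSize;
    }

    public static void main(String[] args) {
        int numNNs = 3;
        for (int nnIndex = 0; nnIndex < numNNs; nnIndex++) {
            SerialNumberRangeSketch nn = new SerialNumberRangeSketch(nnIndex, numNNs);
            System.out.println("NN " + nnIndex + " starts at serial " + nn.next());
        }
    }
}
```

With disjoint slices, two NNs can hand out serial numbers independently without ever colliding, and the owner of a given serial number can be recovered from its value alone.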



[jira] [Updated] (HDFS-6440) Support more than 2 NameNodes

2014-05-20 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-6440:
-

 Target Version/s: 2.5.0
Affects Version/s: 2.4.0
