[ 
https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467205
 ] 

Konstantin Shvachko commented on HADOOP-702:
--------------------------------------------

I'd like to emphasize the changes in the design of the upgrade and in the
behavior of the system in general.
People expressed different opinions during the previous discussion, so if
anybody sees problems with the new approach, now would be a good time to
speak up.

- No FSSIDs means that there will be no possibility to create multiple
snapshots of the fs.
Only one snapshot can exist at any given time.
Something like what Doug calls above "filesystem checkpoints" will not be
possible any more.

- The requirement of an exact release version match means that administrators
will have no option to stop the name-node (without stopping the data-nodes)
and restart it with updated software, even if no changes to the data layout
or the data-node protocol have been made.

- Another important issue in the new design is that data-nodes will decide on
their own whether to upgrade or discard old fs state, based on a comparison of
the local data layout version (LV) and the name-node LV.
That is, even if you start the name-node in regular mode, some data-nodes that
missed previous upgrade(s) or discard(s) can decide to perform them on their
own.

I wrote a test that creates hard links to block files in a new directory. On my
machine, creating a hard link takes about 10 milliseconds, which is 6,000
blocks per minute.
Depending on how many blocks your data-nodes hold, you can calculate the
cluster startup delay.
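
As a rough sketch of that calculation: the snippet below assumes the 10 ms per
hard link measured above holds on your hardware, and that data-nodes create
their links in parallel, so the data-node with the most blocks sets the delay.
The function name and the 300,000-block example are hypothetical, chosen only
for illustration.

```python
def upgrade_delay_minutes(num_blocks: int, ms_per_link: float = 10.0) -> float:
    """Estimate the minutes one data-node needs to hard-link all its blocks.

    ms_per_link defaults to the 10 ms measured in the test described above;
    measure it on your own hardware before relying on the result.
    """
    seconds = num_blocks * ms_per_link / 1000.0
    return seconds / 60.0


if __name__ == "__main__":
    # Hypothetical data-node holding 300,000 blocks.
    print(upgrade_delay_minutes(300_000))
```

At 6,000 blocks per minute, a data-node with 6,000 blocks needs about one
minute, and the estimate scales linearly with the block count.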

> DFS Upgrade Proposal
> --------------------
>
>                 Key: HADOOP-702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-702
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>         Assigned To: Konstantin Shvachko
>         Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html, 
> DFSUpgradeProposal3.html, FSStateTransition.html, TestPlan-HdfsUpgrade.html, 
> TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case 
> of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost 
> automatic and minimize the chance of losing or corrupting data.
> Please see the attached html file for details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
