[ 
https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472112#comment-13472112
 ] 

Sanjay Radia commented on HDFS-3077:
------------------------------------

{quote}
The updated journal file isn't sufficient because it doesn't record information 
about whether it was an accepted recovery proposal or whether it was just left 
over at the last write. You need to ensure the property that, if the recovery 
coordinator thinks a value is accepted, then no different recovery will be 
accepted in the future (otherwise you risk having two different finalized 
lengths for the same log segment). In order to do so, you need to wait until a 
quorum of nodes are Finalized before you know that any future recovery will be 
able to rely only on the finalization state.

I don't know enough about the details of the ZAB implementation to understand 
why they can get away without this, if in fact they can. My guess is that it's 
because the transaction IDs themselves have the epoch number as their high 
order bits, and hence you can't ever confuse the first txn of epoch N+1 with 
the last transaction of epoch N.
{quote}
Yes, ZAB avoids this because the epoch and txid are combined.
Let's please add the counterexample that you describe above to the doc (if it 
is already there, just add a comment that the example 
explains why the extra persistent info is needed).
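
To make the epoch/txid point concrete, here is a minimal sketch of an ID that carries the epoch in its high-order bits. The class and method names (EpochTxId, pack, epochOf, counterOf) are hypothetical and are not part of the ZooKeeper or HDFS code. The 32/32 bit split is an assumption modeled on ZooKeeper's zxid layout, and the sketch assumes non-negative epochs.

```java
public final class EpochTxId {
    private EpochTxId() {}

    // Hypothetical helper: put a 32-bit epoch in the high-order bits and a
    // 32-bit per-epoch counter in the low-order bits of a single long.
    public static long pack(int epoch, int counter) {
        return ((long) epoch << 32) | (counter & 0xFFFFFFFFL);
    }

    // Recover the epoch from the high-order 32 bits.
    public static int epochOf(long id) {
        return (int) (id >>> 32);
    }

    // Recover the per-epoch counter from the low-order 32 bits.
    public static int counterOf(long id) {
        return (int) id;
    }

    public static void main(String[] args) {
        long lastOfEpochN = pack(5, Integer.MAX_VALUE);
        long firstOfNextEpoch = pack(6, 0);

        // Any ID from epoch N+1 sorts after every ID from epoch N, so the
        // two can never be mistaken for one another.
        System.out.println(lastOfEpochN < firstOfNextEpoch); // true
        System.out.println(epochOf(firstOfNextEpoch));       // 6
    }
}
```

With plain txids that do not embed the epoch, the ID alone cannot tell a transaction left over from the last write apart from one written under a later epoch, which is why the journal needs the extra persistent recovery state.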
                
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: QuorumJournalManager (HDFS-3077)
>
>         Attachments: hdfs-3077-partial.txt, hdfs-3077-test-merge.txt, 
> hdfs-3077.txt, hdfs-3077.txt, hdfs-3077.txt, hdfs-3077.txt, hdfs-3077.txt, 
> hdfs-3077.txt, hdfs-3077.txt, qjournal-design.pdf, qjournal-design.pdf, 
> qjournal-design.pdf, qjournal-design.pdf, qjournal-design.pdf, 
> qjournal-design.pdf, qjournal-design.tex, qjournal-design.tex
>
>
> Currently, one of the weak points of the HA design is that it relies on 
> shared storage such as an NFS filer for the shared edit log. One alternative 
> that has been proposed is to depend on BookKeeper, a ZooKeeper subproject 
> which provides a highly available replicated edit log on commodity hardware. 
> This JIRA is to implement another alternative, based on a quorum commit 
> protocol, integrated more tightly in HDFS and with the requirements driven 
> only by HDFS's needs rather than more generic use cases. More details to 
> follow.
