[ 
https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399741#comment-13399741
 ] 

Brandon Li commented on HDFS-3077:
----------------------------------

Hi Todd, 
I have read the new section but not the implementation yet, so please correct 
me if I am wrong. One possible improvement: if a JournalNode that has already 
received a prepare RPC with a higher newEpoch number (is that possible?) can 
inform the writer (proposer), the writer can exit earlier in step 2, 
"Choosing a recovery". 
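
To make the early-exit idea concrete, here is a rough sketch. It is not taken 
from the patch or the design doc: the function name, the exception, and the 
(jn_id, promised_epoch) response shape are all hypothetical, and it assumes 
each JN can report the highest epoch it has already promised.

```python
class EpochSuperseded(Exception):
    """Raised when a JN reports a promise to a higher epoch than ours."""


def gather_prepare_quorum(my_epoch, responses, quorum_size):
    """Collect prepare responses until a quorum is reached.

    `responses` yields hypothetical (jn_id, promised_epoch) pairs. The
    writer gives up as soon as any JN reports a higher epoch, instead of
    waiting for the remaining responses.
    """
    accepted = []
    for jn_id, promised_epoch in responses:
        if promised_epoch > my_epoch:
            # Another writer has already started a newer epoch: exit early.
            raise EpochSuperseded(
                f"{jn_id} already promised epoch {promised_epoch} > {my_epoch}"
            )
        accepted.append(jn_id)
        if len(accepted) >= quorum_size:
            return accepted
    raise RuntimeError("no quorum of prepare responses")
```

With three JNs and a quorum of two, the writer would return after the first 
two acceptable responses, or abort on the first response that carries a higher 
epoch.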

In step 3, "Accept RPC", I assume the URL that the writer sends to all the JNs 
is the URL of one JN that responded in step 2. If that JN becomes inaccessible 
immediately afterwards, so that the other JNs can't sync themselves by 
downloading the finalized segment from it, could the recovery process get stuck?

If it can get stuck in step 3, an alternative way to sync lagging JNs is to let 
them contact the other JNs in the quorum to download the finalized segments. 
Given the requirement "All loggers must finalize the segment to the same length 
and contents", all finalized segments with the same name should be identical on 
all JNs. Therefore, a lagging JN can download the segment from any other JN, as 
long as that JN has the file.
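
A rough sketch of that fallback, again with hypothetical names (the function, 
the candidate URL list, and the `download` callback are my assumptions, not 
part of the current design). It assumes `download` returns None when the JN 
does not have the file and raises OSError when the JN is unreachable.

```python
def fetch_finalized_segment(segment_name, candidate_urls, download):
    """Try each JN in turn until one serves the finalized segment.

    Returns (url, data) for the first JN that has the file, or None if no
    candidate could serve it. Because finalized segments with the same
    name are required to be identical, any source is acceptable.
    """
    for url in candidate_urls:
        try:
            data = download(url, segment_name)
        except OSError:
            # This JN is unreachable; fall through to the next candidate.
            continue
        if data is not None:
            return url, data
    return None
```

The lagging JN is then no longer tied to the single JN named in the accept 
RPC; it is stuck only if no reachable JN has the segment.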

                
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3077-partial.txt, qjournal-design.pdf, 
> qjournal-design.pdf
>
>
> Currently, one of the weak points of the HA design is that it relies on 
> shared storage such as an NFS filer for the shared edit log. One alternative 
> that has been proposed is to depend on BookKeeper, a ZooKeeper subproject 
> which provides a highly available replicated edit log on commodity hardware. 
> This JIRA is to implement another alternative, based on a quorum commit 
> protocol, integrated more tightly in HDFS and with the requirements driven 
> only by HDFS's needs rather than more generic use cases. More details to 
> follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira