[ 
https://issues.apache.org/jira/browse/HDFS-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177866#comment-13177866
 ] 

Todd Lipcon commented on HDFS-2737:
-----------------------------------

A couple of options here:

1) Add a thread to the NN which rolls periodically (based on time or # txns)

This would be advantageous if we had some use cases for keeping edit log 
segments short even absent HA. The only case Aaron and I could brainstorm would 
be for backups, where it's a little easier to back up a finalized file compared 
to a rolling one. But we can satisfy this easily by adding a command line tool 
to trigger a roll, which a backup script can use. So it's not super compelling.

2) Add a new thread to the SBN which makes an IPC to the active and asks it to 
roll periodically

Advantage here is simplicity.

3) Add some code to the EditLogTailer thread in the SBN which makes a call to 
the active NN to trigger a roll when necessary (e.g., when the 
PendingDatanodeMessage queue is too large, or it's been too long since it has 
read any edits).

Advantage here is that the real motivation for the rolls is the EditLogTailer 
itself. We want to keep lag low (for fast recovery) and also keep the pending 
datanode queue small (to fit within memory bounds). By putting the trigger 
here, we can directly inspect those two variables, and trigger rolls when 
necessary.
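
To make the trigger condition in option 3 concrete, here is a minimal Java sketch. It is not actual HDFS code: the class RollTriggerSketch, the ActiveNameNode interface, its rollEditLog() method, and both thresholds are hypothetical names invented for illustration. It assumes the tailer loop can pass in the current pending datanode message count and a timestamp on each iteration.

```java
/** Hypothetical sketch of the option-3 roll trigger; not actual HDFS code. */
class RollTriggerSketch {

    /** Hypothetical stand-in for the IPC that asks the active NN to roll. */
    interface ActiveNameNode {
        void rollEditLog();
    }

    private final ActiveNameNode active;
    private final int maxPendingMessages;      // assumed bound on queued datanode messages
    private final long maxMillisSinceLastRead; // assumed bound on tailing lag
    private long lastReadMillis;

    RollTriggerSketch(ActiveNameNode active, int maxPendingMessages,
                      long maxMillisSinceLastRead, long nowMillis) {
        this.active = active;
        this.maxPendingMessages = maxPendingMessages;
        this.maxMillisSinceLastRead = maxMillisSinceLastRead;
        this.lastReadMillis = nowMillis;
    }

    /** The tailer calls this whenever it has successfully read edits. */
    void onEditsRead(long nowMillis) {
        lastReadMillis = nowMillis;
    }

    /**
     * Called once per tailer loop iteration. Asks the active NN to roll
     * if either variable is out of bounds; returns true if it did so.
     */
    boolean maybeTriggerRoll(int pendingMessages, long nowMillis) {
        boolean queueTooLarge = pendingMessages > maxPendingMessages;
        boolean tooStale = nowMillis - lastReadMillis > maxMillisSinceLastRead;
        if (queueTooLarge || tooStale) {
            active.rollEditLog();
            return true;
        }
        return false;
    }
}
```

Note that this sketch keeps requesting a roll on every iteration until new edits are read; a real implementation would presumably want to rate-limit the requests.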

So I'm thinking option 3 is the best.
                
> HA: Automatically trigger log rolls periodically on the active NN
> -----------------------------------------------------------------
>
>                 Key: HDFS-2737
>                 URL: https://issues.apache.org/jira/browse/HDFS-2737
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>
> Currently, the edit log tailing process can only read finalized log segments. 
> So, if the active NN is not rolling its logs periodically, the SBN will lag a 
> lot. This also causes many datanode messages to be queued up in the 
> PendingDatanodeMessage structure.
> To combat this, the active NN needs to roll its logs periodically -- perhaps 
> based on a time threshold, or perhaps based on a number of transactions. I'm 
> not sure yet whether it's better to have the NN roll on its own or to have 
> the SBN ask the active NN to roll its logs.


        
