[ 
https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16030481#comment-16030481
 ] 

Allan Yang edited comment on HBASE-18132 at 5/31/17 1:35 AM:
-------------------------------------------------------------

{quote}
How is the default value of 30 seconds determined ?
{quote}
The exact value doesn't matter much; the only requirement is that the interval 
between low-replication checks is shorter than the interval between datanode 
restarts. In our case, we restart DNs one minute apart during a rolling 
upgrade, so we set the check interval to 30 seconds.
Thanks for your advice, [~tedyu]. I will modify the patch and upload a patch 
for master later.
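
To make the requirement concrete, here is a rough sketch of the idea. It is not the code in the attached patch: the class name, the method names, and the replica-count supplier are all made up for illustration, and the real check would ask HDFS about the WAL's write pipeline.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

/** Hypothetical sketch only; not the class from HBASE-18132-branch-1.patch. */
class LowReplicationChecker {
  // Stand-in for asking HDFS how many replicas the current WAL has.
  private final IntSupplier currentReplicas;
  private final int minReplicas;
  // Stand-in for requesting a log roll.
  private final Runnable rollWriter;
  private final AtomicInteger rollsRequested = new AtomicInteger();

  LowReplicationChecker(IntSupplier currentReplicas, int minReplicas,
      Runnable rollWriter) {
    this.currentReplicas = currentReplicas;
    this.minReplicas = minReplicas;
    this.rollWriter = rollWriter;
  }

  /** One check, independent of any write or sync on the WAL. */
  boolean checkOnce() {
    if (currentReplicas.getAsInt() < minReplicas) {
      rollsRequested.incrementAndGet();
      rollWriter.run();
      return true;
    }
    return false;
  }

  int getRollsRequested() {
    return rollsRequested.get();
  }

  /** The check period must be shorter than the gap between two DN restarts. */
  static boolean isSafe(long checkPeriodMs, long dnRestartIntervalMs) {
    return checkPeriodMs > 0 && checkPeriodMs < dnRestartIntervalMs;
  }

  /** Runs checkOnce() every checkPeriodMs; the caller shuts the pool down. */
  ScheduledExecutorService start(long checkPeriodMs) {
    ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
    pool.scheduleAtFixedRate(this::checkOnce, checkPeriodMs, checkPeriodMs,
        TimeUnit.MILLISECONDS);
    return pool;
  }
}
```

With our numbers, isSafe(30_000, 60_000) holds, so at least one check should run between two consecutive DN restarts and an idle WAL gets rolled before it loses its next replica.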


> Low replication should be checked in period in case of datanode rolling 
> upgrade
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-18132
>                 URL: https://issues.apache.org/jira/browse/HBASE-18132
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.4.0, 1.1.10
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>         Attachments: HBASE-18132-branch-1.patch
>
>
> Currently, we only check for low replication of WALs when there is a sync 
> operation (HBASE-2234), rolling the log if the WAL has fewer replicas than 
> configured. But if the WAL receives very few writes or none at all, low 
> replication will not be detected and the log will not be rolled. 
> This is a problem during a rolling upgrade of datanodes: all datanodes holding 
> replicas of a WAL with no writes get restarted, leaving the WAL file in an 
> abnormal state. Later attempts to open this file will always fail.
> I am proposing a patch that checks low replication of WALs at a configured 
> period. During a rolling upgrade of datanodes, as long as the restart interval 
> between two nodes is longer than the low-replication check period, the WAL 
> will be closed and rolled normally. A UT in the patch demonstrates this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
