[ 
https://issues.apache.org/jira/browse/HBASE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546236#comment-13546236
 ] 

stack commented on HBASE-7407:
------------------------------

[~nkeywal] Can you add the following succinct comment to the TM code in a 
prominent place the next time you make a patch related to it? I think it will 
help others who come later messing in this area.

"The TM is an expensive feature, we should not need it on a standard workflow 
(and failures are standard  (but its function of 'last chance checker' is 
useful: it's a safety net, but it should not be anything else)."
                
> TestMasterFailover under tests some cases and over tests some others
> --------------------------------------------------------------------
>
>                 Key: HBASE-7407
>                 URL: https://issues.apache.org/jira/browse/HBASE-7407
>             Project: HBase
>          Issue Type: Bug
>          Components: master, Region Assignment, test
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>         Attachments: 7407.v1.patch, 7407.v2.patch, 7407.v3.patch
>
>
> The tests are done with this settings:
>     conf.setInt("hbase.master.assignment.timeoutmonitor.period", 2000);
>     conf.setInt("hbase.master.assignment.timeoutmonitor.timeout", 4000);
> As a results:
> 1) some tests seems to work, but in real life, the recovery would take 5 
> minutes or more, as in production there always higher. So we don't see the 
> real issues.
> 2) The tests include specific cases that should not happen in production. It 
> works because the timeout catches everything, but these scenarios do not need 
> to be optimized, as they cannot happen. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to