[
https://issues.apache.org/jira/browse/HBASE-9703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785629#comment-13785629
]
Sergey Shelukhin commented on HBASE-9703:
-----------------------------------------
Does &= short-circuit? If it does, the remaining methods might not be called after one failure.
Spacing is wrong in places.
Otherwise looks reasonable.
Can you post RB?
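
On the &= question above: in Java, `a &= f()` is evaluated as `a = a & f()`, and `&` on booleans always evaluates both operands, so it does not short-circuit. The sketch below is only an illustration, not code from the patch; the class name, the `step` helper, and the step names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class CompoundAndDemo {
    // Records which (hypothetical) restore steps actually ran.
    static final List<String> calls = new ArrayList<>();

    // Hypothetical restore step: records its name and reports success or failure.
    static boolean step(String name, boolean ok) {
        calls.add(name);
        return ok;
    }

    // Uses &=: every step runs, even after an earlier one has failed.
    static boolean restoreAll() {
        boolean success = true;
        success &= step("masters", false);
        success &= step("regionservers", true);
        return success;
    }

    // Uses &&: once success is false, the remaining steps are skipped.
    static boolean restoreShortCircuit() {
        boolean success = true;
        success = success && step("masters", false);
        success = success && step("regionservers", true);
        return success;
    }

    public static void main(String[] args) {
        System.out.println(restoreAll() + " " + calls);          // false [masters, regionservers]
        calls.clear();
        System.out.println(restoreShortCircuit() + " " + calls); // false [masters]
    }
}
```

So a chain of `success &= ...` calls still invokes every method after one failure; only `&&` (or `success = success && ...`) would skip the rest.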
> DistributedHBaseCluster should not throw exceptions, but do a best effort
> restore
> ---------------------------------------------------------------------------------
>
> Key: HBASE-9703
> URL: https://issues.apache.org/jira/browse/HBASE-9703
> Project: HBase
> Issue Type: Improvement
> Reporter: Enis Soztutar
> Assignee: Enis Soztutar
> Fix For: 0.98.0, 0.96.1
>
> Attachments: hbase-9703_v1.patch
>
>
> At the end of integration tests, we call
> DistributedCluster.restoreCluster() in case CM has killed nodes, so that we
> leave the cluster in the same state in which we took it over.
> However, if CM is not used in a test (for example ITLoadAndVerify), but some
> region servers die, or an external daemon kills the servers, we will still
> try to restore at the end of the test, which may or may not succeed (depending
> on configuration, the region server being inaccessible, etc.).
> We can do one of two things: either do a best-effort cluster restore, which
> will not fail the test if there are any errors, or skip running the restore
> if no disruptive actions have taken place.
> I am leaning towards the former one, since if an RS goes down with or w/o CM
> due to bad disk etc., we cannot restore the cluster, but we should not fail
> the test in this case.
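
A minimal sketch of the best-effort option described above, assuming a hypothetical `RestoreStep` interface and `restore` helper (neither is taken from the attached patch or from the HBase API): each step is attempted, failures are recorded as warnings instead of being thrown, and the caller gets a boolean it can log rather than a failed test.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class BestEffortRestore {
    // Hypothetical unit of restore work, e.g. "restart one region server".
    interface RestoreStep {
        void run() throws IOException;
    }

    // Runs every step; an IOException is recorded as a warning, not rethrown.
    // Returns true only if all steps succeeded.
    static boolean restore(List<RestoreStep> steps, List<String> warnings) {
        boolean success = true;
        for (RestoreStep step : steps) {
            try {
                step.run();
            } catch (IOException e) {
                warnings.add("Restore step failed: " + e.getMessage());
                success = false;
            }
        }
        return success;
    }

    public static void main(String[] args) {
        List<RestoreStep> steps = new ArrayList<>();
        List<String> warnings = new ArrayList<>();
        steps.add(() -> { throw new IOException("region server unreachable"); });
        steps.add(() -> System.out.println("second step still runs"));
        boolean ok = restore(steps, warnings);
        System.out.println("restored=" + ok + ", warnings=" + warnings.size());
    }
}
```

The test harness would then log the warnings and continue, so a region server lost to a bad disk does not turn an otherwise passing test into a failure.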