[ https://issues.apache.org/jira/browse/HBASE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964905#action_12964905 ]
HBase Review Board commented on HBASE-3282:
-------------------------------------------
Message from: "Jonathan Gray" <[email protected]>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1259/
-----------------------------------------------------------
(Updated 2010-11-29 11:43:07.682958)
Review request for hbase and stack.
Changes
-------
Makes DeadServers private. My TestRollingRestart test still accessed it, so I
had to make a small change to how that test works.
Also added tests in TestDeadServer that verify the new boolean check and the
max capacity both work as expected.
Summary
-------
We currently let go of dead servers once we finish their shutdown. We should
hang on to them longer to deal with things like network partitions.
I'm not a fan of SoftReferences so I decided on another approach. DeadServers
now has a maximum number of servers to hold on to in the set (default 100).
Once it reaches the max, it evicts the oldest.
More code than I had hoped but nothing too crazy.
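For anyone reading without the diff open, the capped set described above can be sketched roughly as follows. This is only an illustration of the evict-the-oldest idea, not the code in the patch: the class name BoundedDeadServerSet, its method names, and the use of plain strings as server names are all hypothetical.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;

/**
 * Hypothetical sketch of a size-capped dead-server set.
 * Not the DeadServer class from the patch.
 */
class BoundedDeadServerSet {
    private final int maxSize;
    // LinkedHashSet iterates in insertion order, so the first element is the oldest.
    private final LinkedHashSet<String> servers = new LinkedHashSet<>();

    BoundedDeadServerSet(int maxSize) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be >= 1");
        }
        this.maxSize = maxSize;
    }

    /** Records a dead server, evicting the oldest entry first if the set is full. */
    synchronized boolean add(String serverName) {
        if (servers.contains(serverName)) {
            return false; // already tracked; keep its original position
        }
        if (servers.size() >= maxSize) {
            Iterator<String> oldest = servers.iterator();
            oldest.next();
            oldest.remove();
        }
        return servers.add(serverName);
    }

    /** True if this server instance is still remembered as dead. */
    synchronized boolean isDeadServer(String serverName) {
        return servers.contains(serverName);
    }

    synchronized int size() {
        return servers.size();
    }
}
```

With a cap of 2, adding three servers drops the first one, so only the two most recent are still rejected on check-in. The trade-off is that a very old dead instance can eventually be forgotten, which the default of 100 is meant to make unlikely in practice.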
This addresses bug HBASE-3282.
http://issues.apache.org/jira/browse/HBASE-3282
Diffs (updated)
-----
  branches/0.90/src/main/java/org/apache/hadoop/hbase/master/DeadServer.java 1040242
  branches/0.90/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1040242
  branches/0.90/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 1040242
  branches/0.90/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 1040242
  branches/0.90/src/test/java/org/apache/hadoop/hbase/master/TestDeadServer.java 1040242
  branches/0.90/src/test/java/org/apache/hadoop/hbase/master/TestRollingRestart.java 1040242
Diff: http://review.cloudera.org/r/1259/diff
Testing
-------
Running unit tests now.
Thanks,
Jonathan
> Need to retain DeadServers to ensure we don't allow previously expired RS
> instances to rejoin cluster
> -----------------------------------------------------------------------------------------------------
>
> Key: HBASE-3282
> URL: https://issues.apache.org/jira/browse/HBASE-3282
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.0
> Reporter: Jonathan Gray
> Assignee: Jonathan Gray
> Fix For: 0.90.0, 0.92.0
>
>
> Currently we clear a server from the deadserver set once we finish processing
> its shutdown. However, certain circumstances (network partitions, race
> conditions) could lead to the RS not checking in until after the
> shutdown has been processed. As-is, this RS will then be let back into the
> cluster rather than rejected with YouAreDeadException.
> We should hang on to the dead servers so we always reject them.
> One concern is that the set will grow indefinitely. One recommendation by
> stack is to use SoftReferences.