[
https://issues.apache.org/jira/browse/HADOOP-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644407#action_12644407
]
Konstantin Shvachko commented on HADOOP-4563:
---------------------------------------------
A block on the name-node (FSNamesystem class) can belong to the following block
collections:
# blocksMap
# CorruptReplicasMap
# recentInvalidateSets
# excessReplicateMap
# neededReplications
# pendingReplications
Are there more lists out there?
The state of a block {{b}} in this respect is defined by a boolean vector
{{State(b) = <v1,v2,...,v6>}}, where {{vi}} indicates whether block {{b}}
belongs to the i-th collection above.
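As a rough illustration of the vector described above, the sketch below computes it by probing each collection in order. The class {{BlockStateVector}}, the method {{stateOf}}, and the use of plain {{Set<Long>}} collections keyed by block id are assumptions made for this example. The real name-node collections have their own types.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of FSNamesystem.
final class BlockStateVector {
    // Returns <v1,...,vN>: v[i] is true iff the block id is found in the
    // i-th collection. The caller is assumed to pass the collections in
    // the same order as the numbered list above.
    static boolean[] stateOf(long blockId, List<Set<Long>> collections) {
        boolean[] v = new boolean[collections.size()];
        for (int i = 0; i < v.length; i++) {
            v[i] = collections.get(i).contains(blockId);
        }
        return v;
    }
}
```

With six collections there are at most 64 possible vectors, so the set of legal states could be enumerated and checked explicitly.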
BlockMapStateMachine should define how the vector changes when we add or remove
a block, add or remove a replica, etc. E.g.
{code}
addBlock: State-cur(b) --> State-new(b)
...
removeReplica: State-cur(b) --> State-new(b)
{code}
The state machine should be implemented as a method in {{FSNamesystem}}. E.g.
{code}
State getNewBlockState(State current, Operation op);
{code}
We call this method whenever a block changes, and then update the
collections according to the returned boolean vector.
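A minimal sketch of how this could fit together follows. Everything in it is an assumption for illustration: the enum {{BlockCollection}}, the two {{Operation}} values, the class {{BlockStateMachineSketch}}, and the use of {{EnumSet<BlockCollection>}} in place of {{State}}. The two transition rules are placeholders and do not reflect the actual name-node behaviour.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical names, one constant per collection listed earlier.
enum BlockCollection {
    BLOCKS_MAP, CORRUPT_REPLICAS, RECENT_INVALIDATES,
    EXCESS_REPLICAS, NEEDED_REPLICATIONS, PENDING_REPLICATIONS
}

// Only two operations are sketched; a real machine would need more.
enum Operation { ADD_BLOCK, REMOVE_BLOCK }

final class BlockStateMachineSketch {
    // Stand-ins for the real collections: one set of block ids each.
    private final Map<BlockCollection, Set<Long>> collections =
        new EnumMap<>(BlockCollection.class);

    BlockStateMachineSketch() {
        for (BlockCollection c : BlockCollection.values()) {
            collections.put(c, new HashSet<>());
        }
    }

    // Pure transition function: State-cur(b) --> State-new(b).
    // The rules below are placeholders chosen for the example only.
    static EnumSet<BlockCollection> getNewBlockState(
            EnumSet<BlockCollection> current, Operation op) {
        EnumSet<BlockCollection> next = EnumSet.copyOf(current);
        switch (op) {
            case ADD_BLOCK:
                next.add(BlockCollection.BLOCKS_MAP);
                next.add(BlockCollection.NEEDED_REPLICATIONS);
                break;
            case REMOVE_BLOCK:
                next.clear();
                next.add(BlockCollection.RECENT_INVALIDATES);
                break;
        }
        return next;
    }

    // Reads the current vector for a block from the collections.
    EnumSet<BlockCollection> currentState(long blockId) {
        EnumSet<BlockCollection> s = EnumSet.noneOf(BlockCollection.class);
        for (BlockCollection c : BlockCollection.values()) {
            if (collections.get(c).contains(blockId)) {
                s.add(c);
            }
        }
        return s;
    }

    // Computes the new vector, then makes every collection agree with it.
    void apply(long blockId, Operation op) {
        EnumSet<BlockCollection> next =
            getNewBlockState(currentState(blockId), op);
        for (BlockCollection c : BlockCollection.values()) {
            if (next.contains(c)) {
                collections.get(c).add(blockId);
            } else {
                collections.get(c).remove(blockId);
            }
        }
    }
}
```

Keeping {{getNewBlockState}} free of side effects means the transition rules can be unit-tested without constructing any of the collections, while {{apply}} is the single place where the collections are mutated.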
> A State Machine for name-node blocks.
> -------------------------------------
>
> Key: HADOOP-4563
> URL: https://issues.apache.org/jira/browse/HADOOP-4563
> Project: Hadoop Core
> Issue Type: Improvement
> Components: dfs
> Reporter: Konstantin Shvachko
> Fix For: 0.20.0
>
>
> Blocks on the name-node can belong to different collections like the
> blocksMap, under-replicated, over-replicated lists, etc.
> It is getting more and more complicated to keep the lists consistent.
> It would be good to formalize the movement of the blocks between the
> collections using a state machine.