[
https://issues.apache.org/jira/browse/HBASE-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670104#comment-13670104
]
Devaraj Das commented on HBASE-8344:
------------------------------------
bq. if favoredNodes.size < 3, you will get an out of bounds exception.
In the method implementation where this code appears, there is a check just
above the block of code that makes sure favoredNodes.size is equal to 3 before
proceeding.
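As a rough illustration of the kind of guard described above (not the actual patch code), the sketch below only indexes into the list once its size has been checked. The class name, the helper method and the use of plain strings for server names are assumptions made for this example.

```java
import java.util.List;

class FavoredNodesGuard {
    // Assumed constant, mirroring the FAVORED_NODES_NUM discussed in this thread.
    static final int FAVORED_NODES_NUM = 3;

    // Hypothetical helper: reads positions 0..2 only after the size check,
    // so a shorter list can never cause an out-of-bounds access.
    static String describe(List<String> favoredNodes) {
        if (favoredNodes == null || favoredNodes.size() != FAVORED_NODES_NUM) {
            return "no favored nodes";
        }
        return "primary=" + favoredNodes.get(0)
            + " secondary=" + favoredNodes.get(1)
            + " tertiary=" + favoredNodes.get(2);
    }
}
```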
bq. FavoredNodes.favoredNodesMap is a ConcurrentHashMap, but
updateFavoredNodesMap() and getFavoredNodes() are also synchronized. Is this
intended?
Yeah, the methods need not be synchronized in the current code, since the
underlying data structure (a ConcurrentHashMap) is already thread-safe, but in
spirit I intended for the methods to be synchronized. If you feel strongly
about it, I can remove the synchronized keyword from the methods.
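To make the trade-off concrete, here is a minimal sketch of what the unsynchronized variant could look like. It is not the FavoredNodes class from the patch: the class name, the String key and the List<String> value type are assumptions for illustration. Single put and get calls on a ConcurrentHashMap are already thread-safe, so method-level locking would only matter if a method later performed a compound read-then-write.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class FavoredNodesSketch {
    // Hypothetical stand-in for favoredNodesMap: region name -> favored servers.
    private final ConcurrentMap<String, List<String>> favoredNodesMap =
        new ConcurrentHashMap<>();

    // No synchronized keyword: a single put is atomic on a ConcurrentHashMap.
    void updateFavoredNodesMap(String regionName, List<String> nodes) {
        // Store an unmodifiable copy so callers cannot mutate the shared value.
        favoredNodesMap.put(regionName,
            Collections.unmodifiableList(new ArrayList<>(nodes)));
    }

    // No synchronized keyword: a single get is safe; returns null if unknown.
    List<String> getFavoredNodes(String regionName) {
        return favoredNodesMap.get(regionName);
    }
}
```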
bq. FavoredNodeAssignmentHelper.FAVORED_NODES_NUM -> should this be read from
configuration default replication count for dfs? What if we are running with
replication 2 or 4, we would still have 3 favored nodes?
This is a TODO overall. Currently, yes, there is an assumption that the
replication is 3. If the replication is 2 or 4, things should still work (as
opposed to crashing) and will be status quo. For example, the master will
assume there is a tertiary regionserver for a region even with a replication of
2, but in reality the tertiary server is as good as any random regionserver
w.r.t. hosting the region's blocks.
In practice, we will mostly have a replication factor of 3 to have the failure
zones taken care of, so it should be okay.
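For reference, the TODO could eventually look something like the sketch below, which derives the favored-node count from the HDFS replication setting instead of a hard-coded 3. This is only an assumption about one possible shape, not what the patch does. To keep the example self-contained, java.util.Properties stands in for the Hadoop Configuration object; "dfs.replication" is the standard HDFS key, while the class and method names are made up for this example.

```java
import java.util.Properties;

class FavoredNodesConfigSketch {
    // Standard HDFS replication key; the fallback of 3 matches today's assumption.
    static final String DFS_REPLICATION_KEY = "dfs.replication";
    static final int DEFAULT_REPLICATION = 3;

    // Hypothetical helper: returns the configured replication, or 3 when the
    // key is missing, not a number, or not positive.
    static int favoredNodesNum(Properties conf) {
        String value = conf.getProperty(DFS_REPLICATION_KEY);
        if (value == null) {
            return DEFAULT_REPLICATION;
        }
        try {
            int replication = Integer.parseInt(value.trim());
            return replication > 0 ? replication : DEFAULT_REPLICATION;
        } catch (NumberFormatException e) {
            return DEFAULT_REPLICATION;
        }
    }
}
```

With a real Hadoop Configuration, the equivalent lookup would presumably be a getInt call on the same key with a default of 3.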
bq. FavoredNodeAssignmentManager.initialize() looks expensive. Do you need to
call this for every roundRobin / random assignment? It seems that this info can
only change if RS are going down / up.
I assume you meant FavoredNodeAssignmentHelper.initialize. Hmm. That's how it
is in the current codebase, and there are some data structures that are valid
per assignment (like uniqueRackList). Given all the other work we do in
assignments (including ZK), this is probably going to be noise. But sure, I can
look at this issue in a follow-up.
> Improve the assignment when node failures happen to choose the secondary RS
> as the new primary RS
> -------------------------------------------------------------------------------------------------
>
> Key: HBASE-8344
> URL: https://issues.apache.org/jira/browse/HBASE-8344
> Project: HBase
> Issue Type: Sub-task
> Reporter: Devaraj Das
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.95.2
>
> Attachments: hbase-8344-1.txt, hbase-8344-2.1.txt, hbase-8344-2.2.txt
>
>