[
https://issues.apache.org/jira/browse/HBASE-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273912#comment-13273912
]
Zhihong Yu commented on HBASE-5959:
-----------------------------------
For a patch of this size, review board should be used.
Below is part 1 of my review.
It is necessary to see the benefit of stochastic balancer in a real cluster
compared with existing balancer.
TestRegionRebalancing is an important test which currently exercises default
balancer. It would be nice to apply new balancer there - or create a similar
test class.
{code}
+ * {@link AssignmentManager} to assign regions in the edge cases. It doesn't
{code}
Are edge cases the only ones covered by BaseLoadBalancer ?
{code}
+public class ClusterLoadState {
{code}
Please add javadoc for the above class.
{code}
+ private final NavigableMap<ServerAndLoad, List<HRegionInfo>> serversByLoad;
{code}
The above map doesn't contain region load information. Can region load be added
?
{code}
+ protected List<ServerName> getInternalTopBlockLocations( HRegionInfo region)
{
{code}
A better name for the above method would be internalGetTopBlockLocations().
Remove the space after '('.
{code}
+ * @param fs
+ * the filesystem
{code}
There is no need to put parameter explanation on separate line.
There're two almost identical getTableDescriptor() methods in master package:
{code}
private HTableDescriptor getTableDescriptor(byte[] tableName)
src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
private HTableDescriptor getTableDescriptor(byte[] tableName)
src/main/java/org/apache/hadoop/hbase/master/DefaultLoadBalancer.java
{code}
It would be nice to put it in a utility class.
For StochasticLoadBalancer:
{code}
+ * randomly try and mutate up to the cluster to Cprime. If F(Cprime) < F(C)
then
{code}
Remove 'up to' above.
{code}
+ public void setMasterServices(MasterServices masterServices) {
+ this.services = masterServices;
+ }
{code}
RegionLocationFinder has MasterServices member which should be set in the above
method.
{code}
+ * should always approach
{code}
Please finish the above sentence for balanceCluster().
{code}
+ // No need to balance one cluster.
+ if (clusterState.size() <= 1) {
{code}
The above check should be placed at the beginning of the method. 'one cluster'
-> 'cluster with one server'
{code}
+ // Perform a stocastic walk to see if we can get a good fit.
{code}
'stocastic' -> 'stochastic'
{code}
+ HRegionInfo lRegion = pickRandmoRegion(leftRegionList, 0);
+ HRegionInfo rRegion = pickRandmoRegion(rightRegionList, 0.5);
{code}
Please explain why selection of move is not symmetrical.
{code}
+ boolean keep = RANDOM.nextFloat() < (0.01 * (1 - scale(0, maxSteps,
stepNum)));
+ return keep;
{code}
A single line should be enough - return directly.
{code}
+ * @return
+ */
+ private Map<HRegionInfo, ServerName> createRegionMapping(
{code}
Please add javadoc for return value.
{code}
+ double regionCountSkewCost = 100 * computeSkewLoadCost(clusterState);
+ double tableSkewCost = 50 * computeTableSkewLoadCost(clusterState);
+ double localityCost = 40 * computeDataLocalityCost(clusterState);
{code}
May I ask how the above coefficients were determined ? Should user be able to
change them through config parameters ?
For computeSkewLoadCost():
{code}
+ double max = (mean * (clusterState.size() - 1)) + (Math.abs(mean -
numRegions));
{code}
numRegions is total number of regions, meaning it is greater than mean. The
Math.abs() isn't necessary above.
May I ask how the above max is derived ?
> Add other load balancers
> ------------------------
>
> Key: HBASE-5959
> URL: https://issues.apache.org/jira/browse/HBASE-5959
> Project: HBase
> Issue Type: New Feature
> Components: master
> Affects Versions: 0.96.0
> Reporter: Elliott Clark
> Assignee: Elliott Clark
> Attachments: HBASE-5959-0.patch, HBASE-5959-1.patch,
> HBASE-5959-2.patch, HBASE-5959-3.patch
>
>
> Now that balancers are pluggable we should give some options.b
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira