[ 
https://issues.apache.org/jira/browse/HBASE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei Chen updated HBASE-13965:
-----------------------------
    Attachment: HBASE-13965-branch-1-v2.patch

Updates:
1. wrapped a long line (> 100)

The failed test from last patch seems not related. Here is the log:

testWalRollOnLowReplication(org.apache.hadoop.hbase.master.procedure.TestWALProcedureStoreOnHDFS)
  Time elapsed: 3.804 sec  <<< ERROR!
java.lang.RuntimeException: sync aborted
        at 
org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:491)
        at 
org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.insert(WALProcedureStore.java:334)
        at 
org.apache.hadoop.hbase.master.procedure.TestWALProcedureStoreOnHDFS.testWalRollOnLowReplication(TestWALProcedureStoreOnHDFS.java:189)
Caused by: org.apache.hadoop.ipc.RemoteException: File 
/test-logs/state-00000000000000000006.log could only be replicated to 2 nodes 
instead of minReplication (=3).  There are 3 datanode(s) running and 3 node(s) 
are excluded in this operation.
        at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1471)
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2791)
        at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:606)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:455)
        at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

        at org.apache.hadoop.ipc.Client.call(Client.java:1411)
        at org.apache.hadoop.ipc.Client.call(Client.java:1364)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
        at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
        at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
        at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)

> Stochastic Load Balancer JMX Metrics
> ------------------------------------
>
>                 Key: HBASE-13965
>                 URL: https://issues.apache.org/jira/browse/HBASE-13965
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer, metrics
>            Reporter: Lei Chen
>            Assignee: Lei Chen
>             Fix For: 2.0.0
>
>         Attachments: 13965-addendum.txt, HBASE-13965-branch-1-v2.patch, 
> HBASE-13965-branch-1.patch, HBASE-13965-v10.patch, HBASE-13965-v11.patch, 
> HBASE-13965-v3.patch, HBASE-13965-v4.patch, HBASE-13965-v5.patch, 
> HBASE-13965-v6.patch, HBASE-13965-v7.patch, HBASE-13965-v8.patch, 
> HBASE-13965-v9.patch, HBASE-13965_v2.patch, HBase-13965-JConsole.png, 
> HBase-13965-v1.patch, stochasticloadbalancerclasses_v2.png
>
>
> Today’s default HBase load balancer (the Stochastic load balancer) is cost 
> function based. The cost function weights are tunable but no visibility into 
> those cost function results is directly provided.
> A driving example is a cluster we have been tuning which has skewed rack size 
> (one rack has half the nodes of the other few racks). We are tuning the 
> cluster for uniform response time from all region servers with the ability to 
> tolerate a rack failure. Balancing LocalityCost, RegionReplicaRack Cost and 
> RegionCountSkew Cost is difficult without a way to attribute each cost 
> function’s contribution to overall cost. 
> What this jira proposes is to provide visibility via JMX into each cost 
> function of the stochastic load balancer, as well as the overall cost of the 
> balancing plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to