I have 2 clusters (1 master and 1 slave) on CDH 5.4 with HBase 1.0.
Replication works about 95% of the time,
but I do get the following WARN, which I consider an error:
Can't replicate because of an error on the remote cluster:
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException): org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 11 actions: NotServingRegionException: 11 times,
    at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:227)
    at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1700(AsyncProcess.java:207)
    at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.getErrors(AsyncProcess.java:1563)
    at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:1003)
    at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:1017)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSink.batch(ReplicationSink.java:236)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSink.replicateEntries(ReplicationSink.java:160)
    at org.apache.hadoop.hbase.replication.regionserver.Replication.replicateLogEntries(Replication.java:198)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.replicateWALEntry(RSRpcServices.java:1584)
    at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:20880)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
    at java.lang.Thread.run(Thread.java:745)
I consider this an error because my slave is missing data that is present in
the master. Is there a setting in HBase to keep retrying the send?
Cloudera Manager does try to restart the region and alerts me if it dies
for some reason. Why it dies is a different problem that I am still
looking into. But when the slave comes back, I expect the
unconfirmed records to be resent.
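
To make that expectation concrete, here is a minimal sketch in plain Python of the behaviour I am hoping for. It is not HBase code: SinkUnavailable, FlakySink, ship_until_acknowledged and the backoff parameters are all hypothetical names made up for illustration. The point is only that the same batch is retried, with a capped backoff, until the slave acknowledges it.

```python
import time
from typing import Callable, List, Sequence


class SinkUnavailable(Exception):
    """Hypothetical stand-in for a failure such as NotServingRegionException."""


def ship_until_acknowledged(
    batch: Sequence[str],
    send: Callable[[Sequence[str]], None],
    base_sleep_s: float = 1.0,
    max_multiplier: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry the same batch until send() succeeds; return the attempt count."""
    attempt = 0
    while True:
        attempt += 1
        try:
            send(batch)
            return attempt
        except SinkUnavailable:
            # Back off linearly (capped), then resend the SAME batch.
            sleep(base_sleep_s * min(attempt, max_multiplier))


class FlakySink:
    """Hypothetical slave that rejects the first `failures` batches."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.received: List[str] = []

    def apply(self, batch: Sequence[str]) -> None:
        if self.failures > 0:
            self.failures -= 1
            raise SinkUnavailable("region not online")
        self.received.extend(batch)
```

In this sketch nothing is dropped while the slave is unavailable; the batch is simply held and resent once the slave accepts writes again. That is the outcome I would like to confirm or configure in HBase replication.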
Any best practices would be helpful as well.
All ZooKeeper nodes in the slave cluster are listed as peers.
--
Abraham Tom
Email: [email protected]
Phone: 415-515-3621