[
https://issues.apache.org/jira/browse/HBASE-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733683#comment-15733683
]
Sean Busbey commented on HBASE-17276:
-------------------------------------
{code}
- LOG.warn("Batch Mutation did not pass sanity check", fsce);
+ final String msg = "Batch Mutation did not pass sanity check. ";
+ if (observedExceptions.hasSeenFailedSanityCheck()) {
+ LOG.warn(msg + " " + fsce.getMessage());
+ } else {
+ LOG.warn(msg, fsce);
+ observedExceptions.sawFailedSanityCheck();
+ }
{code}
nit: unlike the other messages, this one doubles the space: once at the end of
{{msg}} and again in the string concatenation in the {{LOG.warn}} call.
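One possible way to address the nit, sketched on the assumption that the separator should live in exactly one place: drop the trailing space from the message constant and let the concatenation supply it. The class and method names below are hypothetical and not part of the patch.

```java
public class SanityCheckMessage {
    // No trailing space here; the separator is added once, at the concat site.
    public static final String MSG = "Batch Mutation did not pass sanity check.";

    // Hypothetical helper: builds the short form used after the first occurrence.
    public static String shortForm(String exceptionMessage) {
        return MSG + " " + exceptionMessage;
    }

    public static void main(String[] args) {
        System.out.println(shortForm("Requested row out of range"));
    }
}
```

The alternative is to keep the trailing space in {{msg}} and concatenate without a separator; either works as long as only one of the two sites adds the space.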
> Reduce log spam from WrongRegionException in large multi()'s
> ------------------------------------------------------------
>
> Key: HBASE-17276
> URL: https://issues.apache.org/jira/browse/HBASE-17276
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: Josh Elser
> Assignee: Josh Elser
> Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-17276.001.patch
>
>
> The following spam drives me up a wall in the regionserver log:
> {noformat}
> 2016-12-05 05:53:05,085 WARN
> [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16020]
> regionserver.HRegion: Batch mutation had a row that does not belong to this
> region
> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out
> of range for doMiniBatchMutation on HRegion
> IntegrationTestReplicationSinkRestart,L\xCC\xCC\xCC\xCC\xCC\xCC\xC8,1480916713541.caab3310166699287b54b72b35b29431.,
> startKey='L\xCC\xCC\xCC\xCC\xCC\xCC\xC8',
> getEndKey()='Y\x99\x99\x99\x99\x99\x99\x94',
> row='\x0C\xD2\xA5\xA3\x99\xC7\xE0Q!\x15^\xA6\x90\x1E\xA3\xAD'
> at
> org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:5211)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.checkAndPrepareMutation(HRegion.java:3879)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3040)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2933)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2875)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:717)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:679)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2056)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32303)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2141)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> 2016-12-05 05:53:05,086 WARN
> [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16020]
> regionserver.HRegion: Batch mutation had a row that does not belong to this
> region
> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out
> of range for doMiniBatchMutation on HRegion
> IntegrationTestReplicationSinkRestart,L\xCC\xCC\xCC\xCC\xCC\xCC\xC8,1480916713541.caab3310166699287b54b72b35b29431.,
> startKey='L\xCC\xCC\xCC\xCC\xCC\xCC\xC8',
> getEndKey()='Y\x99\x99\x99\x99\x99\x99\x94',
> row='\x0E\xE7\xFA[\x8D\x93;\xF4\xC7F\xF9\x85\x84\x85\xF3\x0E'
> at
> org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:5211)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.checkAndPrepareMutation(HRegion.java:3879)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3040)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2933)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2875)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:717)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:679)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2056)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32303)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2141)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> 2016-12-05 05:53:05,087 WARN
> [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16020]
> regionserver.HRegion: Batch mutation had a row that does not belong to this
> region
> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out
> of range for doMiniBatchMutation on HRegion
> IntegrationTestReplicationSinkRestart,L\xCC\xCC\xCC\xCC\xCC\xCC\xC8,1480916713541.caab3310166699287b54b72b35b29431.,
> startKey='L\xCC\xCC\xCC\xCC\xCC\xCC\xC8',
> getEndKey()='Y\x99\x99\x99\x99\x99\x99\x94',
> row='\x16-\xFC\x99\xF5c\x08\xFA\x1D\x84\x86\xD2\x18\xB1\x03q'
> at
> org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:5211)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.checkAndPrepareMutation(HRegion.java:3879)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3040)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2933)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2875)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:717)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:679)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2056)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32303)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2141)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> {noformat}
> With enough replication traffic that is delayed or just slow, you can have
> a 64MB batch of updates to a Region whose rows are all hosted on a different
> RegionServer by the time the RS processes it.
> In a run of IntegrationTestReplication that was particularly
> "slow"/oversaturated, I saw 1.591M log lines taken up by this message out
> of 1.597M lines in total (99.6% of the log). I propose that
> after the first WrongRegionException we see in {{doMiniBatchMutation}}, we
> stop printing the stacktrace for subsequent occurrences (saving 13 lines for
> every occurrence).
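The proposal in the description can be sketched as a small self-contained example: log the full stack trace for the first exception, then only the exception message for later ones. This is an illustration under stated assumptions, not the patch itself. The class `FirstOccurrenceLogger` is hypothetical, it collects lines in a list instead of calling a real logger, and its boolean flag stands in for the `observedExceptions` tracker seen in the diff.

```java
import java.util.ArrayList;
import java.util.List;

public class FirstOccurrenceLogger {
    // Hypothetical stand-in for the per-batch tracker (observedExceptions in the diff).
    private boolean seenFailure = false;
    // Collected output lines; a real implementation would call the logger instead.
    private final List<String> lines = new ArrayList<>();

    public void warn(String msg, Exception e) {
        if (seenFailure) {
            // Later occurrences: one line, message only.
            lines.add("WARN " + msg + " " + e.getMessage());
        } else {
            // First occurrence: message, exception, and every stack frame.
            lines.add("WARN " + msg);
            lines.add(e.toString());
            for (StackTraceElement frame : e.getStackTrace()) {
                lines.add("\tat " + frame);
            }
            seenFailure = true;
        }
    }

    public List<String> lines() {
        return lines;
    }

    public static void main(String[] args) {
        FirstOccurrenceLogger log = new FirstOccurrenceLogger();
        for (int i = 0; i < 3; i++) {
            log.warn("Batch mutation had a row that does not belong to this region",
                new IllegalStateException("row " + i + " out of range"));
        }
        log.lines().forEach(System.out::println);
    }
}
```

The flag here lives for the lifetime of the object; how long the real tracker should remember a failure (per batch, per region, or longer) is a design choice this sketch does not settle.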