Hello,

Unfortunately I don’t have good guidance on what to tune this to. What I
can say is that this feature will be disabled by default starting with
version 2.5.0, partly because we determined the current default is too
aggressive but didn’t yet have good guidance on a better one.

So I would recommend disabling this feature by setting
hbase.region.store.parallel.put.limit to 0 (zero) in your hbase-site.xml.
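
A minimal sketch of what that looks like in hbase-site.xml (adjust to your
deployment; a rolling restart of the region servers is needed for it to take
effect):

```xml
<!-- Disable the parallel-put limit entirely (0 = unlimited). -->
<property>
  <name>hbase.region.store.parallel.put.limit</name>
  <value>0</value>
</property>
```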

The idea behind the feature is good, so if you’d prefer to leave it enabled
I’d recommend doing some load testing based on your use case and hardware
to determine a value that works for you. The general idea is that it tries
to avoid painful write contention by limiting the number of parallel write
operations to a single region at a time, but how many parallel writers you
can withstand is hardware-dependent.
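
If you keep it enabled, the tuning is the same property, just with a higher
limit. The value below is an arbitrary example, not a recommendation; derive
your own from load testing:

```xml
<property>
  <name>hbase.region.store.parallel.put.limit</name>
  <!-- Example only: default is 10; raise based on observed write contention. -->
  <value>50</value>
</property>
```

If I remember correctly there are related knobs on the same check (e.g. a
minimum column count below which the limit is not applied), but please verify
those property names against the StoreHotnessProtector source for your HBase
version before relying on them.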

On Wed, Mar 23, 2022 at 6:02 AM Hamado Dene <hamadod...@yahoo.com.invalid>
wrote:

> Hello Hbase Community,
> On our production environment we are experiencing several Exception such
> as:
> 2022-03-23 10:52:38,843 INFO  [AsyncFSWAL-0-hdfs://hadoopcluster/hbase]
> wal.AbstractFSWAL: Slow sync cost: 120 ms, current pipeline:
> [DatanodeInfoWithStorage[10.211.3.11:50010,DS-b8181e87-2f63-47d5-a9f2-4d9ca8216d93,DISK],
> DatanodeInfoWithStorage[10.211.3.12:50010,DS-f32aa630-e63c-4aee-a77b-a04128edee31,DISK]]
>
> 2022-03-23 10:54:15,631 WARN  [hconnection-0x63191a3a-shared-pool6-t322]
> client.AsyncRequestFutureImpl: b7b2a5f50bdaa3794d185ce.:n Above
> parallelPutToStoreThreadLimit(10)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1083)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:986)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:951)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2783)
>         at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42290)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:418)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> on acv-db16-hd.diennea.lan,16020,1648028001827, tracking started Wed Mar 23
> 10:54:12 CET 2022; NOT retrying, failed=6 -- final attempt!
>
> 2022-03-23 10:54:15,632 ERROR
> [RpcServer.replication.FPBQ.Fifo.handler=2,queue=0,port=16020]
> regionserver.ReplicationSink: Unable to accept edit because:
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
> Failed 6 actions: org.apache.hadoop.hbase.RegionTooBusyException:
> StoreTooBusy,mn1_5276_huserlog,,1647637376109.27fd761a2b7b2a5f50bdaa3794d185ce.:n
> Above parallelPutToStoreThreadLimit(10)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1083)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:986)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:951)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2783)
>         at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42290)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:418)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318):
> 6 times, servers with issues: acv-db16-hd,16020,1648028001827
>         at org.apache.hadoop.hbase.client.BatchErrors.makeException(BatchErrors.java:54)
>         at org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.getErrors(AsyncRequestFutureImpl.java:1204)
>         at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:453)
>         at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:436)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSink.batch(ReplicationSink.java:421)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSink.replicateEntries(ReplicationSink.java:251)
>         at org.apache.hadoop.hbase.replication.regionserver.Replication.replicateLogEntries(Replication.java:178)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.replicateWALEntry(RSRpcServices.java:2311)
>         at org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:29752)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:418)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> What is the best way to manage this issue? I saw on the net the
> possibility of increasing the property hbase.region.store.parallel.put.limit,
> but I don't find any reference to it in the HBase documentation. Is this
> property still valid? Can it be set at the level of hdfs-site.xml?
> Thanks,
>
> Hamado Dene
>
