jiang he created HBASE-30248:
--------------------------------
Summary: Negative hbase.snapshot.region.timeout causes
RegionServer abort during snapshot procedure initialization
Key: HBASE-30248
URL: https://issues.apache.org/jira/browse/HBASE-30248
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 2.5.14
Environment:
- HBase version: 2.5.14
- Java: OpenJDK 11
- Mode: local standalone HBase
- OS: Linux / WSL2
Reporter: jiang he
When `hbase.snapshot.region.timeout` is configured as a negative value, the
RegionServer aborts during initialization.
The negative value is passed as the keep-alive time to a
`ThreadPoolExecutor`, which throws `IllegalArgumentException`.
## Reproducer
Use the following `hbase-site.xml` entries:
```xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>/tmp/hbase-tmp</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>false</value>
  </property>
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
  </property>
  <property>
    <name>hbase.snapshot.region.timeout</name>
    <value>-300000</value>
  </property>
</configuration>
```
Then start HBase.
## Actual result
The RegionServer aborts:
ERROR [RS:0;...] regionserver.HRegionServer: ***** ABORTING region server ... Initialization of RS failed. Hence aborting RS. *****
java.lang.IllegalArgumentException
    at java.base/java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1293)
    at java.base/java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1215)
    at org.apache.hadoop.hbase.procedure.ProcedureMember.defaultPool(ProcedureMember.java:86)
    at org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager.initialize(RegionServerSnapshotManager.java:391)
## Root cause
`hbase.snapshot.region.timeout` is read in
`RegionServerSnapshotManager.initialize` and passed to
`ProcedureMember.defaultPool` as the keep-alive time for the snapshot
procedure member thread pool.
The `ThreadPoolExecutor` constructor rejects a negative keep-alive time with
`IllegalArgumentException`.
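The JDK behavior behind the abort can be checked in isolation. The following is a minimal sketch that does not use any HBase code; the class name `NegativeKeepAliveDemo` and the helper `rejects` are hypothetical, and the pool sizes are arbitrary values chosen only to make the constructor call valid apart from the keep-alive time.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class NegativeKeepAliveDemo {

    /** Returns true if the ThreadPoolExecutor constructor rejects the given keep-alive time. */
    static boolean rejects(long keepAliveMillis) {
        try {
            // Pool sizes of 1/1 are arbitrary; only keepAliveMillis is under test.
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, keepAliveMillis, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
            pool.shutdown();
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown by the constructor when keepAliveTime < 0.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("keepAlive=-300000 rejected: " + rejects(-300000L)); // true
        System.out.println("keepAlive=300000 rejected: " + rejects(300000L));   // false
    }
}
```

The exception carries no message, which is why the RegionServer log shows a bare `java.lang.IllegalArgumentException` with no hint about which configuration key is at fault.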
## Expected result
HBase should validate `hbase.snapshot.region.timeout` and either reject
negative values with a clear configuration error or fall back to a safe
default.
The RegionServer should not abort with a raw `IllegalArgumentException`.
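Either option could look roughly like the sketch below. This is not a proposed patch and not the HBase API: the class `SnapshotTimeoutValidation`, both method names, and the use of a plain `Map` in place of the Hadoop `Configuration` object are hypothetical, and the `300000` ms default is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotTimeoutValidation {
    static final String KEY = "hbase.snapshot.region.timeout";
    // Assumed default for illustration only; check the HBase source for the real value.
    static final long ASSUMED_DEFAULT_MILLIS = 300000L;

    /** Reads the configured value, or the assumed default if the key is absent. */
    private static long read(Map<String, String> conf) {
        String raw = conf.get(KEY);
        return raw == null ? ASSUMED_DEFAULT_MILLIS : Long.parseLong(raw.trim());
    }

    /** Option 1: reject a negative value with a message that names the key. */
    static long readTimeoutStrict(Map<String, String> conf) {
        long value = read(conf);
        if (value < 0) {
            throw new IllegalArgumentException(
                KEY + " must be non-negative, but was " + value);
        }
        return value;
    }

    /** Option 2: fall back to the default when the value is negative. */
    static long readTimeoutOrDefault(Map<String, String> conf) {
        long value = read(conf);
        return value < 0 ? ASSUMED_DEFAULT_MILLIS : value;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(KEY, "-300000");
        System.out.println("fallback value: " + readTimeoutOrDefault(conf));
        try {
            readTimeoutStrict(conf);
        } catch (IllegalArgumentException e) {
            System.out.println("strict check: " + e.getMessage());
        }
    }
}
```

The strict variant still throws `IllegalArgumentException`, but the message identifies the offending key and value, so the operator can fix the configuration without reading a stack trace. The fallback variant keeps the RegionServer running at the cost of silently ignoring a misconfiguration, so a warning log would be advisable there.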
## Notes
The failure was found by configuration fuzzing.