RCh created HBASE-26533:
---------------------------
Summary: KeyValueScanner might not be properly closed when using
InternalScan.checkOnlyMemStore()
Key: HBASE-26533
URL: https://issues.apache.org/jira/browse/HBASE-26533
Project: HBase
Issue Type: Bug
Affects Versions: 2.3.6
Reporter: RCh
While writing a custom RegionObserver and using
InternalScan.checkOnlyMemStore() I stumbled upon an issue. The number of file
opened by region servers would grow steadily and eventually region servers
would crash with
{code}
2021-11-15 00:54:34,290 ERROR [MemStoreFlusher.1] regionserver.HStore: Failed
to commit store file
hdfs://<...removed...>:8020/hbase/data/default/<...removed...>/743071139057c819d7e6f7b59f065152/.tmp/f/394ba71102ec401d8779aa5f45819f84
java.io.IOException: Failed on local exception: java.io.IOException: Too many
open files; Host Details : local host is: "<...removed...>/<...removed...>";
destination host is: "<...removed...>":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:805)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1544)
at org.apache.hadoop.ipc.Client.call(Client.java:1486)
at org.apache.hadoop.ipc.Client.call(Client.java:1385)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
at com.sun.proxy.$Proxy27.getFileInfo(Unknown Source)
at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:800)
{code}
Another symptom is the following messages in region server logs:
{code}
2021-11-18 13:41:29,539 INFO
[RS_COMPACTED_FILES_DISCHARGER-regionserver/li-datanode2:16020-7]
regionserver.HStore: Can't archive compacted file
hdfs://<...removed...>:8020/hbase/data/default/<...removed...>/ce6f08fd
d82967df94d1c83e289d3142/f/9b01b8bdea324c22a92f1ba6b386e050.6261a24c6a689d5406ae1ea87dc9bb9f
because of either isCompactedAway=true or file has reference,
isReferencedInReads=true, refCount=7, skipping for now.
{code}
The culprit is KeyValueScanner not being closed in
StoreScanner.selectScannersFrom() before `continue`
[https://github.com/apache/hbase/blame/f000b775320330eb2f426f6b2a3b5e27a794a707/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java#L467]
--
This message was sent by Atlassian Jira
(v8.20.1#820001)