RCh created HBASE-26533:
---------------------------

             Summary: KeyValueScanner might not be properly closed when using 
InternalScan.checkOnlyMemStore()
                 Key: HBASE-26533
                 URL: https://issues.apache.org/jira/browse/HBASE-26533
             Project: HBase
          Issue Type: Bug
    Affects Versions: 2.3.6
            Reporter: RCh


While writing a custom RegionObserver that uses 
InternalScan.checkOnlyMemStore(), I ran into an issue: the number of files 
opened by the region servers grew steadily, and eventually the region servers 
crashed with:
{code}

2021-11-15 00:54:34,290 ERROR [MemStoreFlusher.1] regionserver.HStore: Failed 
to commit store file 
hdfs://<...removed...>:8020/hbase/data/default/<...removed...>/743071139057c819d7e6f7b59f065152/.tmp/f/394ba71102ec401d8779aa5f45819f84
java.io.IOException: Failed on local exception: java.io.IOException: Too many 
open files; Host Details : local host is: "<...removed...>/<...removed...>"; 
destination host is: "<...removed...>":8020;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:805)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1544)
        at org.apache.hadoop.ipc.Client.call(Client.java:1486)
        at org.apache.hadoop.ipc.Client.call(Client.java:1385)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
        at com.sun.proxy.$Proxy27.getFileInfo(Unknown Source)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:800)

{code}

 

Another symptom is the following message in the region server logs:
{code}
2021-11-18 13:41:29,539 INFO 
[RS_COMPACTED_FILES_DISCHARGER-regionserver/li-datanode2:16020-7] 
regionserver.HStore: Can't archive compacted file 
hdfs://<...removed...>:8020/hbase/data/default/<...removed...>/ce6f08fdd82967df94d1c83e289d3142/f/9b01b8bdea324c22a92f1ba6b386e050.6261a24c6a689d5406ae1ea87dc9bb9f
 because of either isCompactedAway=true or file has reference, 
isReferencedInReads=true, refCount=7, skipping for now.
{code}

 

The culprit is that the KeyValueScanner is not closed in 
StoreScanner.selectScannersFrom() before the `continue` that skips it, so 
scanners excluded by the memstore-only check are never released:

[https://github.com/apache/hbase/blame/f000b775320330eb2f426f6b2a3b5e27a794a707/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java#L467]
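
To illustrate the pattern, here is a minimal, self-contained sketch. It is 
not the HBase source: FakeScanner and selectScanners() are hypothetical 
stand-ins for KeyValueScanner and StoreScanner.selectScannersFrom(), and the 
filtering condition is my simplified reading of the linked line. The point is 
only that a scanner skipped by the memstore-only/files-only check should be 
closed before `continue`.

```java
import java.util.ArrayList;
import java.util.List;

class ScannerSelectionSketch {

    /** Hypothetical stand-in for a store scanner; NOT the HBase KeyValueScanner API. */
    static final class FakeScanner implements AutoCloseable {
        final boolean fileScanner; // true = backed by a store file, false = memstore
        boolean closed = false;

        FakeScanner(boolean fileScanner) {
            this.fileScanner = fileScanner;
        }

        @Override
        public void close() {
            closed = true; // a real file scanner would release its file reference here
        }
    }

    /**
     * Simplified model of the selection loop (assumed shape, not copied from HBase).
     * Scanners rejected by the memOnly/filesOnly check are closed before being skipped.
     */
    static List<FakeScanner> selectScanners(List<FakeScanner> all,
                                            boolean memOnly, boolean filesOnly) {
        List<FakeScanner> selected = new ArrayList<>();
        for (FakeScanner s : all) {
            boolean isFile = s.fileScanner;
            if ((!isFile && filesOnly) || (isFile && memOnly)) {
                s.close(); // without this call, the skipped scanner leaks
                continue;
            }
            selected.add(s);
        }
        return selected;
    }

    public static void main(String[] args) {
        FakeScanner mem = new FakeScanner(false);
        FakeScanner file = new FakeScanner(true);
        List<FakeScanner> all = new ArrayList<>();
        all.add(mem);
        all.add(file);

        // Memstore-only scan: the file scanner is skipped and must be closed.
        List<FakeScanner> selected = selectScanners(all, true, false);
        System.out.println("selected=" + selected.size()
            + " fileScannerClosed=" + file.closed);
    }
}
```

If the close() call before `continue` is removed from this sketch, the file 
scanner stays open after a memstore-only scan, which would match the growing 
refCount and the "Too many open files" error shown above.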

 

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
