[ 
https://issues.apache.org/jira/browse/HBASE-26533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

RCh updated HBASE-26533:
------------------------
    Description: 
While writing a custom RegionObserver and using 
InternalScan.checkOnlyMemStore() I stumbled upon an issue. The number of files 
opened by region servers would grow steadily and eventually region servers 
would crash with
{code:java}
2021-11-15 00:54:34,290 ERROR [MemStoreFlusher.1] regionserver.HStore: Failed 
to commit store file 
hdfs://<...removed...>:8020/hbase/data/default/<...removed...>/743071139057c819d7e6f7b59f065152/.tmp/f/394ba71102ec401d8779aa5f45819f84
java.io.IOException: Failed on local exception: java.io.IOException: Too many 
open files; Host Details : local host is: "<...removed...>/<...removed...>"; 
destination host is: "<...removed...>":8020;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:805)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1544)
        at org.apache.hadoop.ipc.Client.call(Client.java:1486)
        at org.apache.hadoop.ipc.Client.call(Client.java:1385)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
        at com.sun.proxy.$Proxy27.getFileInfo(Unknown Source)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:800)

{code}
 

Another symptom is the following messages in region server logs:
{code:java}
2021-11-18 13:41:29,539 INFO 
[RS_COMPACTED_FILES_DISCHARGER-regionserver/li-datanode2:16020-7] 
regionserver.HStore: Can't archive compacted file 
hdfs://<...removed...>:8020/hbase/data/default/<...removed...>/ce6f08fd
d82967df94d1c83e289d3142/f/9b01b8bdea324c22a92f1ba6b386e050.6261a24c6a689d5406ae1ea87dc9bb9f
 because of either isCompactedAway=true or file has reference, 
isReferencedInReads=true, refCount=7, skipping for now.
{code}
 

The culprit is KeyValueScanner not being closed in 
StoreScanner.selectScannersFrom() before `continue`

[https://github.com/apache/hbase/blame/f000b775320330eb2f426f6b2a3b5e27a794a707/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java#L467]

 

 

  was:
While writing a custom RegionObserver and using 
InternalScan.checkOnlyMemStore() I stumbled upon an issue. The number of file 
opened by region servers would grow steadily and eventually region servers 
would crash with
{code}

2021-11-15 00:54:34,290 ERROR [MemStoreFlusher.1] regionserver.HStore: Failed 
to commit store file 
hdfs://<...removed...>:8020/hbase/data/default/<...removed...>/743071139057c819d7e6f7b59f065152/.tmp/f/394ba71102ec401d8779aa5f45819f84
java.io.IOException: Failed on local exception: java.io.IOException: Too many 
open files; Host Details : local host is: "<...removed...>/<...removed...>"; 
destination host is: "<...removed...>":8020;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:805)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1544)
        at org.apache.hadoop.ipc.Client.call(Client.java:1486)
        at org.apache.hadoop.ipc.Client.call(Client.java:1385)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
        at com.sun.proxy.$Proxy27.getFileInfo(Unknown Source)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:800)

{code}

 

Another symptom is the following messages in region server logs:
{code}
2021-11-18 13:41:29,539 INFO 
[RS_COMPACTED_FILES_DISCHARGER-regionserver/li-datanode2:16020-7] 
regionserver.HStore: Can't archive compacted file 
hdfs://<...removed...>:8020/hbase/data/default/<...removed...>/ce6f08fd
d82967df94d1c83e289d3142/f/9b01b8bdea324c22a92f1ba6b386e050.6261a24c6a689d5406ae1ea87dc9bb9f
 because of either isCompactedAway=true or file has reference, 
isReferencedInReads=true, refCount=7, skipping for now.
{code}

 

The culprit is KeyValueScanner not being closed in 
StoreScanner.selectScannersFrom() before `continue`

[https://github.com/apache/hbase/blame/f000b775320330eb2f426f6b2a3b5e27a794a707/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java#L467]

 

 


> KeyValueScanner might not be properly closed when using 
> InternalScan.checkOnlyMemStore()
> ----------------------------------------------------------------------------------------
>
>                 Key: HBASE-26533
>                 URL: https://issues.apache.org/jira/browse/HBASE-26533
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.3.6
>            Reporter: RCh
>            Priority: Minor
>
> While writing a custom RegionObserver and using 
> InternalScan.checkOnlyMemStore() I stumbled upon an issue. The number of 
> files opened by region servers would grow steadily and eventually region 
> servers would crash with
> {code:java}
> 2021-11-15 00:54:34,290 ERROR [MemStoreFlusher.1] regionserver.HStore: Failed 
> to commit store file 
> hdfs://<...removed...>:8020/hbase/data/default/<...removed...>/743071139057c819d7e6f7b59f065152/.tmp/f/394ba71102ec401d8779aa5f45819f84
> java.io.IOException: Failed on local exception: java.io.IOException: Too many 
> open files; Host Details : local host is: "<...removed...>/<...removed...>"; 
> destination host is: "<...removed...>":8020;
>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:805)
>         at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1544)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1486)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1385)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>         at com.sun.proxy.$Proxy27.getFileInfo(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:800)
> {code}
>  
> Another symptom is the following messages in region server logs:
> {code:java}
> 2021-11-18 13:41:29,539 INFO 
> [RS_COMPACTED_FILES_DISCHARGER-regionserver/li-datanode2:16020-7] 
> regionserver.HStore: Can't archive compacted file 
> hdfs://<...removed...>:8020/hbase/data/default/<...removed...>/ce6f08fd
> d82967df94d1c83e289d3142/f/9b01b8bdea324c22a92f1ba6b386e050.6261a24c6a689d5406ae1ea87dc9bb9f
>  because of either isCompactedAway=true or file has reference, 
> isReferencedInReads=true, refCount=7, skipping for now.
> {code}
>  
> The culprit is KeyValueScanner not being closed in 
> StoreScanner.selectScannersFrom() before `continue`
> [https://github.com/apache/hbase/blame/f000b775320330eb2f426f6b2a3b5e27a794a707/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java#L467]
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to