[
https://issues.apache.org/jira/browse/HBASE-25287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaolin Ha updated HBASE-25287:
-------------------------------
Description:
HBASE-9393 found that seek+read leaves many CLOSE_WAIT sockets behind unless the stream is
unbuffered; calling unbuffer() releases the sockets and file descriptors held by the stream.
On our cluster, RegionServers with about one hundred thousand store files showed the number of
CLOSE_WAIT sockets growing with the number of regions opened, eventually reaching the operating
system open-files limit of 1,000,000.
{code:java}
2020-11-12 20:19:02,452 WARN [1282990092@qtp-220038608-1 - Acceptor0 [email protected]:16030] mortbay.log: EXCEPTION
java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
        at org.mortbay.jetty.nio.SelectChannelConnector$1.acceptChannel(SelectChannelConnector.java:75)
        at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:686)
        at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:192)
        at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
        at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:708)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
{code}
{code:java}
[hbase@gha-data-hbase-cat0053 hbase]$ ulimit -SHn
1000000
{code}
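For context, "unbuffer" above refers to the Hadoop stream API that drops the cached socket and
buffers once a read is finished. A minimal illustration of the pattern (the file path is
hypothetical, and it assumes the underlying stream supports CanUnbuffer, as HDFS's DFSInputStream
does):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UnbufferExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    byte[] buf = new byte[4096];
    // Hypothetical HFile path, for illustration only.
    try (FSDataInputStream in = fs.open(new Path("/hbase/data/default/t1/r1/cf/hfile"))) {
      in.read(0L, buf, 0, buf.length); // positional read (pread), keeps a socket to the DataNode
      in.unbuffer();                   // drop the cached socket/buffers held by the stream
    }
  }
}
{code}
Without the unbuffer() call, the client keeps the DataNode connection cached for reuse, and once
the DataNode closes its side the socket lingers in CLOSE_WAIT on the RegionServer.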
The root cause is that when a store file is opened,
{code:java}
private void open() throws IOException {
  fileInfo.initHDFSBlocksDistribution();
  long readahead = fileInfo.isNoReadahead() ? 0L : -1L;
  ReaderContext context = fileInfo.createReaderContext(false, readahead, ReaderType.PREAD);
  fileInfo.initHFileInfo(context);
  StoreFileReader reader = fileInfo.preStoreFileReaderOpen(context, cacheConf);
  if (reader == null) {
    reader = fileInfo.createReader(context, cacheConf);
    fileInfo.getHFileInfo().initMetaAndIndex(reader.getHFileReader());
  }
  ...
{code}
only createReader() unbuffers the stream. initMetaAndIndex() also uses the stream to read blocks,
so that stream needs to be unbuffer()ed as well.
We can simply wrap everything from fileInfo.initHFileInfo(context); onward in a try block and
unbuffer() the stream in a finally clause at the end of the open() method.
After applying this fix on our cluster, the number of CLOSE_WAIT sockets dropped to roughly zero.
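A minimal sketch of that change (illustrative only, not the final patch; it assumes ReaderContext
exposes its FSDataInputStreamWrapper via getInputStreamWrapper() and that the wrapper's unbuffer()
releases the underlying socket):
{code:java}
private void open() throws IOException {
  fileInfo.initHDFSBlocksDistribution();
  long readahead = fileInfo.isNoReadahead() ? 0L : -1L;
  ReaderContext context = fileInfo.createReaderContext(false, readahead, ReaderType.PREAD);
  try {
    fileInfo.initHFileInfo(context);
    StoreFileReader reader = fileInfo.preStoreFileReaderOpen(context, cacheConf);
    if (reader == null) {
      reader = fileInfo.createReader(context, cacheConf);
      fileInfo.getHFileInfo().initMetaAndIndex(reader.getHFileReader());
    }
    // ... rest of open() unchanged ...
  } finally {
    // Whichever of the calls above read blocks through the opening stream,
    // make sure its socket/file descriptor is released here.
    context.getInputStreamWrapper().unbuffer();
  }
}
{code}
Putting unbuffer() in a finally clause also covers the case where initMetaAndIndex() throws
partway through.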
> Forgetting to unbuffer streams results in many CLOSE_WAIT sockets when
> loading files
> ------------------------------------------------------------------------------------
>
> Key: HBASE-25287
> URL: https://issues.apache.org/jira/browse/HBASE-25287
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.4.0
> Reporter: Xiaolin Ha
> Priority: Major
> Attachments: 1605328358304-image.png, 1605328417888-image.png,
> 1605504914256-image.png
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)