[ https://issues.apache.org/jira/browse/HBASE-25287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17260771#comment-17260771 ]

Hudson commented on HBASE-25287:
--------------------------------

Results for branch branch-1
        [build #74 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-1/74/]: (x) *{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-1/74//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-1/74//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-1/74//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Forgetting to unbuffer streams results in many CLOSE_WAIT sockets when 
> loading files
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-25287
>                 URL: https://issues.apache.org/jira/browse/HBASE-25287
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.4, 2.5.0, 2.4.1
>
>         Attachments: 1605328358304-image.png, 1605328417888-image.png, 
> 1605504914256-image.png
>
>
> HBASE-9393 found that seek+read leaves many CLOSE_WAIT sockets unless the 
> stream is unbuffered; unbuffer() frees the sockets and file descriptors held 
> by the stream. In our cluster, on RSes with about one hundred thousand store 
> files, we found that the number of CLOSE_WAIT sockets grows with the number 
> of regions opened and can reach the operating system's open-files limit of 
> 1,000,000.
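> 
> For context, the pattern HBASE-9393 describes looks roughly like this (a 
> minimal sketch against the Hadoop FSDataInputStream API; fs, path, offset, 
> and buf are hypothetical):
> {code:java}
> FSDataInputStream in = fs.open(path);   // opens a socket to a DataNode
> in.seek(offset);
> in.readFully(buf, 0, buf.length);       // the seek+read path
> // Without this call the stream keeps its buffers, socket, and file
> // descriptor cached; on the RS side the socket lingers in CLOSE_WAIT.
> in.unbuffer();
> {code}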
>  
> {code:java}
> 2020-11-12 20:19:02,452 WARN  [1282990092@qtp-220038608-1 - Acceptor0 SelectChannelConnector@0.0.0.0:16030] mortbay.log: EXCEPTION
> java.io.IOException: Too many open files
>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
>         at org.mortbay.jetty.nio.SelectChannelConnector$1.acceptChannel(SelectChannelConnector.java:75)
>         at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:686)
>         at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:192)
>         at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
>         at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:708)
>         at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> {code}
>  
> {code:java}
> [hbase@gha-data-hbase-cat0053 hbase]$ ulimit -SHn
> 1000000
> {code}
>  
>  
> The cause of the problem is that, when a store file is opened,
> {code:java}
> private void open() throws IOException {
>   fileInfo.initHDFSBlocksDistribution();
>   long readahead = fileInfo.isNoReadahead() ? 0L : -1L;
>   ReaderContext context = fileInfo.createReaderContext(false, readahead, ReaderType.PREAD);
>   fileInfo.initHFileInfo(context);
>   StoreFileReader reader = fileInfo.preStoreFileReaderOpen(context, cacheConf);
>   if (reader == null) {
>     reader = fileInfo.createReader(context, cacheConf);
>     fileInfo.getHFileInfo().initMetaAndIndex(reader.getHFileReader());
>   }
>   ....
> {code}
> only createReader() unbuffers the stream. initMetaAndIndex() also uses the 
> stream to read blocks, so the socket needs to be unbuffered there, too.
> We can simply open a try block before fileInfo.initHFileInfo(context); and 
> unbuffer() the stream in a finally clause at the end of the open() function, 
> as sketched below.
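> A minimal sketch of what open() would look like with that change (assuming 
> ReaderContext exposes the underlying stream wrapper via 
> getInputStreamWrapper() and that the wrapper supports unbuffer(); the 
> accessor name is illustrative):
> {code:java}
> private void open() throws IOException {
>   fileInfo.initHDFSBlocksDistribution();
>   long readahead = fileInfo.isNoReadahead() ? 0L : -1L;
>   ReaderContext context = fileInfo.createReaderContext(false, readahead, ReaderType.PREAD);
>   try {
>     fileInfo.initHFileInfo(context);
>     StoreFileReader reader = fileInfo.preStoreFileReaderOpen(context, cacheConf);
>     if (reader == null) {
>       reader = fileInfo.createReader(context, cacheConf);
>       fileInfo.getHFileInfo().initMetaAndIndex(reader.getHFileReader());
>     }
>     // ... rest of open() ...
>   } finally {
>     // Release the socket and file descriptor held by the stream once the
>     // meta and index blocks have been read; later preads re-establish it.
>     context.getInputStreamWrapper().unbuffer();
>   }
> }
> {code}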
> We applied this fix on our cluster, and the number of CLOSE_WAIT sockets 
> dropped to nearly zero.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
