[
https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15115799#comment-15115799
]
stack commented on HBASE-9393:
------------------------------
bq. I believe there is an option to do #1 even right now. Can't HBase be
configured just to use pread and never stream-read?
We want sequential reading when doing long scans (the purported hdfs i/o
'pipelining'). We want to be able to pick and choose depending on read type
(short scan or random get vs streaming scan...).
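To illustrate what I mean (just a sketch, not our actual read path, and the class/method names are made up), the two styles on an FSDataInputStream look roughly like this:

import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;

public final class ReadStyleSketch {
  // pread: positioned read; does not move the stream position, so it suits
  // concurrent random gets and short scans.
  static int pread(FSDataInputStream in, long offset, byte[] buf) throws IOException {
    return in.read(offset, buf, 0, buf.length);
  }

  // streaming read: seek once, then read sequentially so hdfs can keep the
  // read pipeline going for a long scan.
  static int streamRead(FSDataInputStream in, long offset, byte[] buf) throws IOException {
    in.seek(offset);
    return in.read(buf, 0, buf.length);
  }
}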
This issue, and a suggestion offlist by [~Apache9], brings up the unfinished
project, https://issues.apache.org/jira/browse/HBASE-5979, which is the proper
way to fix what is going on here (as well as doing a proper separation of long
vs short reads). Would be good to revive it. There is good stuff in the cited issue.
Adding the below as a finally block in a method named pickReaderVersion seems a bit
odd... is pickReaderVersion the only place we read in the file trailer? That seems
odd (not your issue [~ashish singhi]). You'd think we'd want to keep the
trailer around in the reader.
  } finally {
    unbufferStream(fsdis);
  }
}
On commit, let's point to this issue to explain why we are doing gymnastics in the
unbufferStream method... and why the reflection.
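For anyone reading along, my guess at the shape of the 'gymnastics' (names and details made up; the actual patch may differ) is something like the below: look up CanUnbuffer reflectively so the code still loads against Hadoop versions that predate the interface:

import java.io.InputStream;
import org.apache.hadoop.fs.FSDataInputStream;

public final class UnbufferSketch {
  // Only call unbuffer() when the wrapped stream actually implements
  // org.apache.hadoop.fs.CanUnbuffer; the interface is looked up reflectively
  // so we do not take a compile-time dependency on it.
  static void unbufferStream(FSDataInputStream fsdis) {
    InputStream wrapped = fsdis.getWrappedStream();
    try {
      Class<?> canUnbuffer = Class.forName("org.apache.hadoop.fs.CanUnbuffer");
      if (canUnbuffer.isInstance(wrapped)) {
        canUnbuffer.getMethod("unbuffer").invoke(wrapped);
      }
    } catch (ClassNotFoundException e) {
      // Older Hadoop without CanUnbuffer: nothing to unbuffer.
    } catch (ReflectiveOperationException e) {
      // Failing to unbuffer only means the socket stays cached a bit longer.
    }
  }
}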
Is it odd adding this unbufferStream to hbase types when there is the CanUnbuffer
interface up in hdfs? Should we have a local hbase equivalent... and put it
on HFileBlock, HFileReader... Then the relation would be clearer? Perhaps overkill?
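Something like this hypothetical interface (name made up, not in the patch) is what I mean by a local equivalent; HFileBlock/HFileReader would implement it and delegate to the wrapped hdfs stream:

public interface CanUnbufferStream {
  // Release any socket/buffer held on behalf of this reader until the next read.
  void unbufferStream();
}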
Why do you think the sequentialRead numbers are so different in your perf test
above [~ashish singhi]? The extra setup after reading in the trailer?
bq. TestStochasticLoadBalancer failure was not related to the change - it has
failed intermittently.
[[email protected]] Let me retry the patch. We need a clean build to commit...
for any patch. No more '... it passes for me locally...'. It has to pass up
here on apache. If we can't get it to pass, nothing should get checked in until
the tests are fixed. Otherwise our test suite is for nought and running CI is
just wasted energy at the DC.
> HBase does not close a dead socket, resulting in many CLOSE_WAIT
> --------------------------------------------------------------------
>
> Key: HBASE-9393
> URL: https://issues.apache.org/jira/browse/HBASE-9393
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.2, 0.98.0
> Environment: Centos 6.4 - 7 regionservers/datanodes, 8 TB per node,
> 7279 regions
> Reporter: Avi Zrachya
> Assignee: Ashish Singhi
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-9393.patch, HBASE-9393.v1.patch,
> HBASE-9393.v2.patch, HBASE-9393.v3.patch, HBASE-9393.v4.patch,
> HBASE-9393.v5.patch, HBASE-9393.v5.patch
>
>
> HBase does not close a dead connection with the datanode.
> This results in over 60K CLOSE_WAIT sockets, and at some point HBase cannot connect
> to the datanode because there are too many mapped sockets from one host to another on
> the same port.
> The example below shows a low CLOSE_WAIT count because we had to restart
> hbase to solve the problem; over time it would increase to 60-100K sockets
> in CLOSE_WAIT.
> [root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
> 13156
> [root@hd2-region3 ~]# ps -ef |grep 21592
> root 17255 17219 0 12:26 pts/0 00:00:00 grep 21592
> hbase 21592 1 17 Aug29 ? 03:29:06
> /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx8000m
> -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode
> -Dhbase.log.dir=/var/log/hbase
> -Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)