[
https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15115799#comment-15115799
]
stack commented on HBASE-9393:
------------------------------
bq. I believe there is an option to do #1 even right now. Can't HBase be
configured just to use pread and never stream-read?
We want sequential reading when doing long scans (the purported hdfs i/o
'pipelining'). We want to be able to pick and choose depending on read type
(short scan or random get vs streaming scan...).
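To illustrate what I mean (just a sketch, not our actual read path, and the class/method names are made up), the two styles on an FSDataInputStream look roughly like this:

import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;

public final class ReadStyleSketch {
  // pread: positioned read; does not move the stream position, so it suits
  // concurrent random gets and short scans.
  static int pread(FSDataInputStream in, long offset, byte[] buf) throws IOException {
    return in.read(offset, buf, 0, buf.length);
  }

  // streaming read: seek once, then read sequentially so hdfs can keep the
  // read pipeline going for a long scan.
  static int streamRead(FSDataInputStream in, long offset, byte[] buf) throws IOException {
    in.seek(offset);
    return in.read(buf, 0, buf.length);
  }
}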
This issue, and a suggestion offlist by [~Apache9], brings up the unfinished
project, https://issues.apache.org/jira/browse/HBASE-5979, which is the proper
way to fix what is going on here (as well as doing a proper separation of long
vs short reads). Would be good to revive it. There is good stuff in the cited issue.
Adding the below as a finally block in a method named pickReaderVersion seems a bit
odd... is pickReaderVersion the only place we read in the file trailer? That seems
odd (not your issue [~ashish singhi]). You'd think we'd want to keep the
trailer around in the reader.
  } finally {
    unbufferStream(fsdis);
  }
}
On commit, let's point to this issue to explain why we are doing gymnastics in the
unbufferStream method... and why the reflection.
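For anyone reading along, my guess at the shape of the 'gymnastics' (names and details made up; the actual patch may differ) is something like the below: look up CanUnbuffer reflectively so the code still loads against Hadoop versions that predate the interface:

import java.io.InputStream;
import org.apache.hadoop.fs.FSDataInputStream;

public final class UnbufferSketch {
  // Only call unbuffer() when the wrapped stream actually implements
  // org.apache.hadoop.fs.CanUnbuffer; the interface is looked up reflectively
  // so we do not take a compile-time dependency on it.
  static void unbufferStream(FSDataInputStream fsdis) {
    InputStream wrapped = fsdis.getWrappedStream();
    try {
      Class<?> canUnbuffer = Class.forName("org.apache.hadoop.fs.CanUnbuffer");
      if (canUnbuffer.isInstance(wrapped)) {
        canUnbuffer.getMethod("unbuffer").invoke(wrapped);
      }
    } catch (ClassNotFoundException e) {
      // Older Hadoop without CanUnbuffer: nothing to unbuffer.
    } catch (ReflectiveOperationException e) {
      // Failing to unbuffer only means the socket stays cached a bit longer.
    }
  }
}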
Is it odd adding this unbufferStream to hbase types when there is the CanUnbuffer
interface up in hdfs? Should we have a local hbase equivalent... and put it
on HFileBlock, HFileReader... Then the relation would be clearer? Perhaps overkill?
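Something like this hypothetical interface (name made up, not in the patch) is what I mean by a local equivalent; HFileBlock/HFileReader would implement it and delegate to the wrapped hdfs stream:

public interface CanUnbufferStream {
  // Release any socket/buffer held on behalf of this reader until the next read.
  void unbufferStream();
}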
Why do you think the sequentialRead numbers are so different in your perf test
above [~ashish singhi]? The extra setup after reading in the trailer?
bq. TestStochasticLoadBalancer failure was not related to the change - it has
failed intermittently.
[[email protected]] Let me retry the patch. We need a clean build to commit...
for any patch. No more '... it passes for me locally...'. It has to pass up
here on apache. If we can't get it to pass, nothing should get checked in until
the tests are fixed. Otherwise our test suite is for nought and running CI is
just wasted energy at the DC.
> HBase does not close a dead socket, resulting in many CLOSE_WAIT
> --------------------------------------------------------------------
>
> Key: HBASE-9393
> URL: https://issues.apache.org/jira/browse/HBASE-9393
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.2, 0.98.0
> Environment: Centos 6.4 - 7 regionservers/datanodes, 8 TB per node,
> 7279 regions
> Reporter: Avi Zrachya
> Assignee: Ashish Singhi
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-9393.patch, HBASE-9393.v1.patch,
> HBASE-9393.v2.patch, HBASE-9393.v3.patch, HBASE-9393.v4.patch,
> HBASE-9393.v5.patch, HBASE-9393.v5.patch
>
>
> HBase does not close a dead connection with the datanode.
> This results in over 60K CLOSE_WAIT sockets, and at some point HBase cannot connect
> to the datanode because there are too many mapped sockets from one host to another on
> the same port.
> The example below shows a low CLOSE_WAIT count because we had to restart
> hbase to solve the problem; over time it would increase to 60-100K sockets
> in CLOSE_WAIT.
> [root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
> 13156
> [root@hd2-region3 ~]# ps -ef |grep 21592
> root 17255 17219 0 12:26 pts/0 00:00:00 grep 21592
> hbase 21592 1 17 Aug29 ? 03:29:06
> /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx8000m
> -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode
> -Dhbase.log.dir=/var/log/hbase
> -Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)