[jira] Commented: (HADOOP-1924) [hbase] TestDFSAbort failed in nightly #242

Doug Cutting (JIRA) Fri, 05 Oct 2007 10:55:41 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532737 ]


Doug Cutting commented on HADOOP-1924:
--------------------------------------

It seems that setSoTimeout only affects reads, not writes.  So you're right, 
there is no way in Java to set a write timeout!
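
A minimal sketch of the read-side behaviour, not taken from the Hadoop IPC code: the class and method names are made up for illustration, and the "server" is just a loopback ServerSocket that never writes. It shows Socket#setSoTimeout bounding a blocking read(); no comparable option exists for a blocking write on the socket's output stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of Hadoop.
public class ReadTimeoutDemo {

    /** Returns true if a blocking read gave up with SocketTimeoutException. */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        // The loopback server never calls accept() or writes; the OS backlog
        // still completes the TCP handshake, so the client connects normally.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // applies to reads only
            InputStream in = client.getInputStream();
            try {
                in.read(); // blocks: no data ever arrives
                return false;
            } catch (SocketTimeoutException expected) {
                return true; // the read was abandoned after timeoutMillis
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```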

One theory is that the server, while stopped, may not in fact be closing all 
its connections.  I couldn't see where that was done just now when I looked.  
Handlers are interrupted, but they don't close their connection on interrupt.  
The listener thread calls cleanupConnections(true), but only on 
OutOfMemoryError, not in a 'finally' clause.  And, even then 
cleanupConnections(true) doesn't look like it closes connections that have been 
recently active.

So please check the server's logs to see if for each "Server connection from" 
line there is a corresponding "disconnecting client" line.  If there's not, 
then this could be the problem.
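
One rough way to do that check, sketched under assumptions: the class name is hypothetical, the log file path is passed as the first argument, and the two phrases are matched as plain substrings because the exact log line layout is not shown here. It only compares totals; pairing each connection with its own disconnect would need the real line format.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper; not part of Hadoop.
public class ConnectionLogCheck {
    static final String OPENED = "Server connection from";
    static final String CLOSED = "disconnecting client";

    /** Opened minus closed; a positive result means some connections have no disconnect line. */
    static long unmatched(List<String> logLines) {
        long opened = logLines.stream().filter(line -> line.contains(OPENED)).count();
        long closed = logLines.stream().filter(line -> line.contains(CLOSED)).count();
        return opened - closed;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the server log (an assumption of this sketch)
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("connections without a disconnect line: " + unmatched(lines));
    }
}
```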

Some potentially relevant discussions:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4283017
http://forum.java.sun.com/thread.jspa?threadID=5203832&tstart=75
http://www-1.ibm.com/support/docview.wss?rs=180&uid=swg1PK37506
http://archives.java.sun.com/cgi-bin/wa?A2=ind0212&L=rmi-users&P=731

Other possible things to try:
- call Socket#setKeepAlive() on IPC sockets
- try calling Thread#interrupt(), it may help...
- adjust some of the TCP parameters on lucene.zones.apache.org
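
For the first suggestion, a small sketch of enabling keep-alive on a socket. The class name and the loopback connection are illustrative only and are not how the IPC layer creates its sockets. Note that Socket#setKeepAlive takes a boolean, and that the probe interval is controlled by the operating system's TCP settings rather than by Java (on Linux the default idle time before the first probe is two hours), which is where the third suggestion comes in.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class; not part of Hadoop.
public class KeepAliveDemo {

    /** Opens a loopback connection, enables SO_KEEPALIVE, and reports whether it is set. */
    static boolean enableKeepAlive() throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket(server.getInetAddress(), server.getLocalPort())) {
            // Ask the OS to probe an idle peer so a dead connection is
            // eventually detected instead of hanging forever.
            socket.setKeepAlive(true);
            return socket.getKeepAlive();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("keep-alive enabled: " + enableKeepAlive());
    }
}
```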


> [hbase] TestDFSAbort failed in nightly #242
> -------------------------------------------
>
>                 Key: HADOOP-1924
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1924
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: stack
>            Priority: Minor
>         Attachments: testdfsabort.patch, testdfsabort_patchbuild798.txt
>
>
> TestDFSAbort and TestBloomFilters failed in last night's nightly build (#242). 
>  This issue is about trying to figure out what's up with TestDFSAbort.
> Studying the console logs, HRegionServer stopped logging any activity and HMaster 
> for its part did not expire the HRegionServer lease.  On top of it all, 
> continued tests of the state of HDFS -- the test is meant to ensure HBase 
> shuts down when HDFS is pulled from under it -- seem to have continued 
> reporting it healthy though it had been closed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
