[ 
https://issues.apache.org/jira/browse/HDFS-5671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JamesLi updated HDFS-5671:
--------------------------

    Description: 
lsof -i TCP:1004 | grep -c CLOSE_WAIT
18235
When an HBase RegionServer requests a file's block from DataNode:1004 and the 
request fails with "java.io.IOException: Got error for OP_READ_BLOCK,Block token is 
expired.", the TCP socket the RegionServer was using is not closed. As the count 
above shows, such sockets pile up in CLOSE_WAIT.

I think the problem is in DatanodeInfo blockSeekTo(long target) of class 
DFSInputStream. 
The RegionServer's connection is held by the BlockReader created here: 
        blockReader = getBlockReader(targetAddr, chosenNode, src, blk,
            accessToken, offsetIntoBlock, blk.getNumBytes() - offsetIntoBlock,
            buffersize, verifyChecksum, dfsClient.clientName);
If this connection fails, the RegionServer fetches a new access token and retries, 
but the old connection is not closed here. 
I think a small piece of code is needed to close the old connection when the 
exception happens:
        // Close the stale reader before retrying so its socket is released.
        if (blockReader != null) {
            try {
                blockReader.close();
            } catch (IOException exc) {
                DFSClient.LOG.error("Close connection to " + targetAddr + " failed", exc);
            } finally {
                blockReader = null;
            }
        }
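
The effect of the proposed close can be illustrated outside HDFS. The following is only a minimal sketch: FakeReader, leakedReaders and the shared counter are hypothetical stand-ins invented for illustration, not HDFS classes, and the retry loop is a simplification of what blockSeekTo does. It counts how many readers remain open after a read that fails several times with an expired token, with and without closing the failed reader.

```java
import java.io.Closeable;
import java.io.IOException;

class CloseBeforeRetrySketch {

    /** Hypothetical stand-in for a BlockReader that owns one TCP socket. */
    static final class FakeReader implements Closeable {
        private final int[] openSockets; // shared counter of unclosed readers
        private boolean closed = false;

        FakeReader(int[] openSockets) {
            this.openSockets = openSockets;
            openSockets[0]++; // simulate opening a connection
        }

        /** Throws when asked to mimic an expired block token. */
        void read(boolean tokenExpired) throws IOException {
            if (tokenExpired) {
                throw new IOException("Block token is expired.");
            }
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                openSockets[0]--; // simulate releasing the connection
            }
        }
    }

    /**
     * Simulates a read that fails `failures` times before succeeding and
     * returns how many readers are still open afterwards.
     */
    static int leakedReaders(int failures, boolean closeOnFailure) {
        int[] openSockets = {0};
        for (int attempt = 0; attempt <= failures; attempt++) {
            FakeReader reader = new FakeReader(openSockets);
            try {
                reader.read(attempt < failures);
                reader.close(); // normal path: the caller is done with the reader
            } catch (IOException e) {
                if (closeOnFailure) {
                    reader.close(); // the proposed fix: close before retrying
                }
                // Without the close, the reference is dropped with the socket open.
            }
        }
        return openSockets[0];
    }

    public static void main(String[] args) {
        System.out.println("without close: " + leakedReaders(3, false));
        System.out.println("with close:    " + leakedReaders(3, true));
    }
}
```

In this sketch every failed attempt without the close leaves one reader open, which is the same shape as the growing CLOSE_WAIT count reported above; with the close, nothing is left behind.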

  was:
lsof -i TCP:1004 | grep -c CLOSE_WAIT
18235
When an HBase RegionServer requests a file's block from DataNode:1004 and the 
request fails with "java.io.IOException: Got error for OP_READ_BLOCK,Block token is 
expired.", the TCP socket the RegionServer was using is not closed.

I think the problem is in DatanodeInfo blockSeekTo(long target) - line 
539 of class DFSInputStream. 
The RegionServer's connection is held by the BlockReader, which is created on line 574: 
        blockReader = getBlockReader(targetAddr, chosenNode, src, blk,
            accessToken, offsetIntoBlock, blk.getNumBytes() - offsetIntoBlock,
            buffersize, verifyChecksum, dfsClient.clientName);
If this connection fails, the RegionServer fetches a new access token and retries, 
but the old connection is not closed here. 
I think a small piece of code is needed to close the old connection when the 
exception happens:
        // Close the stale reader before retrying so its socket is released.
        if (blockReader != null) {
            try {
                blockReader.close();
            } catch (IOException exc) {
                DFSClient.LOG.error("Close connection to " + targetAddr + " failed", exc);
            } finally {
                blockReader = null;
            }
        }


> When an HBase RegionServer requests a block from a DataNode and a 
> "java.io.IOException" occurs, the failed TCP socket is not closed (it stays in 
> "CLOSE_WAIT" state on port 1004 of the DataNode)
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5671
>                 URL: https://issues.apache.org/jira/browse/HDFS-5671
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.2.0
>         Environment: hadoop-2.2.0
> java version "1.6.0_31"
> Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
> Linux 2.6.32-358.14.1.el6.x86_64 #1 SMP Tue Jul 16 23:51:20 UTC 2013 x86_64 
> x86_64 x86_64 GNU/Linux
>            Reporter: JamesLi
>            Priority: Critical



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)