[
https://issues.apache.org/jira/browse/HDFS-5671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
JamesLi updated HDFS-5671:
--------------------------
Description:
lsof -i TCP:1004 | grep -c CLOSE_WAIT
18235
When a client requests a file's block from DataNode:1004 and the request fails with
"java.io.IOException: Got error for OP_READ_BLOCK, Block token is expired.",
the TCP socket the client is using is not closed.
I think the problem is in DatanodeInfo blockSeekTo(long target) of class
DFSInputStream.
The connection the client uses is held by the BlockReader:
blockReader = getBlockReader(targetAddr, chosenNode, src, blk,
accessToken, offsetIntoBlock, blk.getNumBytes() - offsetIntoBlock,
buffersize, verifyChecksum, dfsClient.clientName);
If this connection fails, the client fetches a new access token, but the old
connection is not closed here.
was:
lsof -i TCP:1004 | grep -c CLOSE_WAIT
18235
When the hbase regionserver requests a file's block from DataNode:1004 and the
request fails with "java.io.IOException: Got error for OP_READ_BLOCK, Block token is
expired.", the TCP socket the regionserver is using is not closed.
I think the problem is in DatanodeInfo blockSeekTo(long target) of class
DFSInputStream.
The connection the regionserver uses is held by the BlockReader:
blockReader = getBlockReader(targetAddr, chosenNode, src, blk,
accessToken, offsetIntoBlock, blk.getNumBytes() - offsetIntoBlock,
buffersize, verifyChecksum, dfsClient.clientName);
If this connection fails, the regionserver fetches a new access token, but the
old connection is not closed here.
I think a small piece of code is needed to close the old connection when the
exception happens:
if (blockReader != null) {
    try {
        blockReader.close();
    } catch (IOException exc) {
        DFSClient.LOG.error("Close connection to " + targetAddr + " failed");
    }
    blockReader = null;
}
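For illustration, the close-before-retry pattern suggested above can be sketched as a self-contained example. FakeBlockReader and closeOldReader here are stand-ins invented for this sketch, not the real HDFS classes; in the actual client the reader wraps the TCP socket to the DataNode, which is what otherwise lingers in CLOSE_WAIT:

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseBeforeRetry {
    // Stand-in for HDFS's BlockReader; only its Closeable behavior matters
    // for demonstrating the leak and the fix.
    static final class FakeBlockReader implements Closeable {
        boolean closed = false;
        @Override public void close() throws IOException { closed = true; }
    }

    static FakeBlockReader blockReader;

    // The proposed cleanup: close the old reader before retrying with a
    // fresh access token, and clear the reference even if close() fails,
    // so no stale socket is left behind.
    static void closeOldReader() {
        if (blockReader != null) {
            try {
                blockReader.close();
            } catch (IOException exc) {
                System.err.println("Close connection failed: " + exc);
            }
            blockReader = null;
        }
    }

    public static void main(String[] args) {
        blockReader = new FakeBlockReader();
        FakeBlockReader old = blockReader;
        closeOldReader(); // simulating the "Block token is expired" retry path
        System.out.println(old.closed);          // old socket closed
        System.out.println(blockReader == null); // reference cleared for retry
    }
}
```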
> When a client requests a block from a DataNode and a "java.io.IOException"
> occurs, the failed TCP socket is not closed (left in CLOSE_WAIT on DataNode
> port 1004)
> --------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-5671
> URL: https://issues.apache.org/jira/browse/HDFS-5671
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 2.2.0
> Environment: hadoop-2.2.0
> java version "1.6.0_31"
> Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
> Linux 2.6.32-358.14.1.el6.x86_64 #1 SMP Tue Jul 16 23:51:20 UTC 2013 x86_64
> x86_64 x86_64 GNU/Linux
> Reporter: JamesLi
> Priority: Critical
> Attachments: 5671.patch, 5671v1.patch
>
>
> lsof -i TCP:1004 | grep -c CLOSE_WAIT
> 18235
> When a client requests a file's block from DataNode:1004 and the request fails
> with "java.io.IOException: Got error for OP_READ_BLOCK, Block token is
> expired.", the TCP socket the client is using is not closed.
> I think the problem is in DatanodeInfo blockSeekTo(long target) of class
> DFSInputStream.
> The connection the client uses is held by the BlockReader:
> blockReader = getBlockReader(targetAddr, chosenNode, src, blk,
> accessToken, offsetIntoBlock, blk.getNumBytes() - offsetIntoBlock,
> buffersize, verifyChecksum, dfsClient.clientName);
> If this connection fails, the client fetches a new access token, but the old
> connection is not closed here.
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)