[ 
https://issues.apache.org/jira/browse/HDFS-5671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850152#comment-13850152
 ] 

JamesLi commented on HDFS-5671:
-------------------------------

@stack I have attached the patch file, but we haven't tested it yet. According to 
our DataNode and RegionServer error logs, it appears that the client side does 
not close its connection when the DataNode throws an IOException and closes the 
connection. Here are our error logs; I hope they help:

RegionServer:
2013-12-13 15:48:31,474 INFO org.apache.hadoop.hdfs.DFSClient: Will fetch a new 
access token and retry, access token was invalid when connecting to 
/192.168.2.27:1004 : 
org.apache.hadoop.hdfs.security.token.block.InvalidBlockTokenException: Got 
access token error for OP_READ_BLOCK, self=/192.168.2.27:56975, 
remote=/192.168.2.27:1004, for file 
/hbase/XXXX/b50bf1b95c9242cdd242dc4e6549bc90/raw/d59819ebe5574c79a5d1cf13a733d2ed,
 for pool BP-621472495-192.168.2.25-1375176775166 block 
-882505774551713967_11426277 

Datanode:
2013-12-13 15:48:31,474 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
dn2:1004:DataXceiver error processing READ_BLOCK operation  src: 
/192.168.2.27:56975 dest: /192.168.2.27:1004
org.apache.hadoop.security.token.SecretManager$InvalidToken: Block token with 
block_token_identifier (expiryDate=1386914547771, keyId=2020397153, 
userId=hbase, blockPoolId=BP-621472495-192.168.2.25-1375176775166, 
blockId=-882505774551713967, access modes=[READ]) is expired.
        at 
org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java)
        at 
org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java)
        at 
org.apache.hadoop.hdfs.security.token.block.BlockPoolTokenSecretManager.checkAccess(BlockPoolTokenSecretManager.java)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java)
        at java.lang.Thread.run(Thread.java)
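
To compare the token's expiryDate with the log timestamps, the value can be 
converted to a readable instant. This is only a sketch: it assumes expiryDate 
is milliseconds since the Unix epoch, and it uses java.time, which needs Java 8 
or later (newer than the 1.6 JVM listed in the environment). The class and 
method names are made up for illustration. The log lines are presumably in the 
servers' local time zone, which is not stated here, so the UTC value printed 
below has to be shifted accordingly before comparing.

```java
import java.time.Instant;

public class TokenExpirySketch {

    // Assumption: expiryDate in the block token is epoch milliseconds.
    static Instant expiryInstant(long expiryDateMillis) {
        return Instant.ofEpochMilli(expiryDateMillis);
    }

    public static void main(String[] args) {
        // Value copied from the DataNode log above.
        long expiryDate = 1386914547771L;
        // Prints the expiry time in UTC: 2013-12-13T06:02:27.771Z
        System.out.println(expiryInstant(expiryDate));
    }
}
```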


> When Hbase RegionServer request block to DataNode and "java.io.IOException" 
> occurs, the fail TCP socket is not closed (in status "CLOSE_WAIT" with port 
> 1004 of DataNode)
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5671
>                 URL: https://issues.apache.org/jira/browse/HDFS-5671
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.2.0
>         Environment: hadoop-2.2.0
> java version "1.6.0_31"
> Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
> Linux 2.6.32-358.14.1.el6.x86_64 #1 SMP Tue Jul 16 23:51:20 UTC 2013 x86_64 
> x86_64 x86_64 GNU/Linux
>            Reporter: JamesLi
>            Priority: Critical
>         Attachments: 5671.patch
>
>
> lsof -i TCP:1004 | grep -c CLOSE_WAIT
> 18235
> When an HBase RegionServer requests a file's block from DataNode:1004 and the 
> request fails with "java.io.IOException: Got error for OP_READ_BLOCK,Block 
> token is expired.", the TCP socket that the RegionServer was using is not 
> closed.
> I think the problem is in DatanodeInfo blockSeekTo(long target) of 
> class DFSInputStream.
> The connection the RegionServer uses is a BlockReader: 
>         blockReader = getBlockReader(targetAddr, chosenNode, src, blk,
>             accessToken, offsetIntoBlock, blk.getNumBytes() - offsetIntoBlock,
>             buffersize, verifyChecksum, dfsClient.clientName);
> If this connection fails, the RegionServer fetches a new access token, but 
> the old connection is not closed here. 
> I think a small piece of code is needed to close the old connection when the 
> exception happens:
>       if (blockReader != null) {
>         try {
>           blockReader.close();
>           blockReader = null;
>         } catch (IOException exc) {
>           DFSClient.LOG.error("Close connection to " + targetAddr + " failed");
>         }
>       }
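
The close-before-retry idea described above can be tried out in isolation. 
The following is a minimal, self-contained sketch, not the actual HDFS code: 
FakeReader, open() and seekWithRetry() are hypothetical stand-ins for 
BlockReader, getBlockReader() and blockSeekTo(). It assumes the failure 
surfaces after the reader (and its socket) already exists, which may not match 
exactly where the exception is thrown in the real client.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class CloseOnRetrySketch {

    /** Hypothetical stand-in for a block reader that owns a socket. */
    static class FakeReader implements Closeable {
        final boolean tokenExpired;
        boolean closed = false;

        FakeReader(boolean tokenExpired) {
            this.tokenExpired = tokenExpired;
        }

        /** Simulates the DataNode rejecting an expired block token. */
        void readFirstPacket() throws IOException {
            if (tokenExpired) {
                throw new IOException("Block token is expired.");
            }
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Creates a reader and records it so the caller can inspect it later. */
    static FakeReader open(boolean tokenExpired, List<FakeReader> opened) {
        FakeReader reader = new FakeReader(tokenExpired);
        opened.add(reader);
        return reader;
    }

    /** Tries once with a stale token, then once more with a fresh one. */
    static FakeReader seekWithRetry(List<FakeReader> opened) throws IOException {
        boolean tokenExpired = true; // the first attempt uses the stale token
        for (int attempt = 0; attempt < 2; attempt++) {
            FakeReader reader = open(tokenExpired, opened);
            try {
                reader.readFirstPacket();
                return reader; // success: the caller now owns the reader
            } catch (IOException e) {
                reader.close();       // release the failed connection first
                tokenExpired = false; // stands in for fetching a new token
            }
        }
        throw new IOException("Could not open a reader after retrying");
    }

    public static void main(String[] args) throws IOException {
        List<FakeReader> opened = new ArrayList<FakeReader>();
        FakeReader ok = seekWithRetry(opened);
        System.out.println("opened=" + opened.size()
            + " firstClosed=" + opened.get(0).closed
            + " secondClosed=" + ok.closed);
    }
}
```

In this sketch the reader that failed is closed before the retry, while the 
reader returned to the caller stays open. Without the close() call in the 
catch block, the first reader would be left open, which is the analogue of the 
sockets stuck in CLOSE_WAIT.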



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
