[ 
https://issues.apache.org/jira/browse/HDFS-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-9146:
--------------------------------------
    Component/s:     (was: HDFS)
                 hdfs-client

> HDFS forward seek() within a block shouldn't spawn new TCP 
> Peer/RemoteBlockReader
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-9146
>                 URL: https://issues.apache.org/jira/browse/HDFS-9146
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.6.0, 2.8.0, 2.7.1
>            Reporter: Gopal V
>
> When a remote DFS client issues a seek() followed by a forward readFully(), 
> HDFS opens a new remote block reader even if the seek target lies within the 
> same HDFS block.
> (analysis from [~rajesh.balamohan])
> This happens because a plain read assumes that the caller will read through 
> to the end of the block.
> {code}
>       try {
>         blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>             targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>             storageType, chosenNode);
> {code}
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java#L624
> Because the caller has not read to the end of the block by the time the next 
> seek happens, the BlockReader treats the read as aborted and discards the 
> TCP peer it holds.
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java#L324
> {code}
>     // If we've now satisfied the whole client read, read one last packet
>     // header, which should be empty
>     if (bytesNeededToFinish <= 0) {
>       readTrailingEmptyPacket(); 
>      ...
>           sendReadResult(Status.SUCCESS);
> {code}
> Because that condition is not met, sentStatusCode is never set, so the peer 
> is closed instead of being returned to the cache.
> {code}
>     if (peerCache != null && sentStatusCode) {
>       peerCache.put(datanodeID, peer);
>     } else {
>       peer.close();
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
