[jira] [Updated] (HDFS-9146) HDFS forward seek() within a block shouldn't spawn new TCP Peer/RemoteBlockReader
[ https://issues.apache.org/jira/browse/HDFS-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated HDFS-9146:
--------------------------
    Affects Version/s: 3.0.0

> HDFS forward seek() within a block shouldn't spawn new TCP Peer/RemoteBlockReader
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-9146
>                 URL: https://issues.apache.org/jira/browse/HDFS-9146
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.6.0, 2.8.0, 2.7.1, 3.0.0
>            Reporter: Gopal V
>
> When a seek() followed by a forward readFully() is triggered from a remote dfsclient, HDFS opens a new remote block reader even when the seek stays within the same HDFS block.
>
> (analysis from [~rajesh.balamohan])
>
> This happens because a simple read operation assumes the user is going to read to the end of the block:
>
> {code}
> try {
>   blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>       targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>       storageType, chosenNode);
> {code}
>
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java#L624
>
> Since the user has not read to the end of the block when the next seek happens, the BlockReader treats this as an aborted read and throws away the TCP peer it holds.
>
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java#L324
>
> {code}
> // If we've now satisfied the whole client read, read one last packet
> // header, which should be empty
> if (bytesNeededToFinish <= 0) {
>   readTrailingEmptyPacket();
>   ...
>   sendReadResult(Status.SUCCESS);
> {code}
>
> Since that condition is never satisfied, the status code is never sent and the peer is not returned to the cache:
>
> {code}
> if (peerCache != null && sentStatusCode) {
>   peerCache.put(datanodeID, peer);
> } else {
>   peer.close();
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
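The interaction described above can be sketched as a small, self-contained simulation. This is not the real Hadoop code — `Peer`, `BlockReader`, and the string-keyed cache below are toy stand-ins for RemoteBlockReader2's `bytesNeededToFinish` / `sentStatusCode` bookkeeping, showing why an early close (from a forward seek) discards the TCP peer while a full read recycles it:

```java
import java.util.HashMap;
import java.util.Map;

public class PeerCacheSketch {

    /** Toy stand-in for a TCP peer to a DataNode. */
    static class Peer {
        boolean closed = false;
        void close() { closed = true; }
    }

    /** Toy stand-in for RemoteBlockReader2's end-of-read bookkeeping. */
    static class BlockReader {
        long bytesNeededToFinish;       // bytes left of the requested range
        boolean sentStatusCode = false; // set only after a clean finish
        final Peer peer;

        BlockReader(long requestedLen, Peer peer) {
            this.bytesNeededToFinish = requestedLen;
            this.peer = peer;
        }

        void read(long n) {
            bytesNeededToFinish -= n;
            // Mirrors the quoted snippet: SUCCESS is only sent once the
            // client has drained the whole requested range.
            if (bytesNeededToFinish <= 0) {
                sentStatusCode = true;  // sendReadResult(Status.SUCCESS)
            }
        }

        /** Mirrors the close path: cache the peer only on a clean finish. */
        void close(Map<String, Peer> peerCache, String datanodeID) {
            if (peerCache != null && sentStatusCode) {
                peerCache.put(datanodeID, peer);
            } else {
                peer.close();
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Peer> cache = new HashMap<>();

        // Case 1: the reader is opened for the rest of the block, but a
        // forward seek forces a close after only part of it is read.
        Peer p1 = new Peer();
        BlockReader partial = new BlockReader(1024, p1);
        partial.read(100);
        partial.close(cache, "dn1");
        System.out.println("partial read -> peer closed: " + p1.closed
            + ", cached: " + cache.containsKey("dn1"));

        // Case 2: the client reads the full requested range.
        Peer p2 = new Peer();
        BlockReader full = new BlockReader(1024, p2);
        full.read(1024);
        full.close(cache, "dn2");
        System.out.println("full read -> peer closed: " + p2.closed
            + ", cached: " + cache.containsKey("dn2"));
    }
}
```

In the partial-read case the peer is closed and never cached, which is exactly why each forward seek within a block pays for a fresh TCP connection.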
[jira] [Updated] (HDFS-9146) HDFS forward seek() within a block shouldn't spawn new TCP Peer/RemoteBlockReader
[ https://issues.apache.org/jira/browse/HDFS-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo Nicholas Sze updated HDFS-9146:
--------------------------------------
    Component/s:     (was: HDFS)
                     hdfs-client

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)