[ https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035644#comment-16035644 ]
Mingliang Liu commented on HADOOP-14478:
----------------------------------------

+1, pending Jenkins. I will commit next Monday if there is no further input.

> Optimize NativeAzureFsInputStream for positional reads
> ------------------------------------------------------
>
>                 Key: HADOOP-14478
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14478
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch
>
>
> Azure's {{BlobInputStream}} internally buffers 4 MB of data regardless of
> the length requested. This is beneficial for sequential reads.
> However, for positional reads (seek to a specific location, read x
> bytes, seek back to the original location) it may not be beneficial and might
> even download a lot more data that is never used.
> It would be good to override {{readFully(long position, byte[] buffer, int
> offset, int length)}} for {{NativeAzureFsInputStream}} and make use of
> {{mark(readLimit)}} as a hint to Azure's {{BlobInputStream}}.
> BlobInputStream reference:
> https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448
> {{BlobInputStream}} can later treat this as a hint when deciding how much
> data to read ahead. Changes to {{BlobInputStream}} will not be addressed in
> this JIRA.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
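
For readers following the issue, the positional-read pattern it describes (remember the current position, seek, pass the requested length to mark(readLimit) as a hint, read fully, seek back) can be sketched roughly as below. This is only an illustration, not the attached patch: FakeBlobStream is a hypothetical in-memory stand-in for Azure's BlobInputStream, PositionalReader is a hypothetical stand-in for NativeAzureFsInputStream, and the exact ordering of seek and mark in the real change may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class PositionalReadSketch {

    /** Hypothetical stand-in for BlobInputStream: seekable, and records the mark() hint. */
    public static class FakeBlobStream extends ByteArrayInputStream {
        public int lastReadLimitHint = -1;

        public FakeBlobStream(byte[] data) {
            super(data);
        }

        @Override
        public synchronized void mark(int readLimit) {
            // A real stream could use readLimit to bound how much it reads ahead.
            lastReadLimitHint = readLimit;
            super.mark(readLimit);
        }

        public synchronized void seek(long position) throws IOException {
            if (position < 0 || position > count) {
                throw new EOFException("Cannot seek to " + position);
            }
            pos = (int) position; // 'pos' and 'count' are protected fields of ByteArrayInputStream
        }

        public synchronized long getPos() {
            return pos;
        }
    }

    /** Hypothetical wrapper showing the overridden positional readFully. */
    public static class PositionalReader {
        private final FakeBlobStream in;

        public PositionalReader(FakeBlobStream in) {
            this.in = in;
        }

        public synchronized void readFully(long position, byte[] buffer, int offset, int length)
                throws IOException {
            if (offset < 0 || length < 0 || length > buffer.length - offset) {
                throw new IndexOutOfBoundsException("Invalid offset/length");
            }
            long oldPos = in.getPos();
            try {
                in.seek(position);
                in.mark(length); // hint: only 'length' bytes are needed from here
                int done = 0;
                while (done < length) {
                    int n = in.read(buffer, offset + done, length - done);
                    if (n < 0) {
                        throw new EOFException("End of stream before reading " + length + " bytes");
                    }
                    done += n;
                }
            } finally {
                in.seek(oldPos); // restore the position used by sequential reads
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);
        FakeBlobStream stream = new FakeBlobStream(data);
        stream.seek(3);

        byte[] buffer = new byte[4];
        new PositionalReader(stream).readFully(10, buffer, 0, 4);

        System.out.println(new String(buffer, StandardCharsets.US_ASCII));
        System.out.println("position after read: " + stream.getPos());
        System.out.println("read-limit hint: " + stream.lastReadLimitHint);
    }
}
```

Restoring the old position in a finally block keeps later sequential reads unaffected even if the positional read fails partway through.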