[jira] [Commented] (HADOOP-12444) Consider implementing lazy seek in S3AInputStream

Hadoop QA (JIRA) Thu, 15 Oct 2015 21:54:53 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960168#comment-14960168 ]


Hadoop QA commented on HADOOP-12444:
------------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 32s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear to include any new or modified tests.  Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:red}-1{color} | javac |   7m 50s | The patch appears to cause the build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12764570/HADOOP-12444.1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / cf23f2c |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/7833/console |


This message was automatically generated.

> Consider implementing lazy seek in S3AInputStream
> -------------------------------------------------
>
>                 Key: HADOOP-12444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12444
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.7.1
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: HADOOP-12444.1.patch, HADOOP-12444.WIP.patch
>
>
> - Currently, "read(long position, byte[] buffer, int offset, int length)" is 
> not implemented in S3AInputStream (unlike DFSInputStream). So 
> "readFully(long position, byte[] buffer, int offset, int length)" in 
> S3AInputStream falls back to the default implementation in FSInputStream, 
> which performs seek(), read(), seek().
> - However, seek() in S3AInputStream re-opens the connection to S3 
> every time 
> (https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L115).
> - It would be good to consider a lazy seek implementation to reduce 
> connection overhead to S3 (e.g., Presto implements lazy seek: 
> https://github.com/facebook/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/PrestoS3FileSystem.java#L623).
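
For illustration only, the lazy seek idea described above could look roughly like the sketch below. This is not the attached patch and not the actual S3AInputStream or Presto code. RangeOpener, LazySeekStream, inMemory() and the opens counter are hypothetical names made up for this example, and a byte array stands in for an S3 object. The point is that seek() only records the target offset, and a connection is (re)opened on the next read() only if the open stream is not already at that offset.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class LazySeekSketch {

    /** Hypothetical stand-in for "open a ranged GET starting at this offset". */
    interface RangeOpener {
        InputStream openAt(long offset) throws IOException;
    }

    /** In-memory opener so the sketch runs without S3 (assumption for the demo). */
    static RangeOpener inMemory(byte[] data) {
        return offset -> {
            int start = (int) Math.min(offset, data.length);
            return new ByteArrayInputStream(data, start, data.length - start);
        };
    }

    static class LazySeekStream extends InputStream {
        private final RangeOpener opener;
        private InputStream wrapped;  // stays null until the first read
        private long streamPos;       // offset the wrapped stream is currently at
        private long nextReadPos;     // offset requested by the last seek()
        int opens;                    // how many times a "connection" was opened

        LazySeekStream(RangeOpener opener) {
            this.opener = opener;
        }

        /** Lazy seek: remember the target, do no I/O. */
        void seek(long pos) throws IOException {
            if (pos < 0) {
                throw new IOException("Cannot seek to negative offset " + pos);
            }
            nextReadPos = pos;
        }

        long getPos() {
            return nextReadPos;
        }

        /** Reopen only when the open stream is not already at the target. */
        private void ensureOpenAtTarget() throws IOException {
            if (wrapped != null && streamPos == nextReadPos) {
                return;
            }
            if (wrapped != null) {
                wrapped.close();
            }
            wrapped = opener.openAt(nextReadPos);
            streamPos = nextReadPos;
            opens++;
        }

        @Override
        public int read() throws IOException {
            ensureOpenAtTarget();
            int b = wrapped.read();
            if (b >= 0) {
                streamPos++;
                nextReadPos++;
            }
            return b;
        }

        @Override
        public void close() throws IOException {
            if (wrapped != null) {
                wrapped.close();
                wrapped = null;
            }
        }
    }
}
```

With this shape, a seek(), read(), seek() sequence that ends up back at the position the stream has already reached costs no extra connection, and several consecutive seek() calls cost at most one reopen. A real implementation would probably also want to skip forward within the open stream for short forward seeks instead of reopening; that is left out here.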



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
