[ 
https://issues.apache.org/jira/browse/HADOOP-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18010923#comment-18010923
 ] 

ASF GitHub Bot commented on HADOOP-19394:
-----------------------------------------

ahmarsuhail commented on code in PR #7720:
URL: https://github.com/apache/hadoop/pull/7720#discussion_r2242760600


##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractAnalyticsStreamVectoredRead.java:
##########
@@ -63,12 +75,35 @@ protected Configuration createConfiguration() {
     // This issue is tracked in:
     // https://github.com/awslabs/analytics-accelerator-s3/issues/218
     skipForAnyEncryptionExceptSSES3(conf);
-    conf.set("fs.contract.vector-io-early-eof-check", "false");

Review Comment:
   Yeah, before we implemented this we were using the base implementation, 
which does not do any early checks. Now it behaves the same as the current 
implementation, which validates the ranges before doing any reads and 
throws an EOF for any range that extends past the file size.
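
For illustration only, here is a minimal sketch of the kind of early check described above. It is not the S3A or analytics-accelerator code: the class `EarlyEofCheckSketch`, the `Range` type, and `validateRanges` are hypothetical names, and the exact conditions the real implementation checks may differ.

```java
import java.io.EOFException;
import java.util.List;

/** Hypothetical illustration; not the actual S3A implementation. */
public class EarlyEofCheckSketch {

  /** Simplified stand-in for a vectored read range: offset plus length. */
  public static final class Range {
    final long offset;
    final int length;

    public Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /**
   * Validate every range up front, before any read is issued.
   * Assumes fileLength is non-negative.
   *
   * @throws EOFException if any range extends past the end of the file
   */
  public static void validateRanges(List<Range> ranges, long fileLength)
      throws EOFException {
    for (Range r : ranges) {
      if (r.offset < 0 || r.length < 0) {
        throw new IllegalArgumentException(
            "Invalid range: offset=" + r.offset + ", length=" + r.length);
      }
      // Written as a subtraction so offset + length cannot overflow.
      if (r.offset > fileLength - r.length) {
        throw new EOFException(
            "Range [" + r.offset + ", +" + r.length
                + ") extends past file length " + fileLength);
      }
    }
  }
}
```

With a check like this, a request containing one out-of-bounds range fails as a whole before any data is fetched, rather than failing partway through the reads.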





> S3A Analytics Accelerator: vector IO support
> --------------------------------------------
>
>                 Key: HADOOP-19394
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19394
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.1
>            Reporter: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>
> Add vector IO support for analytics accelerator stream
> Three stages
> # pull up s3a input stream to work with all ObjectInputStreams; do its own 
> fetching independent of the analytics stream
> # provide info to stream of fetches having taken place (remove from cache, 
> cancel prefetch)
> # full integration
> * return a range from cache if present
> * append to the block retrieval callback if a prefetch is in progress
> * only do merge + new request if the range cannot be satisfied entirely from 
> cached data
> Out of scope: handling case where part of a range is in cache/retrieval. Too 
> complicated and so prone to problems. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
