[ https://issues.apache.org/jira/browse/HADOOP-18184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743209#comment-17743209 ]
ASF GitHub Bot commented on HADOOP-18184:
-----------------------------------------

steveloughran commented on PR #5832:
URL: https://github.com/apache/hadoop/pull/5832#issuecomment-1636071189

   HADOOP-18184. S3A prefetch unbuffer

   * Lots of statistics collection, used in tests.
   * s3a prefetch tests all moved to the prefetch package
   * and split into caching stream and large files tests
   * large files and LRU tests are scale tests
   * and testRandomReadLargeFile uses a small block size to reduce read overhead
   * new hadoop common org.apache.hadoop.test.Sizes class with predefined
     sizes (from azure; not moved existing code to it yet)

   Overall, the prefetch reads of the large files are slow; while it's critical
   to test multi-block files, we don't need to work on the landsat csv file.

   Better: one of the huge tests uses it, with a small block size of 1 MB to
   force lots of work.

> s3a prefetching stream to support unbuffer()
> --------------------------------------------
>
>                 Key: HADOOP-18184
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18184
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>              Labels: pull-request-available
>
> Apache Impala uses unbuffer() to free up all client-side resources held by a
> stream, allowing it to keep a map of available (path -> stream) objects,
> retained across queries.
> This saves having to reopen the files, with the attendant cost of HEAD
> checks etc.
> S3AInputStream just closes its HTTP connection. Here there is a lot more
> state to discard, but all memory and file storage must be freed.
> Until this is done, ITestS3AContractUnbuffer must skip when the prefetch
> stream is used.
> It's notable that the other tests don't fail, even though the stream doesn't
> implement the interface; the graceful degradation handles that. It should
> fail if the test XML resource says the stream supports it but the stream
> capabilities say it doesn't.
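
The two behaviours described in the issue, graceful degradation when a stream cannot unbuffer and a stricter check that fails when the declared contract and the stream capabilities disagree, can be sketched roughly as below. This is a minimal, dependency-free illustration and not the S3A implementation: `Unbufferable`, `CapabilityProbe`, `BufferingStream`, `PlainStream`, `unbufferIfSupported`, `verifyContract`, the capability string and the example path are all hypothetical stand-ins chosen for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

class UnbufferSketch {

  /** Hypothetical stand-in for an "I can release my buffers" interface. */
  interface Unbufferable {
    void unbuffer();
  }

  /** Hypothetical stand-in for a stream capability probe. */
  interface CapabilityProbe {
    boolean hasCapability(String capability);
  }

  /** Capability name used only for this illustration. */
  static final String UNBUFFER = "in:unbuffer";

  /** A stream that holds client-side state and frees it on unbuffer(). */
  static class BufferingStream implements Unbufferable, CapabilityProbe {
    private byte[] buffer;

    void read() {
      if (buffer == null) {
        buffer = new byte[1024]; // simulate prefetched data held in memory
      }
    }

    @Override
    public void unbuffer() {
      buffer = null; // discard state; the stream object itself stays usable
    }

    @Override
    public boolean hasCapability(String capability) {
      return UNBUFFER.equals(capability);
    }

    boolean holdsBuffer() {
      return buffer != null;
    }
  }

  /** A stream with no unbuffer support at all. */
  static class PlainStream implements CapabilityProbe {
    @Override
    public boolean hasCapability(String capability) {
      return false;
    }
  }

  /** Graceful degradation: unbuffer when supported, otherwise do nothing. */
  static boolean unbufferIfSupported(Object stream) {
    if (stream instanceof Unbufferable) {
      ((Unbufferable) stream).unbuffer();
      return true;
    }
    return false;
  }

  /** Stricter check: fail if the contract claims support the stream lacks. */
  static void verifyContract(boolean contractSaysSupported, CapabilityProbe stream) {
    if (contractSaysSupported && !stream.hasCapability(UNBUFFER)) {
      throw new AssertionError(
          "contract declares unbuffer support but stream capabilities do not");
    }
  }

  public static void main(String[] args) {
    // A (path -> stream) map retained across queries, as in the Impala use case.
    Map<String, BufferingStream> cache = new HashMap<>();
    BufferingStream stream = new BufferingStream();
    cache.put("s3a://example-bucket/data.csv", stream); // hypothetical path

    stream.read();
    unbufferIfSupported(stream);
    System.out.println("buffer held after unbuffer: " + stream.holdsBuffer());
  }
}
```

The silent no-op in `unbufferIfSupported` is why tests keep passing against a stream that lacks the interface; only a check along the lines of `verifyContract` would surface the mismatch.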
--
This message was sent by Atlassian Jira
(v8.20.10#820010)