[ 
https://issues.apache.org/jira/browse/IMPALA-8428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206210#comment-17206210
 ] 

ASF subversion and git services commented on IMPALA-8428:
---------------------------------------------------------

Commit d09294a4a6da142bf1bed7cf9b769c07d6d7baa0 in impala's branch 
refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d09294a ]

IMPALA-10202: Enable file handle cache for ABFS files

Like IMPALA-8428, but for ABFS instead of S3A. Adds support for adding
ABFS file handles to the file handle cache. Support for ABFSInputStream
unbuffer operations was added in HADOOP-16859.

Ran a full table scan of a 1GB store_sales table on ABFS, made
sure the file handles were cached (validated via the runtime
profile); did this multiple times, against several different
copies of the store_sales table, in order to increase the number
of file handles cached by an impalad.

Tested:
* Tested against an ABFS storage account I have access to

Change-Id: I64f12f832980f4e0207af78368402dd09e370fc3
Reviewed-on: http://gerrit.cloudera.org:8080/16532
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Add support for caching file handles on s3
> ------------------------------------------
>
>                 Key: IMPALA-8428
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8428
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 3.3.0
>            Reporter: Joe McDonnell
>            Assignee: Sahil Takiar
>            Priority: Critical
>             Fix For: Impala 3.3.0
>
>
> The file handle cache is currently disabled for S3, as the S3 connector 
> needed to implement proper unbuffer support. Now that 
> https://issues.apache.org/jira/browse/HADOOP-14747 is fixed, Impala should 
> provide an option to cache S3 file handles.
> This is particularly important for data caching, as accessing the data cache 
> happens after obtaining a file handle. If getting a file handle is slow, the 
> caching will be less effective.
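
To illustrate why unbuffer support matters for caching file handles, here is a minimal sketch in Python. It is a toy model, not Impala's actual implementation; the names FakeHandle and FileHandleCache are hypothetical. The idea is that a released handle drops its buffers through unbuffer() but stays open, so a later scan of the same file can skip the slow open call.

```python
from collections import OrderedDict


class FakeHandle:
    """Hypothetical stand-in for a remote input stream (not a Hadoop class)."""

    def __init__(self, path):
        self.path = path
        self.buffered = False
        self.closed = False

    def read(self):
        # Reading makes the handle hold buffers (or a connection).
        self.buffered = True
        return f"data:{self.path}"

    def unbuffer(self):
        # Release buffers but keep the handle itself open and reusable.
        self.buffered = False

    def close(self):
        self.buffered = False
        self.closed = True


class FileHandleCache:
    """Toy LRU cache of idle file handles, keyed by path."""

    def __init__(self, capacity, opener=FakeHandle):
        self.capacity = capacity
        self.opener = opener
        self.idle = OrderedDict()  # path -> idle handle, oldest first
        self.hits = 0
        self.misses = 0

    def acquire(self, path):
        # Reuse an idle handle if one exists; otherwise pay for a new open.
        handle = self.idle.pop(path, None)
        if handle is not None:
            self.hits += 1
            return handle
        self.misses += 1
        return self.opener(path)

    def release(self, handle):
        # Unbuffer before caching so idle handles do not pin resources.
        handle.unbuffer()
        previous = self.idle.pop(handle.path, None)
        if previous is not None and previous is not handle:
            previous.close()
        self.idle[handle.path] = handle
        # Evict the least recently released handles beyond capacity.
        while len(self.idle) > self.capacity:
            _, evicted = self.idle.popitem(last=False)
            evicted.close()
```

In this sketch, a connector without a working unbuffer() would leave every cached handle holding its buffers or connection while idle, which is the reason given above for keeping the cache disabled until unbuffer support landed.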



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
