[
https://issues.apache.org/jira/browse/IMPALA-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Rodoni updated IMPALA-8490:
--------------------------------
Labels: in_33 (was: future_release_doc in_33)
> Impala Doc: the file handle cache now supports S3
> -------------------------------------------------
>
> Key: IMPALA-8490
> URL: https://issues.apache.org/jira/browse/IMPALA-8490
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Sahil Takiar
> Assignee: Alex Rodoni
> Priority: Major
> Labels: in_33
> Fix For: Impala 3.3.0
>
>
> https://impala.apache.org/docs/build/html/topics/impala_scalability.html
> states:
> {quote}
> Because this feature only involves HDFS data files, it does not apply to
> non-HDFS tables, such as Kudu or HBase tables, or tables that store their
> data on cloud services such as S3 or ADLS.
> {quote}
> This section should be updated because the file handle cache now supports S3
> files.
> We should add a section to the docs similar to what we added when support for
> remote HDFS files was added to the file handle cache:
> {quote}
> In Impala 3.2 and higher, file handle caching also applies to remote HDFS
> file handles. This is controlled by the cache_remote_file_handles flag for an
> impalad. It is recommended that you use the default value of true as this
> caching prevents your NameNode from overloading when your cluster has many
> remote HDFS reads.
> {quote}
> Like {{cache_remote_file_handles}}, the flag {{cache_s3_file_handles}} has
> been added as an impalad startup option (the flag is enabled by default).
> Unlike HDFS, though, S3 has no NameNode; the benefit here is that caching
> eliminates a call to {{getFileStatus()}} on the target S3 file. So "prevents
> your NameNode from overloading when your cluster has many remote HDFS reads"
> should be changed to something like "avoids an unnecessary call to
> S3AFileSystem#getFileStatus(), which reduces the number of API calls made to
> S3."
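
A minimal sketch of how the two startup options discussed above might appear in an impalad flag file. The flag names and their default of true come from this issue; the flag-file layout (one --flag=value per line, # for comments) is the usual gflags convention and is an assumption about how a given deployment passes options to impalad.

```
# Hypothetical impalad flag file fragment (file name and location are deployment-specific).
# Cache file handles for remote HDFS reads (Impala 3.2 and higher; default: true).
--cache_remote_file_handles=true
# Cache file handles for S3 reads (Fix For: Impala 3.3.0 per this issue; default: true).
--cache_s3_file_handles=true
```

Because both flags default to true, setting them explicitly should only matter when turning caching off, for example with --cache_s3_file_handles=false.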