[ https://issues.apache.org/jira/browse/IMPALA-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vuk Ercegovac closed IMPALA-5931.
---------------------------------
Resolution: Fixed
Fix Version/s: Impala 3.1.0, Impala 2.13.0
> Don't synthesize block metadata in the catalog for S3/ADLS
> ----------------------------------------------------------
>
> Key: IMPALA-5931
> URL: https://issues.apache.org/jira/browse/IMPALA-5931
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Reporter: Dan Hecht
> Assignee: Vuk Ercegovac
> Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> Today, the catalog synthesizes block metadata for S3/ADLS by just breaking up
> splittable files into "blocks" with the FileSystem's default block size.
> Rather than carrying these blocks around in the catalog and distributing them
> to all impalads, we might as well generate the scan ranges on the fly during
> planning. That would save the memory and network bandwidth the blocks consume.
> That does mean that the planner will have to instantiate and call the
> filesystem to get the default block size, but for these FileSystems, that's
> just a matter of reading the config.
> Perhaps the same can be done for HDFS erasure coding, though that depends on
> what a block location actually means in that context and whether block
> locations contain useful info there.
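
The on-the-fly range generation described above can be sketched roughly as
follows. This is a minimal illustration only, not Impala's actual planner
code: the class ScanRangeSketch, the Range type, and the method
splitIntoRanges are hypothetical names, and blockSize stands in for the
filesystem's default block size as read from its configuration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Impala.
final class ScanRangeSketch {
    /** A contiguous byte range of a file: [offset, offset + length). */
    static final class Range {
        final long offset;
        final long length;

        Range(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Splits a splittable file of fileLength bytes into ranges of at most
     * blockSize bytes. Nothing is stored per block; the ranges are derived
     * from the file length and block size alone.
     */
    static List<Range> splitIntoRanges(long fileLength, long blockSize) {
        if (fileLength < 0) {
            throw new IllegalArgumentException("fileLength must be >= 0");
        }
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be > 0");
        }
        List<Range> ranges = new ArrayList<>();
        long offset = 0;
        while (offset < fileLength) {
            // The last range may be shorter than blockSize.
            long length = Math.min(blockSize, fileLength - offset);
            ranges.add(new Range(offset, length));
            offset += length;
        }
        return ranges;
    }
}
```

For example, splitIntoRanges(300, 128) yields three ranges: (0, 128),
(128, 128), and (256, 44). Because the result depends only on the file
length and the block size, the catalog would not need to carry or
distribute anything per block under this scheme.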