[
https://issues.apache.org/jira/browse/IMPALA-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong updated IMPALA-6458:
----------------------------------
Component/s: Frontend
> Create scan ranges in the coordinator
> -------------------------------------
>
> Key: IMPALA-6458
> URL: https://issues.apache.org/jira/browse/IMPALA-6458
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 2.11.0
> Reporter: Vuk Ercegovac
> Priority: Major
>
> Currently, scan ranges are generated in the frontend planner and sent to the
> coordinator (for assignment, evaluation, etc.).
> For example, HDFSScanNode transforms all file blocks into TScanRanges that
> are associated with a query execution request that is sent to the
> coordinator. The file blocks are represented using FlatBuffers and the
> resulting scan ranges are represented more verbosely using Thrift. As a
> result, we use more memory and time than needed to serialize the scan ranges
> from the frontend to the coordinators. Instead, the block information
> should be sent to the coordinators as-is, and the coordinator should expand
> it into the scan ranges that the backend executors process. A simpler
> example shows up in IMPALA-5931, where scan ranges are synthesized in the
> coordinator for files that do not have block information (S3, ADLS, Local).
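
A rough sketch of the proposed expansion step, for illustration only. It is
written in Python rather than Impala's own languages, and every name in it
(FileBlock, ScanRange, expand_blocks, max_scan_range_length) is hypothetical
and not taken from the Impala code base. It assumes the coordinator receives
compact block descriptors and turns each one into one or more scan ranges,
optionally capped at a maximum length.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FileBlock:
    """Compact block descriptor as shipped from the frontend (hypothetical)."""
    path: str
    offset: int
    length: int
    replica_hosts: Tuple[str, ...]


@dataclass(frozen=True)
class ScanRange:
    """Expanded unit of work handed to a backend executor (hypothetical)."""
    path: str
    offset: int
    length: int
    replica_hosts: Tuple[str, ...]


def expand_blocks(blocks: List[FileBlock],
                  max_scan_range_length: int = 0) -> List[ScanRange]:
    """Expand block descriptors into scan ranges on the coordinator side.

    A max_scan_range_length of 0 or less means "one scan range per block".
    """
    ranges: List[ScanRange] = []
    for block in blocks:
        step = max_scan_range_length if max_scan_range_length > 0 else block.length
        pos = block.offset
        end = block.offset + block.length
        # Zero-length blocks produce no scan ranges because pos == end.
        while pos < end:
            length = min(step, end - pos)
            ranges.append(ScanRange(block.path, pos, length, block.replica_hosts))
            pos += length
    return ranges
```

The point of the sketch is only that the verbose per-range structures are
built after the transfer, so the frontend ships one small descriptor per
block instead of one larger structure per scan range.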