[ 
https://issues.apache.org/jira/browse/IMPALA-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-6458:
----------------------------------
    Component/s: Frontend

> Create scan ranges in the coordinator
> -------------------------------------
>
>                 Key: IMPALA-6458
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6458
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 2.11.0
>            Reporter: Vuk Ercegovac
>            Priority: Major
>
> Currently, scan ranges are generated in the frontend planner and sent to the
> coordinator (for assignment, evaluation, etc.).
> For example, HDFSScanNode transforms all file blocks into TScanRanges that
> are associated with a query execution request that is sent to the
> coordinator. The file blocks are represented using FlatBuffers and the
> resulting scan ranges are represented more verbosely using Thrift. As a
> result, we're using more memory and time than needed to serialize the scan
> ranges from the frontend to the coordinators. Instead, the block information
> should be sent over to the coordinators as-is, and each coordinator should
> expand it into the scan ranges that the backend executors process. A simpler
> example shows up in IMPALA-5931, where scan ranges are synthesized in the
> coordinator for files that do not have block information (S3, ADLS, Local).
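
As a rough illustration of the proposal, the sketch below expands compact block descriptors into scan ranges on the receiving side. It is not Impala code: FileBlock, ScanRange, expand_blocks, and the max_range_len parameter are hypothetical names, and splitting by a fixed maximum length is an assumed expansion policy rather than something the issue specifies.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class FileBlock:
    """Compact block descriptor (hypothetical stand-in for the FlatBuffers form)."""
    path: str
    offset: int
    length: int


@dataclass(frozen=True)
class ScanRange:
    """Expanded unit of work for an executor (hypothetical stand-in for TScanRange)."""
    path: str
    offset: int
    length: int


def expand_blocks(blocks: List[FileBlock], max_range_len: int) -> Iterator[ScanRange]:
    """Expand each block into scan ranges of at most max_range_len bytes."""
    if max_range_len <= 0:
        raise ValueError("max_range_len must be positive")
    for block in blocks:
        pos = block.offset
        end = block.offset + block.length
        while pos < end:
            length = min(max_range_len, end - pos)
            yield ScanRange(block.path, pos, length)
            pos += length


# Only the compact descriptors cross the wire; the expansion happens afterwards.
blocks = [FileBlock("/t/a.parq", 0, 300), FileBlock("/t/b.parq", 0, 100)]
ranges = list(expand_blocks(blocks, max_range_len=128))
```

The point of the sketch is only that the sender ships one small record per block, while the larger list of ranges is materialized where it is consumed.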



