[
https://issues.apache.org/jira/browse/IMPALA-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Ho reassigned IMPALA-4475:
----------------------------------
Assignee: (was: Vuk Ercegovac)
> Compress ExecPlanFragment before shipping it to worker nodes to reduce
> network traffic
> --------------------------------------------------------------------------------------
>
> Key: IMPALA-4475
> URL: https://issues.apache.org/jira/browse/IMPALA-4475
> Project: IMPALA
> Issue Type: New Feature
> Components: Distributed Exec
> Affects Versions: Impala 2.6.0
> Reporter: Mostafa Mokhtar
> Priority: Major
> Labels: ramp-up, scalability
> Attachments: count_store_returns.txt.zip,
> slow_query_start_250K_partitions_134nodes.txt
>
>
> Sending the ExecPlanFragment to remote nodes dominates query startup time
> on clusters larger than 100 nodes. The size of the ExecPlanFragment grows with
> the number of tables and with the number of blocks and partitions in each table.
> On large clusters this limits query throughput.
> Query timeline from TPC-DS Q11 on a 1K-node cluster:
> {code}
> Query Timeline: 5m6s
> - Query submitted: 75.256us (75.256us)
> - Planning finished: 1s580ms (1s580ms)
> - Submit for admission: 2s376ms (795.652ms)
> - Completed admission: 2s377ms (1.512ms)
> - Ready to start 15993 fragment instances: 2s458ms (80.378ms)
> - First dynamic filter received: 2m35s (2m33s)
> - All 15993 fragment instances started: 2m35s (40.934ms)
> - Rows available: 4m53s (2m17s)
> - First row fetched: 4m53s (176.254ms)
> - Unregister query: 4m58s (4s828ms)
> - ComputeScanRangeAssignmentTimer: 600.086ms
> {code}
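
The proposal in this issue, compressing the serialized plan before it is sent to each worker, can be illustrated with a minimal sketch. This is not Impala code: the real ExecPlanFragment is a Thrift structure, whereas the payload below is a JSON stand-in, and the names build_fake_plan, compress_plan and decompress_plan are hypothetical. It only shows that a payload dominated by repetitive per-partition metadata shrinks under a general-purpose compressor such as zlib and is restored byte for byte on the receiving side.

```python
import json
import zlib


def build_fake_plan(num_partitions: int) -> bytes:
    """Build a stand-in payload (assumption: JSON, not Impala's Thrift format)."""
    partitions = [
        {
            "id": i,
            # Hypothetical path; real partition locations will differ.
            "location": f"hdfs://nameservice/warehouse/store_returns/part={i}",
            "file_format": "PARQUET",
        }
        for i in range(num_partitions)
    ]
    plan = {"fragment": "F00", "partitions": partitions}
    return json.dumps(plan).encode("utf-8")


def compress_plan(serialized: bytes, level: int = 6) -> bytes:
    """Compress the serialized plan once, before fanning it out to workers."""
    return zlib.compress(serialized, level)


def decompress_plan(blob: bytes) -> bytes:
    """Restore the original serialized plan on the receiving node."""
    return zlib.decompress(blob)


if __name__ == "__main__":
    raw = build_fake_plan(10_000)
    packed = compress_plan(raw)
    # The round trip must be lossless for the worker to deserialize the plan.
    assert decompress_plan(packed) == raw
    print(f"raw={len(raw)} bytes, compressed={len(packed)} bytes")
```

In this sketch the payload is compressed once and the same bytes could be reused for every recipient, so the CPU cost would not scale with the number of nodes. Whether that holds for Impala depends on how much of the plan is shared across fragment instances, which this issue does not specify.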