[ 
https://issues.apache.org/jira/browse/HAWQ-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101774#comment-15101774
 ] 

Ruilong Huo commented on HAWQ-348:
----------------------------------

Here is the initial analysis for the 20x performance difference in the example:

1. About 9x performance downgrade since (table) function is preprocessed at 
planning phase (on 8 virtual segment HAWQ cluster)
{noformat}
Let t_seg be the time that all 8 QEs complete the scan in "select * from t" 
with in the (table) function
Let t_mas be the time that QD get the result of (table) function from all 8 QEs 
through motion

We can get: t_mas is approximately 8 * t_seg

Given the time of the aggregation (sum) on QD is trivial since it can be done 
along the receiving (table) function result from motion

The total time to complete the query is: t_tot = t_seg + t_mas = 9 * t_seg
{noformat}

2. 2x performance downgrade since we are doing two phase planning in Apache 
HAWQ. This issue has been tracked by HAWQ-198.
{noformat}
Phase 1: get initial plan using planner to estimate the cost of the query
Phase 2: calculate datalocality and allocate resource based on the cost of the 
query
Phase 3: get final plan for execution using planner/optimizer based on the 
resource actually allocated in Phase 2

Here we actually execute the (table) function twice since it is done in 
planning phase and we have planning phase in both Phase 1 and Phase 3.
{noformat}

>From 1, 2, we can estimate the total performance difference is 9x * 2x = 18x. 
>It is very close to the performance data (20x) we get from the example.

> Optimizer (ORCA/Planner) should not preprocess (table) functions at planning 
> phase
> ----------------------------------------------------------------------------------
>
>                 Key: HAWQ-348
>                 URL: https://issues.apache.org/jira/browse/HAWQ-348
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Optimizer
>            Reporter: Ruilong Huo
>            Assignee: Amr El-Helw
>
> Optimizer (ORCA/Planner) currently preprocesses (table) functions (either in 
> target list or from clause) at planning phase. This introduces:
> 1. Much lower performance since the result of the (table) function is 
> motioned to QD from QEs after the preprocessing, and it is further processed 
> there at QD, especially the result is large. In this case, QD does heavy 
> workload and becomes the bottleneck. It shows about 20x performance 
> difference in below example.
> 2. Much more memory overhead at QD since it needs to hold the result of the 
> (table) function. This is risky since the result might be unpredictably large.
> Here are the steps to reproduce this issue, as well as some initial analysis:
> Step 1: Prepare schema and data
> {noformat}
> CREATE TABLE t (id INT);
> CREATE TABLE
> INSERT INTO t SELECT generate_series(1, 100000000);
> INSERT 0 100000000
> CREATE OR REPLACE FUNCTION get_t()
> RETURNS SETOF t
> LANGUAGE SQL AS
> 'SELECT * FROM t'
> STABLE;
> CREATE FUNCTION
> {noformat}
> 2. With optimizer = OFF (Planner)
> {noformat}
> SET optimizer='OFF';
> SET
> select sum(id) from t;
>        sum
> ------------------
>  5000000050000000
> (1 row)
> Time: 8801.577 ms
> select sum(id) from get_t();
>        sum
> ------------------
>  5000000050000000
> (1 row)
> Time: 189992.273 ms
> EXPLAIN SELECT sum(id) FROM get_t();
>                              QUERY PLAN
> --------------------------------------------------------------------
>  Aggregate  (cost=32.50..32.51 rows=1 width=8)
>    ->  Function Scan on get_t  (cost=0.00..12.50 rows=8000 width=4)
>  Settings:  default_segment_num=8; optimizer=off
>  Optimizer status: legacy query optimizer
> (4 rows)
> {noformat}
> 3. With optimizer = ON (ORCA)
> {noformat}
> SET optimizer='ON';
> SET
> select sum(id) from t;
>        sum
> ------------------
>  5000000050000000
> (1 row)
> Time: 10103.436 ms
> select sum(id) from get_t();
>        sum
> ------------------
>  5000000050000000
> (1 row)
> Time: 195551.740 ms
> EXPLAIN SELECT sum(id) FROM get_t();
>                              QUERY PLAN
> --------------------------------------------------------------------
>  Aggregate  (cost=32.50..32.51 rows=1 width=8)
>    ->  Function Scan on get_t  (cost=0.00..12.50 rows=8000 width=4)
>  Settings:  default_segment_num=8
>  Optimizer status: legacy query optimizer
> (4 rows)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to