Hi, Jiali Yao,
Thanks for you reply.
Here is the detail information:
1. the segment configrations:
# select * from gp_segment_configuration ;
registration_order | role | status | port | hostname | address
--------------------+------+--------+-------+----------+------------
0 | m | u | 25432 | dserver1 | dserver1
1 | p | u | 40404 | dserver5 | 10.10.0.15
2 | p | u | 40404 | dserver3 | 10.10.0.13
3 | p | u | 40404 | dserver1 | 10.10.0.11
4 | p | u | 40404 | dserver4 | 10.10.0.14
5 | p | u | 40404 | dserver2 | 10.10.0.12
(6 rows)
2. The "explain analyze" about the query, see the attachement.
3. No, this query was tested *without YARN*.
Thanks
On Fri, Nov 27, 2015 at 4:59 PM, Jiali Yao <[email protected]> wrote:
> Hi Leon
>
> Thanks for providing it. The result is not as we expected. In our
> performance test, we found the performance is comparable with 1.3.
> Could you please some more information:
> 1. Get segment configuration information from 1.3 and 2.0
> select * from gp_segment_configuration ;
> 2. Could you please run "explain analyze" to get more statistic
> information?
> 3. Want to confirm with you: The result run in yarn mode ,right? Also I see
> your previous email to indicate there is some error in yarn, these query is
> also from that test round, right?
>
> Thanks
>
> Jiali
>
> On Fri, Nov 27, 2015 at 3:43 PM, Leon Zhang <[email protected]> wrote:
>
> > Hi, HAWQ Developers:
> >
> > As my previous email hint, I run TPC-DS test on our development.
> > Comparing with previous version 1.3.x, we can see the performance
> > improvement on most of queries.
> >
> > But the problem is performance reduction for *some* queries. For
> > example, the query64, the running time increase from 10754.688 ms
> > to 68884.731 ms . I am not sure if any changes were made that increase
> the
> > running time?
> >
> > In order to discuss the detail about this issue, I would like use the
> > query10. The running time increase from 1795.746 ms to 744919.251 ms. I
> > also attache the sql about this query, and the query plan for this query.
> >
> > Thanks
> >
> >
>
Pager usage is off.
Timing is on.
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=750627231.96..750627235.38 rows=37 width=208)
Rows out: 5 rows with 1038177 ms to end, start offset by 251/251 ms.
-> Gather Motion 40:1 (slice10; segments: 40) (cost=750627231.96..750627235.38 rows=37 width=208)
Merge Key: partial_aggregation.cd_gender, partial_aggregation.cd_marital_status, partial_aggregation.cd_education_status, partial_aggregation.cd_purchase_estimate, partial_aggregation.cd_credit_rating, partial_aggregation.cd_dep_count, partial_aggregation.cd_dep_employed_count, partial_aggregation.cd_dep_college_count
Rows out: 5 rows at destination with 1038177 ms to end, start offset by 251/251 ms.
-> Limit (cost=750627231.96..750627234.64 rows=1 width=208)
Rows out: Avg 1.0 rows x 5 workers. Max/Last(seg13:dserver2/seg6:dserver1) 1/0 rows with 1038097/1038170 ms to end, start offset by 331/258 ms.
-> GroupAggregate (cost=750627231.96..750627234.64 rows=1 width=208)
Group By: customer_demographics.cd_gender, customer_demographics.cd_marital_status, customer_demographics.cd_education_status, customer_demographics.cd_purchase_estimate, customer_demographics.cd_credit_rating, customer_demographics.cd_dep_count, customer_demographics.cd_dep_employed_count, customer_demographics.cd_dep_college_count
Rows out: Avg 1.0 rows x 5 workers. Max/Last(seg13:dserver2/seg6:dserver1) 1/0 rows with 1038097/1038170 ms to end, start offset by 331/258 ms.
-> Sort (cost=750627231.96..750627232.05 rows=1 width=168)
Sort Key: customer_demographics.cd_gender, customer_demographics.cd_marital_status, customer_demographics.cd_education_status, customer_demographics.cd_purchase_estimate, customer_demographics.cd_credit_rating, customer_demographics.cd_dep_count, customer_demographics.cd_dep_employed_count, customer_demographics.cd_dep_college_count
Rows out: Avg 1.0 rows x 5 workers. Max/Last(seg13:dserver2/seg6:dserver1) 1/0 rows with 1038097/1038170 ms to end, start offset by 331/258 ms.
Executor memory: 74K bytes avg, 74K bytes max (seg0:dserver2).
Work_mem used: 74K bytes avg, 74K bytes max (seg0:dserver2). Workfile: (0 spilling, 0 reused)
-> Redistribute Motion 40:40 (slice9; segments: 40) (cost=750627227.90..750627230.99 rows=1 width=168)
Hash Key: customer_demographics.cd_gender, customer_demographics.cd_marital_status, customer_demographics.cd_education_status, customer_demographics.cd_purchase_estimate, customer_demographics.cd_credit_rating, customer_demographics.cd_dep_count, customer_demographics.cd_dep_employed_count, customer_demographics.cd_dep_college_count
Rows out: Avg 1.0 rows x 5 workers at destination. Max/Last(seg13:dserver2/seg6:dserver1) 1/0 rows with 1038094/1038169 ms to first row, 1038096/1038169 ms to end, start offset by 331/258 ms.
-> GroupAggregate (cost=750627227.90..750627230.25 rows=1 width=168)
Group By: customer_demographics.cd_gender, customer_demographics.cd_marital_status, customer_demographics.cd_education_status, customer_demographics.cd_purchase_estimate, customer_demographics.cd_credit_rating, customer_demographics.cd_dep_count, customer_demographics.cd_dep_employed_count, customer_demographics.cd_dep_college_count
Rows out: Avg 1.0 rows x 5 workers. Max/Last(seg6:dserver1/seg28:dserver4) 1/0 rows with 1038121/1038168 ms to end, start offset by 304/261 ms.
-> Sort (cost=750627227.90..750627228.00 rows=1 width=52)
Sort Key: customer_demographics.cd_gender, customer_demographics.cd_marital_status, customer_demographics.cd_education_status, customer_demographics.cd_purchase_estimate, customer_demographics.cd_credit_rating, customer_demographics.cd_dep_count, customer_demographics.cd_dep_employed_count, customer_demographics.cd_dep_college_count
Rows out: Avg 1.0 rows x 5 workers. Max/Last(seg6:dserver1/seg28:dserver4) 1/0 rows with 1038121/1038168 ms to end, start offset by 304/261 ms.
Executor memory: 74K bytes avg, 74K bytes max (seg0:dserver1).
Work_mem used: 74K bytes avg, 74K bytes max (seg0:dserver1). Workfile: (0 spilling, 0 reused)
-> Hash Join (cost=51824.81..750627226.91 rows=1 width=52)
Hash Cond: c.c_current_cdemo_sk = customer_demographics.cd_demo_sk
Rows out: Avg 1.0 rows x 5 workers. Max/Last(seg6:dserver1/seg28:dserver4) 1/0 rows with 1038118/1038167 ms to first row, 1038120/1038167 ms to end, start offset by 304/261 ms.
Executor memory: 4127K bytes avg, 4132K bytes max (seg31:dserver1).
Work_mem used: 4127K bytes avg, 4132K bytes max (seg31:dserver1). Workfile: (0 spilling, 0 reused)
(seg6) Hash chain length 5.9 avg, 15 max, using 8202 of 8219 buckets.
-> Redistribute Motion 40:40 (slice8; segments: 40) (cost=47434.01..750622835.54 rows=1 width=4)
Hash Key: c.c_current_cdemo_sk
Rows out: Avg 1.0 rows x 5 workers at destination. Max/Last(seg6:dserver1/seg28:dserver4) 1/0 rows with 1038064/1038113 ms to end, start offset by 359/311 ms.
-> Hash Join (cost=47434.01..750622834.75 rows=1 width=4)
Hash Cond: c.c_current_addr_sk = ca.ca_address_sk
Rows out: Avg 1.0 rows x 5 workers. Max/Last(seg4:dserver1/seg1:dserver1) 1/0 rows with 1038106/1038152 ms to first row, 1038107/1038152 ms to end, start offset by 315/270 ms.
Executor memory: 1K bytes avg, 1K bytes max (seg8:dserver1).
Work_mem used: 1K bytes avg, 1K bytes max (seg8:dserver1). Workfile: (0 spilling, 0 reused)
(seg8) Hash chain length 1.0 avg, 1 max, using 16 of 16417 buckets.
-> Redistribute Motion 40:40 (slice7; segments: 40) (cost=46537.06..750621923.50 rows=139 width=8)
Hash Key: c.c_current_addr_sk
Rows out: Avg 18.7 rows x 40 workers at destination. Max/Last(seg27:dserver4/seg24:dserver4) 29/14 rows with 872361/872373 ms to first row, 1038118/1038131 ms to end, start offset by 305/292 ms.
-> Hash EXISTS Join (cost=46537.06..750621813.01 rows=139 width=8)
Hash Cond: c.c_customer_sk = store_sales.ss_customer_sk
Rows out: Avg 18.7 rows x 40 workers. Max/Last(seg16:dserver3/seg29:dserver4) 33/19 rows with 1388/73467 ms to first row, 956689/1038137 ms to end, start offset by 294/286 ms.
Executor memory: 61K bytes avg, 71K bytes max (seg16:dserver3).
Work_mem used: 61K bytes avg, 71K bytes max (seg16:dserver3). Workfile: (0 spilling, 0 reused)
(seg16) Hash chain length 11.8 avg, 31 max, using 255 of 16417 buckets.
-> Parquet table Scan on customer c (cost=0.00..750575019.39 rows=1875 width=12)
Filter: ((subplan)) OR ((subplan))
Rows out: Avg 204.6 rows x 40 workers. Max/Last(seg16:dserver3/seg29:dserver4) 233/206 rows with 1286/8523 ms to first row, 956494/1037944 ms to end, start offset by 396/398 ms.
SubPlan 2
-> Hash Join (cost=1417.92..25303.88 rows=1 width=314)
Hash Cond: catalog_sales.cs_sold_date_sk = public.date_dim.d_date_sk
Rows out: Avg 146.8 rows x 40 workers. Max/Last(seg24:dserver4/seg29:dserver4) 168/142 rows with 2990/2937 ms to first row, 612184/680461 ms to end of 2434 scans, start offset by 936501/1038123 ms.
Executor memory: 19K bytes avg, 19K bytes max (seg0:dserver4).
Work_mem used: 19K bytes avg, 19K bytes max (seg0:dserver4). Workfile: (0 spilling, 0 reused)
(seg24) Hash chain length 1.0 avg, 1 max, using 121 of 4111 buckets.
-> Result (cost=23885.37..23885.61 rows=1 width=196)
Filter: $0 = catalog_sales.cs_ship_customer_sk
Rows out: Avg 32669.0 rows x 40 workers. Max/Last(seg39:dserver5/seg29:dserver4) 33781/33550 rows with 2647/2617 ms to first row, 596709/680042 ms to end of 2449 scans, start offset by 918407/1038123 ms.
-> Materialize (cost=23885.37..23885.61 rows=1 width=196)
Rows out: Avg 3398115438.8 rows x 40 workers. Max/Last(seg13:dserver2/seg29:dserver4) 3430771516/3400285162 rows with 2582/2615 ms to first row, 379423/433362 ms to end of 2454 scans, start offset by 941837/1038123 ms.
Work_mem used: 4384K bytes avg, 4384K bytes max (seg0:dserver2). Workfile: (40 spilling, 0 reused)
Work_mem wanted: 356880K bytes avg, 356896K bytes max (seg1:dserver2) to lessen workfile I/O affecting 40 workers.
-> Broadcast Motion 40:40 (slice3; segments: 40) (cost=0.00..23885.35 rows=1 width=196)
Rows out: Avg 1441548.0 rows x 40 workers at destination. Max/Last(seg0:dserver1/seg37:dserver5) 1441548/1441548 rows with 0.034/0.027 ms to first row, 1895/2026 ms to end, start offset by 1753/1754 ms.
-> Parquet table Scan on catalog_sales (cost=0.00..23885.35 rows=1 width=196)
Rows out: Avg 36038.7 rows x 40 workers. Max/Last(seg14:dserver2/seg2:dserver1) 36569/36445 rows with 43/52 ms to first row, 133/152 ms to end, start offset by 270/285 ms.
-> Hash (cost=1415.36..1415.36 rows=6 width=118)
Rows in: Avg 277755.5 rows x 40 workers. Max/Last(seg13:dserver2/seg21:dserver3) 281446/278058 rows with 0.309/0.307 ms to first row, 282/333 ms to end of 2326 scans, start offset by 941837/945426 ms.
-> Result (cost=1415.56..1417.61 rows=6 width=118)
Rows out: Avg 277755.5 rows x 40 workers. Max/Last(seg13:dserver2/seg21:dserver3) 281446/278058 rows with 0.175/0.148 ms to first row, 141/193 ms to end of 2326 scans, start offset by 941837/945426 ms.
-> Materialize (cost=1415.56..1417.61 rows=6 width=118)
Rows out: Avg 277755.5 rows x 40 workers. Max/Last(seg13:dserver2/seg20:dserver3) 281446/279026 rows with 0.172/0.151 ms to first row, 19/20 ms to end of 2326 scans, start offset by 941837/959432 ms.
-> Broadcast Motion 40:40 (slice4; segments: 40) (cost=0.00..1415.36 rows=6 width=118)
Rows out: Avg 121.0 rows x 40 workers at destination. Max/Last(seg0:dserver1/seg4:dserver1) 121/121 rows with 0.010/0.010 ms to first row, 0.124/0.144 ms to end, start offset by 1752/1838 ms.
-> Parquet table Scan on date_dim (cost=0.00..1415.36 rows=6 width=118)
Filter: d_year = 2000 AND d_moy >= 2 AND d_moy <= 5
Rows out: Avg 3.0 rows x 40 workers. Max/Last(seg15:dserver2/seg1:dserver1) 6/1 rows with 13/30 ms to first row, 14/31 ms to end, start offset by 338/290 ms.
SubPlan 1
-> Hash Join (cost=1417.92..13400.71 rows=1 width=315)
Hash Cond: web_sales.ws_sold_date_sk = public.date_dim.d_date_sk
Rows out: Avg 57.8 rows x 40 workers. Max/Last(seg16:dserver3/seg4:dserver1) 77/66 rows with 1273/1430 ms to first row, 334911/369183 ms to end of 2497 scans, start offset by 956550/1028295 ms.
Executor memory: 19K bytes avg, 19K bytes max (seg0:dserver3).
Work_mem used: 19K bytes avg, 19K bytes max (seg0:dserver3). Workfile: (0 spilling, 0 reused)
(seg16) Hash chain length 1.0 avg, 1 max, using 121 of 4111 buckets.
-> Result (cost=11982.32..11982.50 rows=1 width=197)
Filter: $0 = web_sales.ws_bill_customer_sk
Rows out: Avg 17114.4 rows x 40 workers. Max/Last(seg8:dserver2/seg4:dserver1) 18252/16501 rows with 1330/1323 ms to first row, 307714/368735 ms to end of 2500 scans, start offset by 887642/1028295 ms.
-> Materialize (cost=11982.32..11982.50 rows=1 width=197)
Rows out: Avg 1776055167.9 rows x 40 workers. Max/Last(seg35:dserver5/seg29:dserver4) 1786409923/1777435690 rows with 1187/1187 ms to first row, 195575/226737 ms to end of 2506 scans, start offset by 928289/1038012 ms.
Work_mem used: 4384K bytes avg, 4384K bytes max (seg0:dserver5). Workfile: (40 spilling, 0 reused)
Work_mem wanted: 178614K bytes avg, 178624K bytes max (seg0:dserver5) to lessen workfile I/O affecting 40 workers.
-> Broadcast Motion 40:40 (slice1; segments: 40) (cost=0.00..11982.30 rows=1 width=197)
Rows out: Avg 719384.0 rows x 40 workers at destination. Max/Last(seg0:dserver1/seg26:dserver4) 719384/719384 rows with 0.019/0.012 ms to first row, 798/940 ms to end, start offset by 408/410 ms.
-> Parquet table Scan on web_sales (cost=0.00..11982.30 rows=1 width=197)
Rows out: Avg 17984.6 rows x 40 workers. Max/Last(seg22:dserver3/seg6:dserver1) 18321/18027 rows with 24/46 ms to first row, 64/85 ms to end, start offset by 310/285 ms.
-> Hash (cost=1415.36..1415.36 rows=6 width=118)
Rows in: Avg 295506.2 rows x 40 workers. Max/Last(seg9:dserver2/seg0:dserver1) 298023/296087 rows with 0.560/0.330 ms to first row, 297/313 ms to end of 2463 scans, start offset by 919741/961573 ms.
-> Result (cost=1415.56..1417.61 rows=6 width=118)
Rows out: Avg 295506.2 rows x 40 workers. Max/Last(seg9:dserver2/seg0:dserver1) 298023/296087 rows with 0.343/0.188 ms to first row, 147/156 ms to end of 2463 scans, start offset by 919741/961574 ms.
-> Materialize (cost=1415.56..1417.61 rows=6 width=118)
Rows out: Avg 295506.2 rows x 40 workers. Max/Last(seg9:dserver2/seg6:dserver1) 298023/294272 rows with 0.338/0.245 ms to first row, 19/20 ms to end of 2463 scans, start offset by 919741/981215 ms.
-> Broadcast Motion 40:40 (slice2; segments: 40) (cost=0.00..1415.36 rows=6 width=118)
Rows out: Avg 121.0 rows x 40 workers at destination. Max/Last(seg0:dserver1/seg37:dserver5) 121/121 rows with 0.008/0.006 ms to first row, 0.100/0.176 ms to end, start offset by 408/407 ms.
-> Parquet table Scan on date_dim (cost=0.00..1415.36 rows=6 width=118)
Filter: d_year = 2000 AND d_moy >= 2 AND d_moy <= 5
Rows out: Avg 3.0 rows x 40 workers. Max/Last(seg15:dserver2/seg10:dserver2) 6/3 rows with 12/42 ms to first row, 13/43 ms to end, start offset by 319/321 ms.
-> Hash (cost=46440.57..46440.57 rows=193 width=4)
Rows in: Avg 2567.7 rows x 40 workers. Max/Last(seg16:dserver3/seg6:dserver1) 3021/2552 rows with 102/124 ms to end, start offset by 295/272 ms.
-> Redistribute Motion 40:40 (slice6; segments: 40) (cost=1602.08..46440.57 rows=193 width=4)
Hash Key: store_sales.ss_customer_sk
Rows out: Avg 2631.1 rows x 40 workers at destination. Max/Last(seg34:dserver5/seg6:dserver1) 5211/2552 rows with 54/97 ms to first row, 81/124 ms to end, start offset by 314/272 ms.
-> Hash EXISTS Join (cost=1602.08..46286.20 rows=193 width=4)
Hash Cond: store_sales.ss_sold_date_sk = public.date_dim.d_date_sk
Rows out: Avg 2631.1 rows x 40 workers. Max/Last(seg24:dserver4/seg25:dserver4) 2767/2635 rows with 76/85 ms to first row, 98/109 ms to end, start offset by 275/267 ms.
Executor memory: 3K bytes avg, 3K bytes max (seg0:dserver4).
Work_mem used: 3K bytes avg, 3K bytes max (seg0:dserver4). Workfile: (0 spilling, 0 reused)
(seg24) Hash chain length 1.0 avg, 1 max, using 121 of 16417 buckets.
-> Parquet table Scan on store_sales (cost=0.00..36634.04 rows=72011 width=8)
Rows out: Avg 72010.1 rows x 40 workers. Max/Last(seg25:dserver4/seg15:dserver2) 72634/72015 rows with 16/32 ms to first row, 33/52 ms to end, start offset by 336/334 ms.
-> Hash (cost=1499.49..1499.49 rows=206 width=4)
Rows in: Avg 121.0 rows x 40 workers. Max/Last(seg0:dserver1/seg25:dserver4) 121/121 rows with 56/68 ms to end, start offset by 278/267 ms.
-> Broadcast Motion 40:40 (slice5; segments: 40) (cost=0.00..1499.49 rows=206 width=4)
Rows out: Avg 121.0 rows x 40 workers at destination. Max/Last(seg0:dserver1/seg25:dserver4) 121/121 rows with 0.015/7.018 ms to first row, 56/68 ms to end, start offset by 278/267 ms.
-> Parquet table Scan on date_dim (cost=0.00..1415.36 rows=6 width=4)
Filter: d_year = 2000 AND d_moy >= 2 AND d_moy <= 5
Rows out: Avg 3.0 rows x 40 workers. Max/Last(seg15:dserver2/seg7:dserver1) 6/3 rows with 9.946/51 ms to first row, 10/52 ms to end, start offset by 303/264 ms.
-> Hash (cost=892.50..892.50 rows=9 width=4)
Rows in: Avg 9.6 rows x 40 workers. Max/Last(seg8:dserver2/seg1:dserver1) 16/6 rows with 13/54 ms to end, start offset by 339/271 ms.
-> Parquet table Scan on customer_address ca (cost=0.00..892.50 rows=9 width=4)
Filter: ca_county::text = ANY ('{"Yellowstone County","Montgomery County","Divide County","Cedar County","Manassas Park city"}'::text[])
Rows out: Avg 9.6 rows x 40 workers. Max/Last(seg8:dserver2/seg1:dserver1) 16/6 rows with 12/53 ms to first row, 13/54 ms to end, start offset by 339/271 ms.
-> Hash (cost=1989.80..1989.80 rows=4802 width=56)
Rows in: Avg 48020.0 rows x 40 workers. Max/Last(seg31:dserver4/seg37:dserver5) 48070/47967 rows with 60/80 ms to end, start offset by 280/277 ms.
-> Parquet table Scan on customer_demographics (cost=0.00..1989.80 rows=4802 width=56)
Rows out: Avg 48020.0 rows x 40 workers. Max/Last(seg31:dserver4/seg9:dserver2) 48070/48017 rows with 7.228/34 ms to first row, 31/49 ms to end, start offset by 280/330 ms.
Slice statistics:
(slice0) Executor memory: 2545K bytes.
(slice1) Executor memory: 11343K bytes avg x 40 workers, 11523K bytes max (seg22:dserver3).
(slice2) Executor memory: 1967K bytes avg x 40 workers, 1976K bytes max (seg9:dserver2).
(slice3) Executor memory: 15960K bytes avg x 40 workers, 15969K bytes max (seg35:dserver5).
(slice4) Executor memory: 1967K bytes avg x 40 workers, 1976K bytes max (seg9:dserver2).
(slice5) Executor memory: 1613K bytes avg x 40 workers, 1613K bytes max (seg9:dserver2).
(slice6) Executor memory: 3206K bytes avg x 40 workers, 3222K bytes max (seg8:dserver2). Work_mem: 3K bytes max.
(slice7) * Executor memory: 9648K bytes avg x 40 workers, 9722K bytes max (seg39:dserver5). Work_mem: 4384K bytes max, 356896K bytes wanted.
(slice8) Executor memory: 1878K bytes avg x 40 workers, 1878K bytes max (seg0:dserver1). Work_mem: 1K bytes max.
(slice9) Executor memory: 10339K bytes avg x 40 workers, 10342K bytes max (seg39:dserver5). Work_mem: 4132K bytes max.
(slice10) Executor memory: 1310K bytes avg x 40 workers, 1310K bytes max (seg0:dserver1). Work_mem: 74K bytes max.
Statement statistics:
Memory used: 131072K bytes
Memory wanted: 4371623K bytes
Dispatcher statistics:
executors used(total/cached/new connection): (400/0/400); dispatcher time(total/connection/dispatch data): (260.044 ms/244.493 ms/13.412 ms).
dispatch data time(max/min/avg): (5.006 ms/0.014 ms/0.140 ms); consume executor data time(max/min/avg): (0.046 ms/0.003 ms/0.012 ms); free executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
Data locality statistics:
data locality ratio: 1.000; virtual segment number: 40; different host number: 5; virtual segment number per host(avg/min/max): (8/8/8); segment size(avg/min/max): (14644087.900/14563078/14740584); segment size with penalty(avg/min/max): (0.000/0/0); continuity(avg/min/max): (1.000/1.000/1.000).
Total runtime: 1038428.664 ms
(161 rows)
Time: 1038546.753 ms