I have a cluster with TPC-DS data, and all of the data was loaded from external tables. hawq_rm_nvseg_perquery_perseg_limit was set to the default of 6 when the data was loaded, and I have 10 nodes, so bucketnum = 60 for all tables. All of the tables are also randomly distributed.
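For context, the load followed the usual external-table flow. A minimal sketch of what that looked like (the ext_tpcds schema, the gpfdist location, and the column list are placeholders, not my exact DDL):

-- settings in play (the commented values are what my cluster reports)
SHOW hawq_rm_nvseg_perquery_perseg_limit;   -- 6, the default
SHOW default_hash_table_bucket_number;      -- 60 on this 10-node cluster

-- readable external table over the flat files (placeholder schema/location)
CREATE EXTERNAL TABLE ext_tpcds.date_dim (d_date_sk int, d_date_id char(16) /* ... */)
LOCATION ('gpfdist://etl-host:8081/date_dim*.dat')
FORMAT 'TEXT' (DELIMITER '|');

-- internal, randomly distributed table loaded from it
CREATE TABLE tpcds.date_dim (LIKE ext_tpcds.date_dim) DISTRIBUTED RANDOMLY;
INSERT INTO tpcds.date_dim SELECT * FROM ext_tpcds.date_dim;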
In my efforts to optimize performance, I've tried increasing hawq_rm_nvseg_perquery_perseg_limit to utilize more resources to execute the queries. Unfortunately, this hasn't helped at all. I then noticed this in the documentation:

*"Random tables are copied from files: #vseg is a fixed value. #vseg is 6, when there are sufficient resources."*
http://hdb.docs.pivotal.io/201/hawq/query/query-performance.html#topic38

It appears that tables loaded from external tables get a fixed number of vsegs, while tables loaded from other internal tables get a dynamic number. For example:

-- table loaded from an external table
gpadmin=# explain analyze select count(*) from tpcds.date_dim;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Aggregate  (cost=0.00..431.09 rows=1 width=8)
   Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(seg-1:ip-172-21-4-229.ec2.internal/seg-1:ip-172-21-4-229.ec2.internal) 1/1 rows with 16277/16277 ms to end, start offset by 3255/3255 ms.
   ->  Gather Motion 60:1  (slice1; segments: 60)  (cost=0.00..431.09 rows=1 width=8)
         Rows out:  Avg 60.0 rows x 1 workers at destination.  Max/Last(seg-1:ip-172-21-4-229.ec2.internal/seg-1:ip-172-21-4-229.ec2.internal) 60/60 rows with 39/39 ms to first row, 16277/16277 ms to end, start offset by 3255/3255 ms.
         ->  Aggregate  (cost=0.00..431.09 rows=1 width=8)
               Rows out:  Avg 1.0 rows x 60 workers.  Max/Last(seg59:ip-172-21-4-235.ec2.internal/seg11:ip-172-21-4-226.ec2.internal) 1/1 rows with 107/16274 ms to end, start offset by 3257/3257 ms.
               ->  Table Scan on date_dim  (cost=0.00..431.09 rows=1218 width=1)
                     Rows out:  Avg 1217.5 rows x 60 workers.  Max/Last(seg51:ip-172-21-4-234.ec2.internal/seg11:ip-172-21-4-226.ec2.internal) 1242/1232 rows with 5035/16273 ms to end, start offset by 3258/3257 ms.
 Slice statistics:
   (slice0)    Executor memory: 412K bytes.
   (slice1)    Executor memory: 215K bytes avg x 60 workers, 215K bytes max (seg59:ip-172-21-4-235.ec2.internal).
 Statement statistics:
   Memory used: 262144K bytes
 Settings:  default_hash_table_bucket_number=60; optimizer=on
 Optimizer status: PQO version 1.638
 Dispatcher statistics:
   executors used(total/cached/new connection): (60/1/59); dispatcher time(total/connection/dispatch data): (3254.181 ms/1480539781639.241 ms/0.483 ms).
   dispatch data time(max/min/avg): (0.021 ms/0.005 ms/0.007 ms); consume executor data time(max/min/avg): (0.044 ms/0.003 ms/0.009 ms); free executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
 Data locality statistics:
   data locality ratio: 1.000; virtual segment number: 60; different host number: 10; virtual segment number per host(avg/min/max): (6/6/6); segment size(avg/min/max): (186145.950 B/182662 B/189757 B); segment size with penalty(avg/min/max): (0.000 B/0 B/0 B); continuity(avg/min/max): (1.000/1.000/1.000); DFS metadatacache: 0.263 ms; resource allocation: 0.521 ms; datalocality calculation: 0.114 ms.
 Total runtime: 19549.798 ms
(21 rows)

Time: 19651.159 ms

This is just a small dimension table too. I then loaded the data from that local table into a new randomly distributed table.
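The copy itself was a straightforward CTAS into a new randomly distributed table, something along these lines (tpcds_opt is just the schema I created for the rewritten copies):

CREATE TABLE tpcds_opt.date_dim AS
SELECT * FROM tpcds.date_dim
DISTRIBUTED RANDOMLY;
ANALYZE tpcds_opt.date_dim;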
-- same data, reloaded from the internal table
gpadmin=# explain analyze select count(*) from tpcds_opt.date_dim;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Aggregate  (cost=0.00..436.36 rows=1 width=8)
   Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(seg-1:ip-172-21-4-229.ec2.internal/seg-1:ip-172-21-4-229.ec2.internal) 1/1 rows with 2624/2624 ms to end, start offset by 1.081/1.081 ms.
   ->  Gather Motion 1:1  (slice1; segments: 1)  (cost=0.00..436.36 rows=1 width=8)
         Rows out:  Avg 1.0 rows x 1 workers at destination.  Max/Last(seg-1:ip-172-21-4-229.ec2.internal/seg-1:ip-172-21-4-229.ec2.internal) 1/1 rows with 2624/2624 ms to end, start offset by 1.082/1.082 ms.
         ->  Aggregate  (cost=0.00..436.36 rows=1 width=8)
               Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(seg0:ip-172-21-4-225.ec2.internal/seg0:ip-172-21-4-225.ec2.internal) 1/1 rows with 2621/2621 ms to end, start offset by 0.846/0.846 ms.
               ->  Table Scan on date_dim  (cost=0.00..436.22 rows=73049 width=1)
                     Rows out:  Avg 73049.0 rows x 1 workers.  Max/Last(seg0:ip-172-21-4-225.ec2.internal/seg0:ip-172-21-4-225.ec2.internal) 73049/73049 rows with 2.966/2.966 ms to first row, 2595/2595 ms to end, start offset by 0.847/0.847 ms.
 Slice statistics:
   (slice0)    Executor memory: 170K bytes.
   (slice1)    Executor memory: 343K bytes (seg0:ip-172-21-4-225.ec2.internal).
 Statement statistics:
   Memory used: 262144K bytes
 Settings:  default_hash_table_bucket_number=60; optimizer=on
 Optimizer status: PQO version 1.638
 Dispatcher statistics:
   executors used(total/cached/new connection): (1/1/0); dispatcher time(total/connection/dispatch data): (0.122 ms/0.000 ms/0.015 ms).
   dispatch data time(max/min/avg): (0.015 ms/0.015 ms/0.015 ms); consume executor data time(max/min/avg): (0.011 ms/0.011 ms/0.011 ms); free executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
 Data locality statistics:
   data locality ratio: 0.296; virtual segment number: 1; different host number: 1; virtual segment number per host(avg/min/max): (1/1/1); segment size(avg/min/max): (11168757.000 B/11168757 B/11168757 B); segment size with penalty(avg/min/max): (11247341.000 B/11247341 B/11247341 B); continuity(avg/min/max): (1.000/1.000/1.000); DFS metadatacache: 0.254 ms; resource allocation: 0.387 ms; datalocality calculation: 0.130 ms.
 Total runtime: 2627.711 ms
(21 rows)

Time: 2728.409 ms

I'm able to decrease the execution time of many queries by increasing hawq_rm_nvseg_perquery_perseg_limit, but only for tables loaded from other local tables, not for tables loaded from external tables. Based on the documentation, this appears to be the expected behavior.

- Are there plans to correct this?
- What happens if the cluster expands? Will these random tables need to be redistributed?
- Are there workarounds to this issue? (The rewrite-based approach I've been using is sketched below.)
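For reference, the mitigation I've been using is to repeat the CTAS rewrite above for each table and then raise the limit per session. A sketch, using store_sales as the example and 12 as an arbitrary limit value:

-- rewrite each externally loaded table into a new internal random table
CREATE TABLE tpcds_opt.store_sales AS
SELECT * FROM tpcds.store_sales
DISTRIBUTED RANDOMLY;
ANALYZE tpcds_opt.store_sales;

-- only the rewritten copies respond to a higher per-segment vseg limit
SET hawq_rm_nvseg_perquery_perseg_limit = 12;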
Jon Roberts