I'm happy to report that I made an error! I thought I had set my scripts to use random distribution for the tpcds schema, but I used hash instead. That is why it wasn't behaving as I had expected.
Sorry for the noise.

Jon Roberts
Principal Engineer | [email protected] | 615-426-8661

On Thu, Dec 1, 2016 at 8:53 PM, Hubert Zhang <[email protected]> wrote:

> Hi Jon,
>
> 1. Your problem: the vseg number for your two queries ("explain analyze
> select count(*) from tpcds.date_dim;" and "explain analyze select
> count(*) from tpcds_opt.date_dim;") depends on the distribution type and
> the size of your table. Could you give us more details about your tables
> by running "select * from gp_distribution_policy where
> localoid=idofYourTable;"?
>
> 2. Questions about cluster expansion.
> For newly created hash tables, the bucket number depends on the current
> value of default_hash_table_bucket_number (GUC), which is recommended to
> be changed after cluster expansion, so there is no side effect on newly
> created hash tables. For old hash tables, the bucket number is the old
> value of default_hash_table_bucket_number. After cluster expansion, for
> example from 16 nodes to 64 nodes, the old value of
> default_hash_table_bucket_number is 16*6=96 and the new value is 64*6=384.
> A query such as "select count(*) from old_hash_table" will only use 96
> vsegs instead of 384. One way to fix this is to reload the old hash
> table; the other is to use a random table from the beginning. (This is
> why random is the default distribution in HAWQ 2.x, compared with hash as
> the default in HAWQ 1.x.)
>
> For a random table, the vseg number is determined by the table size. For
> small tables (data size less than one HDFS block), only one vseg will be
> used; for big tables, the upper bound on the number of vsegs is
> hawq_rm_nvseg_perquery_perseg_limit * #segments. Cluster expansion
> increases the number of segments (for example from 16 nodes to 64 nodes,
> so queries on a random table will then use 4x the number of vsegs).
>
> Thanks
> Hubert
>
> On Thu, Dec 1, 2016 at 5:16 AM, Jon Roberts <[email protected]> wrote:
>
> > I have a cluster with TPC-DS data and all data is loaded from external
> > tables. hawq_rm_nvseg_perquery_perseg_limit was set to the default of 6
> > when the data was loaded, and I have 10 nodes, so bucketnum = 60 for
> > all tables. All tables are also randomly distributed.
> >
> > In my efforts to optimize performance, I've tried increasing
> > hawq_rm_nvseg_perquery_perseg_limit to utilize more resources to
> > execute the queries. Unfortunately, this hasn't helped.
> >
> > I then noticed this:
> >
> > *"Random tables are copied from files: #vseg is a fixed value. #vseg is
> > 6, when there are sufficient resources."*
> > http://hdb.docs.pivotal.io/201/hawq/query/query-performance.html#topic38
> >
> > It appears that tables loaded from external tables have a fixed number
> > of vsegs, but tables loaded from other internal tables have a dynamic
> > number. For example:
> >
> > --table loaded from an external table
> > gpadmin=# explain analyze select count(*) from tpcds.date_dim;
> >
> >                             QUERY PLAN
> > ---------------------------------------------------------------------
> >  Aggregate  (cost=0.00..431.09 rows=1 width=8)
> >    Rows out:  Avg 1.0 rows x 1 workers.
> >      Max/Last(seg-1:ip-172-21-4-229.ec2.internal/seg-1:ip-172-21-4-229.ec2.internal)
> >      1/1 rows with 16277/16277 ms to end, start offset by 3255/3255 ms.
> >    ->  Gather Motion 60:1  (slice1; segments: 60)  (cost=0.00..431.09 rows=1 width=8)
> >          Rows out:  Avg 60.0 rows x 1 workers at destination.
> >            Max/Last(seg-1:ip-172-21-4-229.ec2.internal/seg-1:ip-172-21-4-229.ec2.internal)
> >            60/60 rows with 39/39 ms to first row, 16277/16277 ms to end,
> >            start offset by 3255/3255 ms.
> >        ->  Aggregate  (cost=0.00..431.09 rows=1 width=8)
> >              Rows out:  Avg 1.0 rows x 60 workers.
> >                Max/Last(seg59:ip-172-21-4-235.ec2.internal/seg11:ip-172-21-4-226.ec2.internal)
> >                1/1 rows with 107/16274 ms to end, start offset by 3257/3257 ms.
> >            ->  Table Scan on date_dim  (cost=0.00..431.09 rows=1218 width=1)
> >                  Rows out:  Avg 1217.5 rows x 60 workers.
> >                    Max/Last(seg51:ip-172-21-4-234.ec2.internal/seg11:ip-172-21-4-226.ec2.internal)
> >                    1242/1232 rows with 5035/16273 ms to end, start offset by 3258/3257 ms.
> >  Slice statistics:
> >    (slice0)    Executor memory: 412K bytes.
> >    (slice1)    Executor memory: 215K bytes avg x 60 workers, 215K bytes max (seg59:ip-172-21-4-235.ec2.internal).
> >  Statement statistics:
> >    Memory used: 262144K bytes
> >  Settings:  default_hash_table_bucket_number=60; optimizer=on
> >  Optimizer status: PQO version 1.638
> >  Dispatcher statistics:
> >    executors used(total/cached/new connection): (60/1/59); dispatcher
> >    time(total/connection/dispatch data): (3254.181 ms/1480539781639.241 ms/0.483 ms).
> >    dispatch data time(max/min/avg): (0.021 ms/0.005 ms/0.007 ms); consume
> >    executor data time(max/min/avg): (0.044 ms/0.003 ms/0.009 ms); free
> >    executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
> >  Data locality statistics:
> >    data locality ratio: 1.000; virtual segment number: 60; different host
> >    number: 10; virtual segment number per host(avg/min/max): (6/6/6);
> >    segment size(avg/min/max): (186145.950 B/182662 B/189757 B); segment
> >    size with penalty(avg/min/max): (0.000 B/0 B/0 B);
> >    continuity(avg/min/max): (1.000/1.000/1.000); DFS metadatacache:
> >    0.263 ms; resource allocation: 0.521 ms; datalocality calculation: 0.114 ms.
> >  Total runtime: 19549.798 ms
> > (21 rows)
> >
> > Time: 19651.159 ms
> >
> > This is just a small dimension table too.
> >
> > I then loaded that data from the local table to a new random table.
> >
> > gpadmin=# explain analyze select count(*) from tpcds_opt.date_dim;
> >
> >                             QUERY PLAN
> > ---------------------------------------------------------------------
> >  Aggregate  (cost=0.00..436.36 rows=1 width=8)
> >    Rows out:  Avg 1.0 rows x 1 workers.
> >      Max/Last(seg-1:ip-172-21-4-229.ec2.internal/seg-1:ip-172-21-4-229.ec2.internal)
> >      1/1 rows with 2624/2624 ms to end, start offset by 1.081/1.081 ms.
> >    ->  Gather Motion 1:1  (slice1; segments: 1)  (cost=0.00..436.36 rows=1 width=8)
> >          Rows out:  Avg 1.0 rows x 1 workers at destination.
> >            Max/Last(seg-1:ip-172-21-4-229.ec2.internal/seg-1:ip-172-21-4-229.ec2.internal)
> >            1/1 rows with 2624/2624 ms to end, start offset by 1.082/1.082 ms.
> >        ->  Aggregate  (cost=0.00..436.36 rows=1 width=8)
> >              Rows out:  Avg 1.0 rows x 1 workers.
> >                Max/Last(seg0:ip-172-21-4-225.ec2.internal/seg0:ip-172-21-4-225.ec2.internal)
> >                1/1 rows with 2621/2621 ms to end, start offset by 0.846/0.846 ms.
> >            ->  Table Scan on date_dim  (cost=0.00..436.22 rows=73049 width=1)
> >                  Rows out:  Avg 73049.0 rows x 1 workers.
> >                    Max/Last(seg0:ip-172-21-4-225.ec2.internal/seg0:ip-172-21-4-225.ec2.internal)
> >                    73049/73049 rows with 2.966/2.966 ms to first row,
> >                    2595/2595 ms to end, start offset by 0.847/0.847 ms.
> >  Slice statistics:
> >    (slice0)    Executor memory: 170K bytes.
> >    (slice1)    Executor memory: 343K bytes (seg0:ip-172-21-4-225.ec2.internal).
> >  Statement statistics:
> >    Memory used: 262144K bytes
> >  Settings:  default_hash_table_bucket_number=60; optimizer=on
> >  Optimizer status: PQO version 1.638
> >  Dispatcher statistics:
> >    executors used(total/cached/new connection): (1/1/0); dispatcher
> >    time(total/connection/dispatch data): (0.122 ms/0.000 ms/0.015 ms).
> >    dispatch data time(max/min/avg): (0.015 ms/0.015 ms/0.015 ms); consume
> >    executor data time(max/min/avg): (0.011 ms/0.011 ms/0.011 ms); free
> >    executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
> >  Data locality statistics:
> >    data locality ratio: 0.296; virtual segment number: 1; different host
> >    number: 1; virtual segment number per host(avg/min/max): (1/1/1);
> >    segment size(avg/min/max): (11168757.000 B/11168757 B/11168757 B);
> >    segment size with penalty(avg/min/max): (11247341.000 B/11247341 B/11247341 B);
> >    continuity(avg/min/max): (1.000/1.000/1.000); DFS metadatacache:
> >    0.254 ms; resource allocation: 0.387 ms; datalocality calculation: 0.130 ms.
> >  Total runtime: 2627.711 ms
> > (21 rows)
> >
> > Time: 2728.409 ms
> >
> > I'm able to decrease the query execution time of many queries by
> > increasing hawq_rm_nvseg_perquery_perseg_limit, but only for tables
> > loaded from other local tables, not for tables loaded from external
> > tables. Based on the documentation, this appears to be the expected
> > behavior.
> >
> > - Are there plans to correct this?
> > - What happens if the cluster expands? Will these random tables need to
> >   be redistributed?
> > - Are there workarounds to this issue?
> >
> > Jon Roberts
>
>
> --
> Thanks
>
> Hubert Zhang
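
[Editor's note: a sketch of the two checks discussed in the thread, to be run against a HAWQ cluster. The table names are illustrative; the gp_distribution_policy columns shown are the HAWQ 2.x catalog layout, and the reload uses standard CREATE TABLE AS ... DISTRIBUTED RANDOMLY syntax.]

```sql
-- Inspect a table's distribution policy, as Hubert suggests.
-- attrnums IS NULL means randomly distributed; a non-null array means
-- hash-distributed on those columns, with a fixed bucketnum.
SELECT localoid::regclass, attrnums, bucketnum
FROM   gp_distribution_policy
WHERE  localoid = 'tpcds.date_dim'::regclass;

-- Reload an old hash table as a random table so queries can use the
-- dynamic vseg allocation after cluster expansion (illustrative names):
CREATE TABLE tpcds.date_dim_random AS
SELECT * FROM tpcds.date_dim
DISTRIBUTED RANDOMLY;
```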
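
[Editor's note: the bucket/vseg arithmetic Hubert describes can be sketched as below. This is an illustrative sketch only, not HAWQ code; the function names are invented, and the constant 6 is the default hawq_rm_nvseg_perquery_perseg_limit assumed throughout the thread.]

```python
# Illustrative sketch of the vseg arithmetic in Hubert's reply.
# Assumption: 6 vsegs per segment host, the default value of
# hawq_rm_nvseg_perquery_perseg_limit mentioned in the thread.

VSEGS_PER_NODE = 6


def default_hash_bucket_number(nodes, vsegs_per_node=VSEGS_PER_NODE):
    """Recommended default_hash_table_bucket_number for a cluster size.

    Hash tables created now are pinned to this bucket number forever,
    even if the cluster later expands."""
    return nodes * vsegs_per_node


def random_table_vseg_limit(nodes, limit=VSEGS_PER_NODE):
    """Upper bound on vsegs for a query on a (large) random table.

    Unlike hash tables, this scales automatically with the cluster."""
    return limit * nodes


# Hubert's example: expanding from 16 nodes to 64 nodes.
old_buckets = default_hash_bucket_number(16)   # old hash tables keep 96
new_buckets = default_hash_bucket_number(64)   # new hash tables get 384
# A random table on the expanded cluster can use up to 4x the vsegs:
old_limit = random_table_vseg_limit(16)
new_limit = random_table_vseg_limit(64)
```

This reproduces the numbers in the thread: Jon's 10-node cluster with the default limit of 6 gives bucketnum = 60, matching his plans' "segments: 60".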
