Hi All,

I've spent a lot of time trying to figure out how to control the number of segment instances/threads per node in HAWQ and couldn't find any information on that. In fact, I'm a bit confused:

1) In this blog entry: http://0x0fff.com/spark-dataframes-are-faster-arent-they/#more-268 I found that Alexey had a cluster of 4 nodes with 10 threads running on each node. If I query the same table (gp_segment_configuration) in my installation, I get only 5 rows, one per node, which would indicate that I have only one segment instance running per node.

2) On the other hand, when I monitor CPU utilization while running some test queries, I can observe that 8 threads per node are actually active. I can also see that all my tables have 40 segments at most.

I found some pieces of information on two parameters:

NSegs - the number of segment instances to run per segment host.
gp_vmem_protect_limit - the amount of memory allowed to a single segment instance on a host.

But when I tried to put them in postgresql.conf on my nodes, I found in the log that both are unrecognized:

2016-02-21 20:16:36.720257 GMT,,,p20790,th-670398080,,,,0,,,seg-10000,,,,,"LOG","42704","unrecognized configuration parameter ""gp_vmem_protect_limit""",,,,,,,,"set_config_option","guc.c",9933,
2016-02-21 20:16:36.720913 GMT,,,p20790,th-670398080,,,,0,,,seg-10000,,,,,"LOG","42704","unrecognized configuration parameter ""NSegs""",,,,,,,,"set_config_option","guc.c",9933,

So my questions are:

1) How do I control the number of threads/segment instances per node?
2) How do I control the number of segments per table?

I suspect that these two things might be somehow interconnected. Many thanks for any hints on that.

Marek
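For reference, a sketch of the kind of catalog query I mean, counting segment instances per host (column names here are assumed from the Greenplum-derived catalog and may differ between HAWQ versions):

```sql
-- Count registered segment instances per host.
-- Assumes the hostname and content columns exist as in the
-- Greenplum-style gp_segment_configuration catalog; verify
-- against your HAWQ version.
SELECT hostname, count(*) AS segment_instances
FROM gp_segment_configuration
WHERE content >= 0          -- skip the master entry (content = -1), if present
GROUP BY hostname
ORDER BY hostname;
```

On my cluster this style of query returns one row per node with a count of 1, which is what makes me think there is only a single segment instance per host.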
