hawq_rm_stmt_vseg_memory and hawq_rm_nvseg_perquery_perseg_limit
Jon Roberts via hawq.incubator.apache.org Jan 21 (10 days ago)
to dev Why is there a limit of 16GB for hawq_rm_stmt_vseg_memory? A cluster with 256GB per node that is dedicated to HAWQ may certainly want to utilize more memory per segment. Is there something I'm missing regarding statement memory?
Secondly, does the number of vsegs for a query get influenced by the statement memory or does it just look at the plan and hawq_rm_nvseg_perquery_perseg_limit?
Lei Chang <[email protected]> Jan 21 (10 days ago)
to dev hawq_rm_stmt_vseg_memory and hawq_rm_stmt_nvseg need to be used together to set the specific number of segments and the vseg memory. And hawq_rm_stmt_nvseg should be less than hawq_rm_nvseg_perquery_perseg_limit.
set hawq_rm_stmt_vseg_memory = '2GB';
set hawq_rm_stmt_nvseg = 6;
It looks like 16GB is somewhat small for big dedicated machines: if 16GB is the per-virtual-segment memory and 8 segments are used, only 128GB is used.
Cheers
Yi Jin <[email protected]> Jan 23 (8 days ago)
to dev Hi Jon,
That GUC limit means that for one virtual segment the maximum consumable memory is 16GB. For one segment/node there may be multiple vsegs allocated to run queries, so if a node has 256GB expected to be consumed by HAWQ, it will have at most 16 vsegs running concurrently.
hawq_rm_stmt_vseg_memory sets the statement-level vseg memory consumption. It is required to specify hawq_rm_stmt_nvseg as well: only when hawq_rm_stmt_nvseg is greater than 0 is hawq_rm_stmt_vseg_memory activated, regardless of the target resource queue's original vseg resource quota definition. For example, you can set hawq_rm_stmt_vseg_memory to 16GB and hawq_rm_stmt_nvseg to 256; if you have a cluster of 16 nodes with 256GB each and your target resource queue can use 100% of the cluster resource, you will have 16 vsegs running per node, consuming all memory resource for this query.
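For reference, that example as session settings (a sketch only; the query itself would follow in the same session):
set hawq_rm_stmt_vseg_memory = '16GB';
set hawq_rm_stmt_nvseg = 256;   -- 256 vsegs * 16GB = 16 nodes * 256GB, i.e. 16 vsegs per node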
Best, Yi
Jon Roberts via hawq.incubator.apache.org Jan 23 (8 days ago)
to dev I've been thinking about these scenarios:
1. Hash-distributed tables with a fixed number of buckets. If the tables were built using the defaults, buckets = 6 * number of nodes, so you basically have 6 vsegs per host. Multiply that by 16GB and you can only use 96GB of the 256GB of RAM per node.
2. A user has randomly distributed tables but doesn't understand that they can increase the number of vsegs. This will be common for users who come from Greenplum. Again, they can only set statement memory to 16GB, so they are stuck with a max of 96GB of RAM usage.
3. A user increases vsegs and statement memory, and possibly runs out of memory if the settings are too aggressive.
-
I think we should be able to specify statement memory higher than 16GB.
Maybe the limit should be something much higher such as 1TB.
-
The optimizer should limit the number of vsegs based on the statement memory setting to prevent OOM. You could do the opposite too (limit memory and use the vseg setting provided). Greenplum can limit the amount of memory, but we have two dials to adjust: vsegs and memory.
Yi Jin <[email protected]> Jan 23 (8 days ago)
to dev Hi Jon,
For scenario 1, that is true. However, do you think it is necessary to give one vseg such a high memory quota? Of course, I think that if we force hawq_rm_stmt_nvseg to a value different from the bucket number, the optimizer will give a different plan. This is a way to force high concurrency and high resource occupation.
For scenario 2, I think that if we want to release more concurrency in executing one query, we should advise customers to increase hawq_rm_nvseg_perquery_limit (default 512) and hawq_rm_nvseg_perquery_perseg_limit (default 6), and then choose an appropriate vseg memory quota (a sketch of these settings follows after the scenario notes). As a result, HAWQ will decide to leverage more concurrency and more resource for big queries automatically. I mean HAWQ will choose the number of virtual segments automatically, not always a high number of virtual segments.
For scenario 3, it is recommended not to assign all system memory to HAWQ on the segment side. It is a good idea to leave some for the system and for other applications.
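A minimal sketch of the scenario-2 tuning, assuming these two GUCs accept a session-level SET (otherwise the same values would go into hawq-site.xml); the raised values are illustrative:
set hawq_rm_nvseg_perquery_limit = 1024;       -- default 512, cluster-wide cap per query
set hawq_rm_nvseg_perquery_perseg_limit = 12;  -- default 6, cap per segment/node
-- then pick an appropriate per-vseg memory quota and let HAWQ decide how many vsegs to use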
I think it is fine to support a higher memory limit for one virtual segment, but I still do not see why it is necessary.
As HAWQ uses a different memory allocation strategy compared with Greenplum, it is true that the external behavior of HAWQ is different. The virtual segment (vseg) is a critical concept in HAWQ for managing resource consumption and query QE concurrency. In Greenplum the QE concurrency is fixed, i.e. the number of QEs for one query execution is fixed; this is why a Greenplum user just needs to think about memory consumption in one segment, as all segments consume the same amount of memory for the same query. In HAWQ this number is dynamic: you can easily change the resource queue definition or the statement resource definition to change it, and you don't have to install and configure more segments on those nodes to force the QE concurrency.
Consider a scenario where a lot of small queries arrive concurrently: HAWQ just allocates a small number of vsegs for each query without wasting resources, and this makes HAWQ able to support high concurrency much better than a traditional MPP-architecture database and achieve high throughput. Conversely, for large queries, HAWQ can automatically increase the number of vsegs to a very high number if you set those limit GUCs high as well.
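To make the resource-queue route above concrete, a sketch assuming the VSEG_RESOURCE_QUOTA attribute of the HAWQ resource queue DDL (the queue name and quota are illustrative):
ALTER RESOURCE QUEUE pg_default WITH (VSEG_RESOURCE_QUOTA='mem:2gb');  -- per-vseg memory for queries in this queue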
Best, Yi
Jon Roberts via hawq.incubator.apache.org Jan 24 (7 days ago)
to dev It is common in Greenplum to use "explain analyze" to look at the query plan; if the query spills to disk, then increasing the statement memory will typically increase the speed of the query. With hash-distributed tables in HAWQ this will be true too, and it is the only way, short of rewriting the query, to make it perform better.
For the really large Greenplum clusters out there, we don't see benefits from running 8 segments per host. Instead, we dial this back to 4 segments because the network traffic on the interconnect becomes the bottleneck. For HAWQ, where the clusters will likely be much bigger, I think we will have the same network limitations. So with random distribution, we will probably reduce the number of vsegs to get better performance on the network side and then use more memory per vseg to prevent spilling to disk and make use of more of the available resources. That is why we need the ability to allocate more than 16GB of RAM per vseg.
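A sketch of that workflow using the statement-level GUCs discussed above; the table, query, and cluster size are hypothetical, and 16GB is the current per-vseg ceiling:
explain analyze select l_orderkey, sum(l_quantity) from lineitem group by l_orderkey;  -- check whether operators spill to disk
set hawq_rm_stmt_nvseg = 64;            -- e.g. 4 vsegs per node on a hypothetical 16-node cluster
set hawq_rm_stmt_vseg_memory = '16GB';  -- more memory per vseg to avoid spilling
explain analyze select l_orderkey, sum(l_quantity) from lineitem group by l_orderkey;  -- re-check spill and runtime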
On a different but related note, I would like to see hawq_rm_nvseg_perquery_limit changed so that it limits the number of vsegs per query at the datanode level rather than across the entire cluster. That would make it much easier to expand and shrink a cluster without having to adjust GUCs each time.
Yi Jin <[email protected]> 3:56 PM (18 hours ago)
to dev Jon, sorry for the late response; I find I missed this one.
With regard to the memory quota of one vseg, I think I understand your consideration now. It makes sense. What is your expectation for the memory limit: 32GB or 64GB? I think I can open a JIRA to track this and decide which limit we should extend to.
And for hawq_rm_nvseg_perquery_limit, I think that if you don't want it to limit your query resource in a large cluster, just set it to a large value, 65535 for example. Then I think you can achieve your goal, i.e. using hawq_rm_nvseg_perquery_perseg_limit to dynamically limit the number of vsegs for one query.
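A sketch of that setting, assuming the GUC can be set at the session level (otherwise the same value would go into hawq-site.xml):
set hawq_rm_nvseg_perquery_limit = 65535;  -- effectively removes the cluster-wide cap, so hawq_rm_nvseg_perquery_perseg_limit governs vsegs per datanode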
Yi Jin <[email protected]> 10:23 AM (2 minutes ago)
to dev I created https://issues.apache.org/jira/browse/HAWQ-1299 to track this. When we have an agreement about the limit, I will fix it at once.
Please go there and comment on this improvement. Thank you.