Hello Community,
We are currently load-testing a high-concurrency environment using
Guacamole and are seeking some advice on performance tuning. Our goal is to
support around 300 simultaneous RDP sessions on a single server.
*Environment:*
- *Server:* 16-core CPU, 128 GB RAM
- *Protocol:* RDP
- *Target Load:* 300 concurrent sessions
*Problem Statement:*
During our load tests, we've observed that as the number of RDP sessions
increases, the CPU load on the server becomes a significant bottleneck.
- At *300 concurrent sessions*, the total CPU utilization reaches *85%*.
- At the same time, RAM utilization is only at *20%*.
This strongly suggests that we are compute-bound, not memory-bound. The
primary consumers of CPU appear to be the individual guacd processes.
*Our Investigation & Analysis:*
Our investigation led us to the threading model within guacd, specifically
in guacamole-server/src/libguac/display.c. It appears that for each
connection, guacd spawns a pool of worker threads for encoding graphical
updates, with the number of threads being equal to the number of CPU cores
on the host (guac_display_nproc()).
On our 16-core server, this leads to an explosion of threads: 300
connections * 16 threads/connection = up to 4800 worker threads.
We believe this is causing severe thread contention and context-switching
overhead, leading to the high CPU usage we're observing.
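For reference, the worst-case arithmetic behind that estimate can be written out as a small C sketch. The helper names below are purely illustrative (they are not guacd functions), and the figures assume every connection really does create one worker per core:

```c
#include <assert.h>

/* Illustrative helper (not part of guacamole-server): upper bound on
 * worker threads across all connections, assuming each connection
 * creates threads_per_connection workers. */
static long total_worker_threads(long connections, long threads_per_connection) {
    assert(connections >= 0 && threads_per_connection >= 0);
    return connections * threads_per_connection;
}

/* Illustrative helper: worker threads per CPU core, i.e. how heavily
 * the host would be oversubscribed if every worker were runnable. */
static double threads_per_core(long total_threads, long cores) {
    assert(cores > 0);
    return (double) total_threads / (double) cores;
}
```

With our numbers, total_worker_threads(300, 16) gives 4800 and threads_per_core(4800, 16) gives 300, whereas a cap of 2 workers per connection would give 600 threads in total.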
*Optimization planning:*
To address this, we are planning to modify *guacamole-server/src/libguac/display.c* to
limit the number of worker threads per connection to a small, fixed number,
like so:
/*
* For high-density servers, creating cpu_count threads per connection
* process can lead to excessive context switching. We'll limit the
* number of worker threads to a more conservative number. A value of
* 1 or 2 is generally sufficient.
*/
display->worker_thread_count = 2;
This change seems to us to be the most logical step to reduce the thread
thrashing. Do you agree?
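A variation we are also considering is to clamp the value rather than hard-code it, so that small hosts are unaffected. The following is only a sketch under assumptions: both helpers are hypothetical, nothing like them exists in guacamole-server as far as we know, and the idea of reading the cap from a setting (for example an environment variable) is our own invention:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a user-supplied cap (e.g. the text of an
 * environment variable). Returns fallback if the value is missing,
 * not a whole number, or less than 1. */
static int parse_worker_cap(const char* value, int fallback) {
    assert(fallback >= 1); /* the fallback must itself be a valid count */
    if (value == NULL || *value == '\0')
        return fallback;
    char* end = NULL;
    errno = 0;
    long parsed = strtol(value, &end, 10);
    if (errno != 0 || *end != '\0' || parsed < 1 || parsed > INT_MAX)
        return fallback;
    return (int) parsed;
}

/* Hypothetical helper: never use more workers than cores, and never
 * more than the configured cap. Always returns at least 1. */
static int capped_worker_count(int nproc, int cap) {
    if (nproc < 1) nproc = 1;
    if (cap < 1) cap = 1;
    return nproc < cap ? nproc : cap;
}
```

The assignment in display.c would then become something like display->worker_thread_count = capped_worker_count(guac_display_nproc(), cap), where cap comes from parse_worker_cap() with a fallback of 2. On our 16-core host this yields 2, while a single-core host would still get 1.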
*Our Questions for the Community:*
1. Is our analysis of the CPU bottleneck due to the default threading
model correct for a high-concurrency environment?
2. Is the code modification shown above the recommended approach for
scaling guacd to hundreds of sessions?
3. Are there other known best practices, configuration changes (either
in guacd or on the RDP server side, like color depth), or architectural
optimizations we should consider to achieve our target of 300+ sessions?
We appreciate any insights or guidance you can provide. Thank you for your
time and for developing such a great tool.
Best regards,
-Dilip