So, after upgrading to Guacamole 1.2.0 (both client and server), I still
have the same issue under heavy load (here "heavy" means more than 300
connected clients, but we would like to support at least 2000 concurrent
clients). After increasing the guacd log level to "debug", I see that
failed connections log this error:
*guacd[73533]: DEBUG:#011failed to create thread pipe fd 0*
but I'm unable to find the cause. Is this an operating system limit or a
file system limit?
Any idea?
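One way to narrow down the "operating system limit" question is to compare what a single guacd process is using against its own per-process limits. The sketch below is only an illustration for Linux: it reads /proc/<pid>/limits, /proc/<pid>/fd and /proc/<pid>/task. The function names parse_limits and report are made up for this example, and it assumes (the thread does not confirm this) that the failure comes from a call hitting a descriptor or thread limit.

```python
import os
import re

def parse_limits(text):
    """Parse the text of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:              # skip the header row
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits

def report(pid):
    """Print open descriptors and threads of one process next to its limits.

    Reading another user's /proc/<pid>/fd usually requires root.
    """
    with open(f"/proc/{pid}/limits") as f:
        limits = parse_limits(f.read())
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    threads = len(os.listdir(f"/proc/{pid}/task"))
    print(f"pid {pid}: {open_fds} fds open, limit {limits.get('Max open files')}")
    print(f"pid {pid}: {threads} threads, limit {limits.get('Max processes')}")
```

Calling report() with the PID of a guacd child process while connections are failing would show whether either counter is close to its soft limit. If neither is, the cause is probably elsewhere.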
Thank you in advance,
Gianluca
On 25/11/2020 10:04, Gianluca Renzi wrote:
On 24/11/2020 17:44, Nick Couchman wrote:
The fact that this happens during "Peak Hours" suggests that something
may be failing from a resource perspective. The hardware you have
seems robust enough, though you haven't said what your actual
connection load is (100 concurrent users, 1000, 10000, etc.)? Outside
of the hardware itself, have you looked at things like PAN firewall
load, Internet circuit, and network links?
Hi Nick,
thank you very much for your reply.
From a performance perspective, I started monitoring server and
network load, and neither is very high. We currently have about 250
users connected (or trying to connect), and this is the situation:
*root@guacamole2:/var/log# uptime*
09:53:51 up 168 days, 19:26, 1 user, load average: 4,91, 5,26, 5,48
*root@guacamole2:/var/log# lscpu *
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 43 bits physical, 48 bits virtual
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
NUMA node(s): 4
Vendor ID: AuthenticAMD
CPU family: 23
Model: 1
Model name: AMD EPYC 7281 16-Core Processor
Stepping: 2
CPU MHz: 2682.467
BogoMIPS: 4192.08
Virtualization: AMD-V
L1d cache: 32K
L1i cache: 64K
L2 cache: 512K
L3 cache: 4096K
NUMA node0 CPU(s): 0,4,8,12,16,20,24,28
NUMA node1 CPU(s): 1,5,9,13,17,21,25,29
NUMA node2 CPU(s): 2,6,10,14,18,22,26,30
NUMA node3 CPU(s): 3,7,11,15,19,23,27,31
*root@guacamole2:/var/log# free -h*
              total        used        free      shared  buff/cache   available
Mem:           62Gi        34Gi        14Gi       675Mi        13Gi        26Gi
Swap:          14Gi          0B        14Gi
*root@guacamole2:/var/log# systemctl status guacd*
● guacd.service - Guacamole Server
Loaded: loaded (/etc/systemd/system/guacd.service; disabled; vendor preset: enabled)
Active: active (running) since Tue 2020-06-09 16:52:21 CEST; 5 months 16 days ago
Docs: man:guacd(8)
Main PID: 26032 (guacd)
Tasks: 13053 (limit: 19660)
Memory: 29.5G
CGroup: /system.slice/guacd.service
├─ 330 /usr/local/sbin/guacd -f
├─ 451 /usr/local/sbin/guacd -f
├─ 588 /usr/local/sbin/guacd -f
.....
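The "Tasks: 13053 (limit: 19660)" field in the status output above is worth a quick calculation. The sketch below assumes that the task count grows roughly linearly with the number of connected users and takes the "about 250 users" figure from this message. Both are assumptions: a guacd instance that has been running for months may also hold tasks that do not belong to live sessions. The helper name task_headroom is made up for this example.

```python
import re

def task_headroom(status_line, users):
    """Estimate tasks per user and the user count at which the limit is hit."""
    match = re.search(r"Tasks:\s*(\d+)\s*\(limit:\s*(\d+)\)", status_line)
    if match is None:
        raise ValueError("no 'Tasks: N (limit: M)' field found")
    tasks, limit = int(match.group(1)), int(match.group(2))
    per_user = tasks / users                 # assumes linear scaling
    return per_user, int(limit / per_user)

per_user, max_users = task_headroom("Tasks: 13053 (limit: 19660)", 250)
print(f"{per_user:.1f} tasks per user, limit reached near {max_users} users")
# prints "52.2 tasks per user, limit reached near 376 users"
```

Under those assumptions the unit would hit its task limit somewhere below 400 users, so the systemd task limit for guacd.service (the TasksMax= setting) may be worth checking alongside the other resource limits.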
On the firewall side, since it is a high end appliance, it is running
at 3-4% of its capabilities.
One of the problems I can see from the server logs is a recurring
segfault of the guacd service:
Nov 25 09:49:20 guacamole2 kernel: [14584906.960505] guacd[6601]:
segfault at 20 ip 00007f46002f0eb5 sp 00007f401a58bb38 error 6 in
libwinpr2.so.2.0.0[7f46002e0000+6b000]
Nov 25 09:49:20 guacamole2 kernel: [14584906.960515] Code: 00 00
00 00 00 31 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f b7 47
08 c3 66 66 2e 0f 1f 84 00 00 00 00 00 b8 01 00 00 00 <f0> 0f c1
07 83 c0 01 c3 0f 1f 00 b8 ff ff ff ff f0 0f c1 07 83 e8
I thought it was due to an older version of libwinpr2, but upgrading
to the latest version available in the Debian (Buster) repository did
not solve the problem. Maybe I should change some kernel parameters?
I have an identical server (same hardware, same operating system)
running Guacamole 1.2 and it doesn't show similar problems. I'll try
to upgrade to Guacamole 1.2 to see if it solves the problem.
Kind regards,
Gianluca
User "xxx" connected to connection "xxxDesktop"
User "xxx" disconnected from connection "xxxDesktop".
Duration: 73 milliseconds
I'm trying to track the connection flow, but I'm unable to correlate
the catalina.out logs with the tomcat access logs and the guacd logs.
If I search for "xxxDesktop" in the tomcat access logs, I can see
entries like this:
GET
/guacamole-1.1.0/api/session/data/ldap/connections/xxxDesktop?token=<LONG_HEX_STRING>
and if I then search for that long token string I can find some 404s:
GET
/guacamole-1.1.0/api/session/tunnels/8ac12e84-b219-45fb-8912-e2fdc702c870/activeConnection/connection/sharingProfiles?token=
<LONG_HEX_STRING> HTTP/1.1" 404 210
This isn't necessarily an error - it means that Guacamole Client is
looking for Sharing Profiles on the active connection related to that
specific tunnel, and that none exist. In this case, the 404 is just
how the API tells the web application that there are no Sharing
Profiles for that particular tunnel.
But I'm not able to correlate this kind of log entry with any entry
in the guacd log file, so I can't trace the disconnection back to its
root cause. Here I'm assuming that there is some mapping between the
tunnel ID (the string in the URL after "tunnels") and the guacd
logs, but maybe I'm wrong.
In the guacd log I see entries like this:
nov 23 11:37:35 guacamole2 guacd[33384]: Last user of
connection "$10c70db2-8d84-4b1c-aca4-28ed4f9e3a98" disconnected
nov 23 11:37:35 guacamole2 guacd[26032]: guacd[33384]: INFO:
User "@ee949374-6f5a-4f89-bd17-4b8931d3fdb5"
disconnected (0 users remain)
nov 23 11:37:35 guacamole2 guacd[26032]: guacd[33384]: INFO:
Last user of connection
"$10c70db2-8d84-4b1c-aca4-28ed4f9e3a98" disconnected
nov 23 11:37:35 guacamole2 guacd[26032]: Connection
"$10c70db2-8d84-4b1c-aca4-28ed4f9e3a98" removed.
nov 23 11:37:35 guacamole2 guacd[26032]: guacd[26032]: INFO:
Connection "$10c70db2-8d84-4b1c-aca4-28ed4f9e3a98"
removed.
but I see no way to map those user and connection IDs to the actual
connection in the Tomcat logs.
Any suggestion?
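As a partial workaround, the guacd entries can at least be grouped by the IDs they contain, so that their timestamps can be compared by hand with the Tomcat access log. The sketch below assumes each guacd entry sits on a single line in the real log file (the excerpts above are wrapped by the mail client) and that IDs look like the ones quoted here: a "$" or "@" prefix followed by a UUID. The name index_guacd_ids is made up for this example. It does not link a guacd ID to a tunnel ID. It only collects the lines.

```python
import re
from collections import defaultdict

# Connection IDs start with "$" and user IDs with "@" in the excerpts above.
ID_PATTERN = re.compile(r'(?:[Cc]onnection|User)\s+"([$@][0-9a-f-]{36})"')

def index_guacd_ids(lines):
    """Map each connection or user ID to the guacd log lines mentioning it."""
    index = defaultdict(list)
    for line in lines:
        for ident in ID_PATTERN.findall(line):
            index[ident].append(line.rstrip("\n"))
    return dict(index)
```

For example, index_guacd_ids(open("/var/log/syslog")) would return a dictionary keyed by ID. The log path is an assumption and depends on where guacd output ends up on the system.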
My first suggestion would be to bump up the verbosity on both guacd
and Guacamole Client such that you're getting more detailed error
logs. Instructions for doing this can be found at the following
locations:
http://guacamole.apache.org/doc/gug/configuring-guacamole.html#webapp-logging
http://guacamole.apache.org/doc/gug/configuring-guacamole.html#guacd.conf
That should give you some more information as to why connections are
closing down, and might even provide the information you need to link
the connection information in guacd to that in Tomcat. There is a
JIRA issue open for improvements that would make it easier to correlate
this information, but it hasn't gotten any attention yet:
https://issues.apache.org/jira/browse/GUACAMOLE-752
-Nick
Thank you in advance,
Gianluca
-
Gianluca Renzi / Network engineer
+39.334.2312307 / [email protected]
N3tCom Sas
Office: +39.0775.1855155 / Fax: +39.0775.1850188
via F. Brighindi, 26
03100 Frosinone
http://www.n3tcom.com
This e-mail message may contain confidential or legally privileged
information and is intended only for the use of the intended
recipient(s). Any unauthorized disclosure, dissemination,
distribution, copying or the taking of any action in reliance on the
information herein is prohibited.