So, after upgrading to Guacamole 1.2.0 (both client and server), I still have the same issue under heavy load (here "heavy" means more than 300 connected clients, but we would like to support at least 2000 concurrent clients).

After increasing the guacd log level to "debug", I see that failed connections log this error:

   *guacd[73533]: DEBUG:#011failed to create thread pipe fd 0*

but I'm unable to find the cause. Is this an operating system limit or a file system limit?

Any idea?
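
One way to narrow this down, as a rough sketch only: if the message reflects a failing pipe creation, note that pipe(2) can fail with EMFILE (per-process descriptor limit reached) or ENFILE (system-wide limit reached). Assuming a Linux host with /proc mounted, the snippet below compares the number of open descriptors of one guacd process with its "Max open files" limit. The function names are hypothetical, and it must run as root or as the user that owns the process.

```python
import os


def parse_max_open_files(limits_text):
    """Return the (soft, hard) 'Max open files' values from the text of
    /proc/<pid>/limits, or None if the row is missing."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: 'Max' 'open' 'files' <soft> <hard> 'files'
            fields = line.split()
            return fields[3], fields[4]
    return None


def count_open_fds(pid):
    """Count the file descriptors currently open in the given process."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def report(pid):
    """Print open descriptor count next to the limits for one process."""
    with open(f"/proc/{pid}/limits") as handle:
        limits = parse_max_open_files(handle.read())
    print(f"pid {pid}: {count_open_fds(pid)} open fds, (soft, hard) limits = {limits}")
```

Calling report() with the PID of a guacd child shortly before connections start failing would show whether the descriptor count is close to the soft limit. The system-wide ceiling can be read from /proc/sys/fs/file-max.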

Thank you in advance,

Gianluca

On 25/11/2020 10:04, Gianluca Renzi wrote:
On 24/11/2020 17:44, Nick Couchman wrote:

The fact that this happens during "Peak Hours" suggests that something may be failing from a resource perspective. The hardware you have seems robust enough, though you haven't said what your actual connection load is (100 concurrent users, 1000, 10000, etc.). Outside of the hardware itself, have you looked at things like PAN firewall load, Internet circuit, and network links?

Hi Nick,

  thank you very much for your reply.

From a performance perspective, I started monitoring server and network load, and neither is very high. We currently have about 250 users connected (or trying to connect), and this is the situation:

    *root@guacamole2:/var/log# uptime*
     09:53:51 up 168 days, 19:26,  1 user,  load average: 4,91, 5,26, 5,48

    *root@guacamole2:/var/log# lscpu *
    Architecture:        x86_64
    CPU op-mode(s):      32-bit, 64-bit
    Byte Order:          Little Endian
    Address sizes:       43 bits physical, 48 bits virtual
    CPU(s):              32
    On-line CPU(s) list: 0-31
    Thread(s) per core:  2
    Core(s) per socket:  16
    Socket(s):           1
    NUMA node(s):        4
    Vendor ID:           AuthenticAMD
    CPU family:          23
    Model:               1
    Model name:          AMD EPYC 7281 16-Core Processor
    Stepping:            2
    CPU MHz:             2682.467
    BogoMIPS:            4192.08
    Virtualization:      AMD-V
    L1d cache:           32K
    L1i cache:           64K
    L2 cache:            512K
    L3 cache:            4096K
    NUMA node0 CPU(s):   0,4,8,12,16,20,24,28
    NUMA node1 CPU(s):   1,5,9,13,17,21,25,29
    NUMA node2 CPU(s):   2,6,10,14,18,22,26,30
    NUMA node3 CPU(s):   3,7,11,15,19,23,27,31

    *root@guacamole2:/var/log# free -h*
                  total        used        free      shared  buff/cache   available
    Mem:           62Gi        34Gi        14Gi       675Mi        13Gi        26Gi
    Swap:          14Gi          0B        14Gi

    *root@guacamole2:/var/log# systemctl status guacd*
    ● guacd.service - Guacamole Server
       Loaded: loaded (/etc/systemd/system/guacd.service; disabled; vendor preset: enabled)
       Active: active (running) since Tue 2020-06-09 16:52:21 CEST; 5 months 16 days ago
         Docs: man:guacd(8)
     Main PID: 26032 (guacd)
        Tasks: 13053 (limit: 19660)
       Memory: 29.5G
       CGroup: /system.slice/guacd.service
               ├─   330 /usr/local/sbin/guacd -f
               ├─   451 /usr/local/sbin/guacd -f
               ├─   588 /usr/local/sbin/guacd -f

    .....


On the firewall side, since it is a high-end appliance, it is running at 3-4% of its capacity.

One of the problems I can see from the server logs is a recurring segfault of the guacd service:

    Nov 25 09:49:20 guacamole2 kernel: [14584906.960505] guacd[6601]: segfault at 20 ip 00007f46002f0eb5 sp 00007f401a58bb38 error 6 in libwinpr2.so.2.0.0[7f46002e0000+6b000]
    Nov 25 09:49:20 guacamole2 kernel: [14584906.960515] Code: 00 00 00 00 00 31 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f b7 47 08 c3 66 66 2e 0f 1f 84 00 00 00 00 00 b8 01 00 00 00 <f0> 0f c1 07 83 c0 01 c3 0f 1f 00 b8 ff ff ff ff f0 0f c1 07 83 e8

I thought it was due to an older version of libwinpr2, but upgrading to the latest version available in the Debian (Buster) repository did not solve the problem. Maybe I should change some kernel parameters?

I have an identical server (same hardware, same operating system) running Guacamole 1.2 and it doesn't show similar problems. I'll try to upgrade to Guacamole 1.2 to see if it solves the problem.

Kind regards,

Gianluca




        User "xxx" connected to connection "xxxDesktop"
        User "xxx" disconnected from connection "xxxDesktop".
        Duration: 73 milliseconds


    I'm trying to track the connection flow, but I'm unable to bind
    the catalina.out logs with the tomcat access logs and the guacd logs.
    If I search for "xxxDesktop" in the tomcat access logs, I can see
    entries like this:

        GET /guacamole-1.1.0/api/session/data/ldap/connections/xxxDesktop?token=<LONG_HEX_STRING>


    and if I then search for that long token string I can find some 404 responses:

        GET /guacamole-1.1.0/api/session/tunnels/*8ac12e84-b219-45fb-8912-e2fdc702c870*/activeConnection/connection/sharingProfiles?token=<LONG_HEX_STRING> HTTP/1.1" 404 210


This isn't necessarily an error. It means that Guacamole Client is looking for Sharing Profiles on an active connection related to that specific tunnel, and that none exist. In this case, the 404 is just how the API tells the web application that there are no Sharing Profiles for that particular tunnel.


    But I'm not able to correlate this kind of log entry with any entry
    in the guacd log file, so I can't trace back to the root cause of
    the disconnection. Here I'm assuming that there is some mapping
    between the tunnel ID (the string in the URL after "tunnels") and
    the guacd logs, but maybe I'm wrong.

    In the guacd log I see entries like this:

        nov 23 11:37:35 guacamole2 guacd[33384]: Last user of connection "$10c70db2-8d84-4b1c-aca4-28ed4f9e3a98" disconnected
        nov 23 11:37:35 guacamole2 guacd[26032]: guacd[33384]: INFO: User "@ee949374-6f5a-4f89-bd17-4b8931d3fdb5" disconnected (0 users remain)
        nov 23 11:37:35 guacamole2 guacd[26032]: guacd[33384]: INFO: Last user of connection "$10c70db2-8d84-4b1c-aca4-28ed4f9e3a98" disconnected
        nov 23 11:37:35 guacamole2 guacd[26032]: Connection "$10c70db2-8d84-4b1c-aca4-28ed4f9e3a98" removed.
        nov 23 11:37:35 guacamole2 guacd[26032]: guacd[26032]: INFO: Connection "$10c70db2-8d84-4b1c-aca4-28ed4f9e3a98" removed.


    but I see no way to map those user IDs and connection IDs to the
    actual connection in the Tomcat logs.


    Any suggestion?


My first suggestion would be to bump up the verbosity on both guacd and Guacamole Client such that you're getting more detailed error logs.  Instructions for doing this can be found at the following locations:

http://guacamole.apache.org/doc/gug/configuring-guacamole.html#webapp-logging
http://guacamole.apache.org/doc/gug/configuring-guacamole.html#guacd.conf

That should give you some more information as to why connections are closing down, and might even provide the information you need to link the connection information in guacd to that in Tomcat. There is a JIRA issue out there for improvements that would help correlate this information, but it hasn't gotten any attention yet:

https://issues.apache.org/jira/browse/GUACAMOLE-752
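
In the meantime, here is a rough sketch (not an official Guacamole tool) of how the identifiers could be pulled out of both logs so that entries can at least be lined up by timestamp. The regular expressions assume the line formats quoted earlier in this thread, the function name extract_ids is hypothetical, and nothing here establishes that a tunnel UUID maps directly to a guacd connection ID.

```python
import re

UUID = r"[0-9a-f-]{36}"

# guacd prefixes connection IDs with "$" and user IDs with "@".
GUACD_CONNECTION = re.compile(r'[Cc]onnection "\$(' + UUID + r')"')
GUACD_USER = re.compile(r'User "@(' + UUID + r')"')
# Tomcat access log: tunnel UUID in the REST path. The optional "*"
# only tolerates the emphasis markers used when quoting the line in mail.
TOMCAT_TUNNEL = re.compile(r"/api/session/tunnels/\*?(" + UUID + r")\*?/")

PATTERNS = {
    "connection": GUACD_CONNECTION,
    "user": GUACD_USER,
    "tunnel": TOMCAT_TUNNEL,
}


def extract_ids(lines):
    """Collect the distinct connection, user and tunnel IDs seen in the lines."""
    found = {name: set() for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            found[name].update(pattern.findall(line))
    return found
```

Feeding it the lines of the guacd log and of the Tomcat access log for the same time window gives the sets of IDs to compare against the debug output.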

-Nick


    Thank you in advance,
    Gianluca

-

Gianluca Renzi / Network engineer
+39.334.2312307 / [email protected]

N3tCom Sas
Office: +39.0775.1855155 / Fax: +39.0775.1850188
via F. Brighindi, 26
03100 Frosinone
http://www.n3tcom.com

This e-mail message may contain confidential or legally privileged information and is intended only for the use of the intended recipient(s). Any unauthorized disclosure, dissemination, distribution, copying or the taking of any action in reliance on the information herein is prohibited.
