One suggestion would be to make *write* non-blocking. Any other suggestions?
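
To make the idea concrete, here is a minimal sketch of a bounded write, assuming the socket fd can be switched to O_NONBLOCK. The helpers `set_nonblocking` and `write_all_timeout` are hypothetical names for illustration and are not part of libguac. The point is only that a writer holding the socket mutex would give up after a timeout instead of sitting in write() forever when the peer stops reading.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a libguac function): put fd into non-blocking mode. */
static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper (not a libguac function): write len bytes to a
 * non-blocking fd. Whenever the fd is not writable, wait at most
 * timeout_ms for it to become writable again. Returns 0 on success,
 * -1 on error, and -1 with errno set to ETIMEDOUT if the wait expires.
 */
static int write_all_timeout(int fd, const void* buf, size_t len, int timeout_ms) {
    const char* p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n > 0) {
            /* Partial or full write: advance past what was accepted. */
            p += n;
            len -= (size_t) n;
            continue;
        }

        if (n == -1 && errno == EINTR)
            continue; /* Interrupted by a signal, retry the write. */

        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Output buffer is full: wait for space, but not forever. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            int r = poll(&pfd, 1, timeout_ms);

            if (r == 0) {
                errno = ETIMEDOUT;
                return -1;
            }
            if (r == -1 && errno != EINTR)
                return -1;
            continue;
        }

        return -1; /* Any other failure is reported to the caller. */
    }

    return 0;
}
```

On timeout the caller would release the mutex and tear down the connection, which should also let the thread blocked in pthread_mutex_lock() make progress. Whether this fits guacd's socket handling is something the maintainers would need to confirm.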
On Sat, Jul 25, 2020 at 4:43 AM Amarjeet Singh <[email protected]> wrote:

> Hi Team,
>
> More analysis on this :-
>
> There are threads which are in deadlock:
>
> THREAD *25377* is waiting for the mutex lock whereas THREAD *25376* is
> stuck in a *write* system call Because of which there are connections
> which are in CLOSE_WAIT.
> guacd is not able to free the resources as well.
>
> (gdb) info threads
> Id Target Id Frame
> 7 Thread 0x7fb3431ce700 (LWP 25374) "guacd" 0x00007fb7ad8fcf57 in
> pthread_join () from /lib64/libpthread.so.0
> * 6 Thread 0x7fb441bcb700 (LWP 25376) "guacd" 0x00007fb7ad9026ad in write
> () from /lib64/libpthread.so.0
> 5 Thread 0x7fb4423cc700 (LWP 25377) "guacd" 0x00007fb7ad90242d in
> __lll_lock_wait () from /lib64/libpthread.so.0
> 4 Thread 0x7fb3439cf700 (LWP 25395) "guacd" 0x00007fb7ac1ed7a3 in
> select () from /lib64/libc.so.6
> 3 Thread 0x7fb3441d0700 (LWP 25396) "guacd" 0x00007fb7ac1ed7a3 in
> select () from /lib64/libc.so.6
> 2 Thread 0x7fb3449d1700 (LWP 25397) "guacd" 0x00007fb7ac1ed7a3 in
> select () from /lib64/libc.so.6
> 1 Thread 0x7fb3429cd700 (LWP 23724) "guacd" 0x00007fb7ad902b5d in
> recvmsg () from /lib64/libpthread.so.0
> (gdb) thr 5
> [Switching to thread 5 (Thread 0x7fb4423cc700 (LWP 25377))]
> #0 0x00007fb7ad90242d in __lll_lock_wait () from /lib64/libpthread.so.0
> (gdb) bt
> #0 0x00007fb7ad90242d in __lll_lock_wait () from /lib64/libpthread.so.0
> #1 0x00007fb7ad8fddcb in _L_lock_812 () from /lib64/libpthread.so.0
> #2 0x00007fb7ad8fdc98 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3 0x00007fb7ae4c5345 in guac_socket_fd_write_handler () from
> /lib64/libguac.so.17
> #4 0x00007fb7ae4c4733 in __guac_socket_write () from /lib64/libguac.so.17
> #5 0x00007fb7ae4c4770 in guac_socket_write () from /lib64/libguac.so.17
> #6 0x00007fb7ae4c4a9a in guac_socket_write_string () from
> /lib64/libguac.so.17
> #7 0x00007fb7ae4c2365 in guac_protocol_send_error () from
> /lib64/libguac.so.17
> #8 0x00007fb7ae4c63cf in vguac_user_abort () from /lib64/libguac.so.17
> #9 0x00007fb7ae4c6495 in guac_user_abort () from /lib64/libguac.so.17
> #10 0x00007fb7ae4c7aa8 in guac_user_input_thread () from
> /lib64/libguac.so.17
> #11 0x00007fb7ad8fbe25 in start_thread () from /lib64/libpthread.so.0
> #12 0x00007fb7ac1f634d in clone () from /lib64/libc.so.6
>
> *MUTEX IS OWNED BY 25376*
>
>> 2 0x00007fb7ad8fdc98 in pthread_mutex_lock () from /lib64/libpthread.so.0
>> (gdb) info reg
>> rax 0xfffffffffffffe00 -512
>> rbx 0x0 0
>> rcx 0xffffffffffffffff -1
>> rdx 0x0 0
>> rsi 0x0 0
>> rdi 0x7fb7a001dc30 140426640219184
>> rbp 0x7fb4423cba00 0x7fb4423cba00
>> rsp 0x7fb4423cb9c8 0x7fb4423cb9c8
>> r8 0x7fb7a001dc30 140426640219184
>> r9 0x141d54 1318228
>> r10 0x2 2
>> r11 0x202 514
>> r12 0x0 0
>> r13 0x7fb4423cc9c0 140412182120896
>> r14 0x7fb4423cc700 140412182120192
>> r15 0x2a 42
>> rip 0x7fb7ad8fdc98 0x7fb7ad8fdc98 <pthread_mutex_lock+104>
>> eflags 0x202 [ IF ]
>> cs 0x33 51
>> ss 0x2b 43
>> ds 0x0 0
>> es 0x0 0
>> fs 0x0 0
>> gs 0x0 0
>> (gdb) print *((int*)(0x7fb7a001dc30)+2)
>> $6 = *25376*
>
> *STRACE of the THREAD is as follows : -*
>
> strace -p 25376
>> Process 25376 attached
>> write(4, "4.sync,10.1318124283;", 21
>
> Can I file a bug in JIRA ?
>
> Any suggestions how to fix the above ?
>
> *NOTE*: This happens intermittently.
>
> Thanks and Regards,
> Amarjeet Singh
>
> On Fri, Jul 17, 2020 at 8:43 AM Amarjeet Singh <[email protected]>
> wrote:
>
>> Hi Team,
>>
>> *GUACD* is consuming 100% of RAM. On analysis I have found that there
>> are many process which are not in any state [ CLOSE_WAIT, ESTABLISHED etc ]
>> but they are in
>> recvmsg waiting for the fd. This process is there for more than 2 days.
>> Below is the backtrace of the process.
>>
>> Reading symbols from /usr/lib64/freerdp/disp.so...Reading symbols from
>>> /usr/lib64/freerdp/disp.so...(no debugging symbols found)...done.
>>> (no debugging symbols found)...done.
>>> Loaded symbols for /usr/lib64/freerdp/disp.so
>>> 0x00007fa764807b5d in recvmsg () from /lib64/libpthread.so.0
>>> Missing separate debuginfos, use: debuginfo-install
>>> accops-server-8.0.0-2.x86_64
>>> (gdb) bt
>>> #0 0x00007fa764807b5d in recvmsg () from /lib64/libpthread.so.0
>>> #1 0x0000000000404a64 in guacd_recv_fd ()
>>> #2 0x0000000000404ed9 in guacd_exec_proc ()
>>> #3 0x0000000000405297 in guacd_create_proc ()
>>> #4 0x000000000040399f in guacd_route_connection ()
>>> #5 0x0000000000403ba7 in guacd_connection_thread ()
>>> #6 0x00007fa764800e25 in start_thread () from /lib64/libpthread.so.0
>>> #7 0x00007fa7630fb34d in clone () from /lib64/libc.so.6
>>
>> Please help me to understand what is going wrong here ? This is not
>> happening for every connections. Is there any way we can fix this ?
>> There are many connections which are in CLOSE_WAIT ( parent process id )
>> also. They are there for many days.
>>
>> Amarjeet Singh
>>
>
