> Not wanting to be impatient, but I'm very concerned about this > problem, since its impossible to predict when it will occur. Is there > additional information that I can provide to investigate this further?
I really need to know where pgpool is looping. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp > thanks > > On Tue, Nov 22, 2011 at 10:11 AM, Lonni J Friedman <netll...@gmail.com> wrote: >> This hadn't reproduced in a long time, but we ugpraded to pgpool-3.1 a >> week ago, and this morning I found a pgpool process that was consuming >> 100% CPU, and had been running for a week (although wasn't consuming >> 100% CPU the entire time). Something else weird is that it showed an >> active, idle connection from a client system which had only been up >> for the past 21 hours. Anyway, here's the backtrace from the process >> (gdb hung at the very bottom): >> >> [root ~]# gdb pgpool 31293 >> GNU gdb (GDB) Red Hat Enterprise Linux (7.1-29.el6) >> Copyright (C) 2010 Free Software Foundation, Inc. >> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> >> This is free software: you are free to change and redistribute it. >> There is NO WARRANTY, to the extent permitted by law. Type "show copying" >> and "show warranty" for details. >> This GDB was configured as "x86_64-redhat-linux-gnu". >> For bug reporting instructions, please see: >> <http://www.gnu.org/software/gdb/bugs/>... >> Reading symbols from /usr/sbin/pgpool...Reading symbols from >> /usr/lib/debug/usr/sbin/pgpool.debug...done. >> done. >> Attaching to program: /usr/sbin/pgpool, process 31293 >> Reading symbols from /usr/lib64/libpq.so.5...(no debugging symbols >> found)...done. >> Loaded symbols for /usr/lib64/libpq.so.5 >> Reading symbols from /usr/lib64/libpcp.so.0...Reading symbols from >> /usr/lib/debug/usr/lib64/libpcp.so.0.0.0.debug...done. >> done. >> Loaded symbols for /usr/lib64/libpcp.so.0 >> Reading symbols from /lib64/libpam.so.0...(no debugging symbols >> found)...done. >> Loaded symbols for /lib64/libpam.so.0 >> Reading symbols from /usr/lib64/libssl.so.10...(no debugging symbols >> found)...done. >> Loaded symbols for /usr/lib64/libssl.so.10 >> Reading symbols from /usr/lib64/libcrypto.so.10...(no debugging >> symbols found)...done. >> Loaded symbols for /usr/lib64/libcrypto.so.10 >> Reading symbols from /lib64/libcrypt.so.1...(no debugging symbols >> found)...done. >> Loaded symbols for /lib64/libcrypt.so.1 >> Reading symbols from /lib64/libresolv.so.2...(no debugging symbols >> found)...done. >> Loaded symbols for /lib64/libresolv.so.2 >> Reading symbols from /lib64/libnsl.so.1...(no debugging symbols >> found)...done. >> Loaded symbols for /lib64/libnsl.so.1 >> Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done. >> Loaded symbols for /lib64/libm.so.6 >> Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done. >> Loaded symbols for /lib64/libc.so.6 >> Reading symbols from /lib64/libgssapi_krb5.so.2...(no debugging >> symbols found)...done. >> Loaded symbols for /lib64/libgssapi_krb5.so.2 >> Reading symbols from /usr/lib64/libldap_r-2.4.so.2...(no debugging >> symbols found)...done. >> Loaded symbols for /usr/lib64/libldap_r-2.4.so.2 >> Reading symbols from /lib64/libpthread.so.0...(no debugging symbols >> found)...done. >> [Thread debugging using libthread_db enabled] >> Loaded symbols for /lib64/libpthread.so.0 >> Reading symbols from /lib64/libaudit.so.1...(no debugging symbols >> found)...done. >> Loaded symbols for /lib64/libaudit.so.1 >> Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done. >> Loaded symbols for /lib64/libdl.so.2 >> Reading symbols from /lib64/libkrb5.so.3...(no debugging symbols >> found)...done. >> Loaded symbols for /lib64/libkrb5.so.3 >> Reading symbols from /lib64/libcom_err.so.2...(no debugging symbols >> found)...done. >> Loaded symbols for /lib64/libcom_err.so.2 >> Reading symbols from /lib64/libk5crypto.so.3...(no debugging symbols >> found)...done. >> Loaded symbols for /lib64/libk5crypto.so.3 >> Reading symbols from /lib64/libz.so.1...(no debugging symbols found)...done. >> Loaded symbols for /lib64/libz.so.1 >> Reading symbols from /usr/lib64/libfreebl3.so...(no debugging symbols >> found)...done. >> Loaded symbols for /usr/lib64/libfreebl3.so >> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging >> symbols found)...done. >> Loaded symbols for /lib64/ld-linux-x86-64.so.2 >> Reading symbols from /lib64/libkrb5support.so.0...(no debugging >> symbols found)...done. >> Loaded symbols for /lib64/libkrb5support.so.0 >> Reading symbols from /lib64/libkeyutils.so.1...(no debugging symbols >> found)...done. >> Loaded symbols for /lib64/libkeyutils.so.1 >> Reading symbols from /usr/lib64/liblber-2.4.so.2...(no debugging >> symbols found)...done. >> Loaded symbols for /usr/lib64/liblber-2.4.so.2 >> Reading symbols from /usr/lib64/libsasl2.so.2...(no debugging symbols >> found)...done. >> Loaded symbols for /usr/lib64/libsasl2.so.2 >> Reading symbols from /lib64/libselinux.so.1...(no debugging symbols >> found)...done. >> Loaded symbols for /lib64/libselinux.so.1 >> Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols >> found)...done. >> Loaded symbols for /lib64/libnss_files.so.2 >> 0x00000000004192c0 in pool_process_query (frontend=0x2dd8fd0, >> backend=0x187d540, >> reset_request=<value optimized out>) at pool_process_query.c:379 >> 379 if >> (!pool_read_buffer_is_empty(MASTER(backend)) || >> pool_is_query_in_progress()) >> Missing separate debuginfos, use: debuginfo-install >> audit-libs-2.0.4-1.el6.x86_64 cyrus-sasl-lib-2.1.23-8.el6.x86_64 >> glibc-2.12-1.7.el6.x86_64 keyutils-libs-1.4-1.el6.x86_64 >> krb5-libs-1.8.2-3.el6.x86_64 libcom_err-1.41.12-3.el6.x86_64 >> libselinux-2.0.94-2.el6.x86_64 >> nss-softokn-freebl-3.12.7-1.1.el6.x86_64 openldap-2.4.19-15.el6.x86_64 >> openssl-1.0.0-4.el6.x86_64 pam-1.1.1-4.el6.x86_64 >> postgresql-libs-8.4.4-2.el6.x86_64 zlib-1.2.3-25.el6.x86_64 >> (gdb) bt >> #0 0x00000000004192c0 in pool_process_query (frontend=0x2dd8fd0, >> backend=0x187d540, >> reset_request=<value optimized out>) at pool_process_query.c:379 >> #1 0x000000000040ae42 in do_child (unix_fd=3, inet_fd=<value >> optimized out>) at child.c:354 >> #2 0x00000000004054c5 in fork_a_child (unix_fd=3, inet_fd=4, id=152) >> at main.c:1072 >> #3 0x0000000000407b1c in main (argc=<value optimized out>, >> argv=<value optimized out>) >> at main.c:549 >> (gdb) cont >> Continuing. >> >> On Tue, Sep 20, 2011 at 9:25 AM, Lonni J Friedman <netll...@gmail.com> wrote: >>> Nevermind, I figured out what I was doing wrong. Now I just need for >>> this to hang again. >>> >>> On Tue, Sep 20, 2011 at 9:08 AM, Lonni J Friedman <netll...@gmail.com> >>> wrote: >>>> I tried to do that, but pgpool refuses to start reporting: >>>> -bash: /usr/lib/debug/usr/sbin/pgpool.debug: bad ELF interpreter: No >>>> such file or directory >>>> >>>> I'm puzzled why it fails, as it was built on the same server where I >>>> built the (working) release build of pgpool. >>>> >>>> $ file /usr/lib/debug/usr/sbin/pgpool.debug >>>> /usr/lib/debug/usr/sbin/pgpool.debug: ELF 64-bit LSB executable, >>>> x86-64, version 1 (GNU/Linux), dynamically linked (uses shared libs), >>>> for GNU/Linux 2.6.18, not stripped >>>> $ file /usr/sbin/pgpool >>>> /usr/sbin/pgpool: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), >>>> dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped >>>> >>>> >>>> >>>> >>>> On Mon, Sep 19, 2011 at 7:10 PM, Tatsuo Ishii <is...@sraoss.co.jp> wrote: >>>>> It's really hard to find the cause of the problem from a stack trace >>>>> without symbol tables... Is it possible to reinstalll pgpool binary >>>>> with debug symbols? >>>>> -- >>>>> Tatsuo Ishii >>>>> SRA OSS, Inc. Japan >>>>> English: http://www.sraoss.co.jp/index_en.php >>>>> Japanese: http://www.sraoss.co.jp >>>>> >>>>>> This happened again. I ran the gdb command that you requested, >>>>>> however it occurred to me that the output may not be all that useful >>>>>> since I'm not running a debug build of pgpool: >>>>>> ############### >>>>>> # gdb pgpool 2343 >>>>>> GNU gdb (GDB) Red Hat Enterprise Linux (7.1-29.el6) >>>>>> Copyright (C) 2010 Free Software Foundation, Inc. >>>>>> License GPLv3+: GNU GPL version 3 or later >>>>>> <http://gnu.org/licenses/gpl.html> >>>>>> This is free software: you are free to change and redistribute it. >>>>>> There is NO WARRANTY, to the extent permitted by law. Type "show >>>>>> copying" >>>>>> and "show warranty" for details. >>>>>> This GDB was configured as "x86_64-redhat-linux-gnu". >>>>>> For bug reporting instructions, please see: >>>>>> <http://www.gnu.org/software/gdb/bugs/>... >>>>>> Reading symbols from /usr/sbin/pgpool...(no debugging symbols >>>>>> found)...done. >>>>>> Attaching to program: /usr/sbin/pgpool, process 2343 >>>>>> Reading symbols from /usr/lib64/libpq.so.5...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /usr/lib64/libpq.so.5 >>>>>> Reading symbols from /usr/lib64/libpcp.so.0...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /usr/lib64/libpcp.so.0 >>>>>> Reading symbols from /lib64/libpam.so.0...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libpam.so.0 >>>>>> Reading symbols from /usr/lib64/libssl.so.10...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /usr/lib64/libssl.so.10 >>>>>> Reading symbols from /usr/lib64/libcrypto.so.10...(no debugging >>>>>> symbols found)...done. >>>>>> Loaded symbols for /usr/lib64/libcrypto.so.10 >>>>>> Reading symbols from /lib64/libcrypt.so.1...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libcrypt.so.1 >>>>>> Reading symbols from /lib64/libresolv.so.2...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libresolv.so.2 >>>>>> Reading symbols from /lib64/libnsl.so.1...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libnsl.so.1 >>>>>> Reading symbols from /lib64/libm.so.6...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libm.so.6 >>>>>> Reading symbols from /lib64/libc.so.6...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libc.so.6 >>>>>> Reading symbols from /lib64/libgssapi_krb5.so.2...(no debugging >>>>>> symbols found)...done. >>>>>> Loaded symbols for /lib64/libgssapi_krb5.so.2 >>>>>> Reading symbols from /usr/lib64/libldap_r-2.4.so.2...(no debugging >>>>>> symbols found)...done. >>>>>> Loaded symbols for /usr/lib64/libldap_r-2.4.so.2 >>>>>> Reading symbols from /lib64/libpthread.so.0...(no debugging symbols >>>>>> found)...done. >>>>>> [Thread debugging using libthread_db enabled] >>>>>> Loaded symbols for /lib64/libpthread.so.0 >>>>>> Reading symbols from /lib64/libaudit.so.1...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libaudit.so.1 >>>>>> Reading symbols from /lib64/libdl.so.2...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libdl.so.2 >>>>>> Reading symbols from /lib64/libkrb5.so.3...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libkrb5.so.3 >>>>>> Reading symbols from /lib64/libcom_err.so.2...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libcom_err.so.2 >>>>>> Reading symbols from /lib64/libk5crypto.so.3...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libk5crypto.so.3 >>>>>> Reading symbols from /lib64/libz.so.1...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libz.so.1 >>>>>> Reading symbols from /usr/lib64/libfreebl3.so...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /usr/lib64/libfreebl3.so >>>>>> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging >>>>>> symbols found)...done. >>>>>> Loaded symbols for /lib64/ld-linux-x86-64.so.2 >>>>>> Reading symbols from /lib64/libkrb5support.so.0...(no debugging >>>>>> symbols found)...done. >>>>>> Loaded symbols for /lib64/libkrb5support.so.0 >>>>>> Reading symbols from /lib64/libkeyutils.so.1...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libkeyutils.so.1 >>>>>> Reading symbols from /usr/lib64/liblber-2.4.so.2...(no debugging >>>>>> symbols found)...done. >>>>>> Loaded symbols for /usr/lib64/liblber-2.4.so.2 >>>>>> Reading symbols from /usr/lib64/libsasl2.so.2...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /usr/lib64/libsasl2.so.2 >>>>>> Reading symbols from /lib64/libselinux.so.1...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libselinux.so.1 >>>>>> Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libnss_files.so.2 >>>>>> Reading symbols from /lib64/libnss_dns.so.2...(no debugging symbols >>>>>> found)...done. >>>>>> Loaded symbols for /lib64/libnss_dns.so.2 >>>>>> 0x000000000044790a in ?? () >>>>>> Missing separate debuginfos, use: debuginfo-install >>>>>> pgpool-II-3.0.4-1.el6.x86_64 >>>>>> (gdb) bt >>>>>> #0 0x000000000044790a in ?? () >>>>>> #1 0x0000000000414547 in ?? () >>>>>> #2 0x000000000041762e in ?? () >>>>>> #3 0x000000000040a4cd in ?? () >>>>>> #4 0x0000000000405345 in ?? () >>>>>> #5 0x00000000004068dc in ?? () >>>>>> #6 0x00000000004076dc in ?? () >>>>>> #7 0x00000031ae41ec5d in __libc_start_main () from /lib64/libc.so.6 >>>>>> #8 0x0000000000403bf9 in ?? () >>>>>> #9 0x00007fff0663cfc8 in ?? () >>>>>> #10 0x000000000000001c in ?? () >>>>>> #11 0x0000000000000004 in ?? () >>>>>> #12 0x00007fff0663d90b in ?? () >>>>>> #13 0x00007fff0663dfe6 in ?? () >>>>>> #14 0x00007fff0663dfe6 in ?? () >>>>>> #15 0x00007fff0663dfe6 in ?? () >>>>>> #16 0x0000000000000000 in ?? () >>>>>> (gdb) cont >>>>>> Continuing. >>>>>> >>>>>> ############### >>>>>> >>>>>> The entire session completely hung at the end there. >>>>>> >>>>>> >>>>>> On Wed, Sep 14, 2011 at 3:57 PM, Tatsuo Ishii <is...@sraoss.co.jp> wrote: >>>>>>> Please use gdb. For example, >>>>>>> >>>>>>> become postgres user (or root user) >>>>>>> gdb pgpool 29191 >>>>>>> bt >>>>>>> cont >>>>>>> bt >>>>>>> cont >>>>>>> : >>>>>>> : >>>>>>> : >>>>>>> >>>>>>> This will give us an idea where it's looping. >>>>>>> -- >>>>>>> Tatsuo Ishii >>>>>>> SRA OSS, Inc. Japan >>>>>>> English: http://www.sraoss.co.jp/index_en.php >>>>>>> Japanese: http://www.sraoss.co.jp >>>>>>> >>>>>>>> This problem has returned yet again: >>>>>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND >>>>>>>> 29191 postgres 20 0 80192 14m 1544 R 89.8 0.2 51:15.91 pgpool >>>>>>>> >>>>>>>> postgres 29191 3.4 0.1 80192 14728 ? R Sep13 51:40 >>>>>>>> pgpool: lfriedman nightly 10.31.96.84(61698) idle >>>>>>>> >>>>>>>> >>>>>>>> I'd really appreciate some input on how to debug this. >>>>>>>> >>>>>>>> >>>>>>>> On Fri, Sep 9, 2011 at 8:11 AM, Lonni J Friedman <netll...@gmail.com> >>>>>>>> wrote: >>>>>>>>> No one else has experienced this or has suggestions how to debug it? >>>>>>>>> >>>>>>>>> On Wed, Sep 7, 2011 at 12:49 PM, Lonni J Friedman >>>>>>>>> <netll...@gmail.com> wrote: >>>>>>>>>> Greetings, >>>>>>>>>> I'm running pgpool-3.0.4 on a Linux-x86_64 server serving as a load >>>>>>>>>> balancer for a three server postgresql-9.0.4 cluster (1 master, 2 >>>>>>>>>> standby). I'm seeing strange behavior where a single pgpool process >>>>>>>>>> seems to hang after some period of time, and then consume 100% of the >>>>>>>>>> CPU. I've seen this behavior happen twice since last Friday (when >>>>>>>>>> pgpool was brought online in my production environment). At the >>>>>>>>>> moment the current hung process looks like this in 'ps auxww' output: >>>>>>>>>> >>>>>>>>>> postgres 19838 98.7 0.0 68856 2904 ? R Sep06 1027:36 >>>>>>>>>> pgpool: lfriedman nightly 10.31.45.20(58277) idle >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> In top, I see: >>>>>>>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND >>>>>>>>>> 19838 postgres 20 0 68856 2904 1072 R 100.0 0.0 1027:29 pgpool >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> When to connect to the process with strace, there is no output, so >>>>>>>>>> I'm >>>>>>>>>> guessing the process is stuck spinning somewhere: >>>>>>>>>> # strace -p 19838 >>>>>>>>>> Process 19838 attached - interrupt to quit >>>>>>>>>> ... >>>>>>>>>> ^CProcess 19838 detached >>>>>>>>>> >>>>>>>>>> One thing that i'm certain of is that the client IP (10.31.45.20) >>>>>>>>>> associated with the hung process has rebooted at least once since >>>>>>>>>> that >>>>>>>>>> process was spawned. So pgpool seems to be in some confused state, >>>>>>>>>> as >>>>>>>>>> the client definitely severed the connection already. I checked the >>>>>>>>>> pgpool log and there are no explicit references to PID 19838. I'm at >>>>>>>>>> a loss how to debug this further, but clearly something is wrong >>>>>>>>>> somewhere, and this isn't normal/expected behavior. _______________________________________________ Pgpool-general mailing list Pgpool-general@pgfoundry.org http://pgfoundry.org/mailman/listinfo/pgpool-general