Please use gdb. For example, become the postgres user (or root user) and run:

gdb pgpool 29191
bt
cont
bt
cont
:
:

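In an interactive session, `cont` resumes the process, so you need to press Ctrl-C to get back to the gdb prompt before the next `bt`. If that is inconvenient, the same sampling can probably be done non-interactively. What follows is only a rough sketch: the function name `sample_bt`, the default count and interval, and the `GDB` override variable are made up for illustration. The gdb options used are `-p` (attach to a PID), `-batch` (run the given commands, then exit) and `-ex` (execute one command).

```shell
# Take several backtraces of a running process, a few seconds apart.
# usage: sample_bt PID [COUNT] [INTERVAL_SECONDS]
# Run it as the postgres user (or root) so gdb is allowed to attach.
sample_bt() {
    pid=$1
    count=${2:-5}
    interval=${3:-2}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== sample $i of $count (pid $pid) ==="
        # gdb attaches, prints one backtrace, then detaches and exits,
        # which lets the process keep running until the next sample.
        "${GDB:-gdb}" -p "$pid" -batch -ex bt 2>&1
        i=$((i + 1))
        if [ "$i" -le "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For the process above that would be something like `sample_bt 29191 5 2 > pgpool-bt.txt`. Frames that show up in every sample are the likely location of the loop.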
This will give us an idea where it's looping.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> This problem has returned yet again:
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 29191 postgres  20   0 80192  14m 1544 R 89.8  0.2  51:15.91 pgpool
>
> postgres 29191 3.4 0.1 80192 14728 ? R Sep13 51:40
> pgpool: lfriedman nightly 10.31.96.84(61698) idle
>
>
> I'd really appreciate some input on how to debug this.
>
>
> On Fri, Sep 9, 2011 at 8:11 AM, Lonni J Friedman <netll...@gmail.com> wrote:
>> No one else has experienced this or has suggestions how to debug it?
>>
>> On Wed, Sep 7, 2011 at 12:49 PM, Lonni J Friedman <netll...@gmail.com> wrote:
>>> Greetings,
>>> I'm running pgpool-3.0.4 on a Linux-x86_64 server serving as a load
>>> balancer for a three server postgresql-9.0.4 cluster (1 master, 2
>>> standby). I'm seeing strange behavior where a single pgpool process
>>> seems to hang after some period of time, and then consume 100% of the
>>> CPU. I've seen this behavior happen twice since last Friday (when
>>> pgpool was brought online in my production environment). At the
>>> moment the current hung process looks like this in 'ps auxww' output:
>>>
>>> postgres 19838 98.7 0.0 68856 2904 ? R Sep06 1027:36
>>> pgpool: lfriedman nightly 10.31.45.20(58277) idle
>>>
>>>
>>> In top, I see:
>>>   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
>>> 19838 postgres  20   0 68856 2904 1072 R 100.0  0.0  1027:29 pgpool
>>>
>>>
>>> When I connect to the process with strace, there is no output, so I'm
>>> guessing the process is stuck spinning somewhere:
>>> # strace -p 19838
>>> Process 19838 attached - interrupt to quit
>>> ...
>>> ^CProcess 19838 detached
>>>
>>> One thing that I'm certain of is that the client IP (10.31.45.20)
>>> associated with the hung process has rebooted at least once since that
>>> process was spawned.
>>> So pgpool seems to be in some confused state, as
>>> the client definitely severed the connection already. I checked the
>>> pgpool log and there are no explicit references to PID 19838. I'm at
>>> a loss how to debug this further, but clearly something is wrong
>>> somewhere, and this isn't normal/expected behavior.
> _______________________________________________
> Pgpool-general mailing list
> Pgpool-general@pgfoundry.org
> http://pgfoundry.org/mailman/listinfo/pgpool-general