Re: [GENERAL] swarm of processes in BIND state?

2016-05-31 Thread hubert depesz lubaczewski
On Mon, May 30, 2016 at 11:05:17AM -0700, Jeff Janes wrote: > So my theory is that you deleted a huge number of entries off from > either end of the index, that transaction committed, and that commit > became visible to all. Planning a mergejoin needs to dig through all > those tuples to probe the

Re: [GENERAL] swarm of processes in BIND state?

2016-05-30 Thread Jeff Janes
On Sat, May 28, 2016 at 11:32 AM, hubert depesz lubaczewski wrote: > On Sat, May 28, 2016 at 10:32:15AM -0700, Jeff Janes wrote: >> If that wasn't informative, I'd attach to one of the processes with >> the gdb debugger and get a backtrace. (You might want to do that a >> few times, just in case

Re: [GENERAL] swarm of processes in BIND state?

2016-05-28 Thread hubert depesz lubaczewski
On Sat, May 28, 2016 at 02:15:07PM -0400, Tom Lane wrote: > hubert depesz lubaczewski writes: > > Does that help us in any way? > > Not terribly. That confirms that the processes are contending for a > spinlock, but we can't tell which one. Can you collect a few stack traces > from those proces

Re: [GENERAL] swarm of processes in BIND state?

2016-05-28 Thread hubert depesz lubaczewski
On Sat, May 28, 2016 at 10:32:15AM -0700, Jeff Janes wrote: > > any clues on where to start diagnosing it? > > I'd start by using strace (with -y -ttt -T) on one of the processes > and see what it is doing. A lot of IO, and on what file? A lot of > semop's? So, I did: sudo strace -o bad.log -

Re: [GENERAL] swarm of processes in BIND state?

2016-05-28 Thread Tom Lane
hubert depesz lubaczewski writes: > Does that help us in any way? Not terribly. That confirms that the processes are contending for a spinlock, but we can't tell which one. Can you collect a few stack traces from those processes? regards, tom lane

Re: [GENERAL] swarm of processes in BIND state?

2016-05-28 Thread hubert depesz lubaczewski
On Sat, May 28, 2016 at 08:04:43AM +0200, Pavel Stehule wrote: > > > you should install debug info - or compile with debug symbols > > Installed debug info, and the problem stopped. OK. Got the problem back. ps looked like this: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME

Re: [GENERAL] swarm of processes in BIND state?

2016-05-28 Thread Jeff Janes
On Fri, May 27, 2016 at 10:19 PM, hubert depesz lubaczewski wrote: > hi, > we have the following situation: > pg 9.3.11 on ubuntu. > we have master and slave. > the db is large-ish, but we're removing *most* of its data from all > across the tables, and lots of tables too. > > while we're doing it, so

Re: [GENERAL] swarm of processes in BIND state?

2016-05-27 Thread hubert depesz lubaczewski
On Sat, May 28, 2016 at 07:46:52AM +0200, Pavel Stehule wrote: > you should install debug info - or compile with debug symbols Installed debug info, and the problem stopped. Don't think it's related - it could be just timing. I'll report back if/when the problem reappears. Best regards,

Re: [GENERAL] swarm of processes in BIND state?

2016-05-27 Thread Pavel Stehule
2016-05-28 7:45 GMT+02:00 hubert depesz lubaczewski: > On Sat, May 28, 2016 at 07:25:18AM +0200, Pavel Stehule wrote: > > It looks like a spinlock issue. > > Try looking at it with "perf top" > > First results look like: > > Samples: 64K of event 'cpu-clock', Event count (approx.): 2394094576 >

Re: [GENERAL] swarm of processes in BIND state?

2016-05-27 Thread hubert depesz lubaczewski
On Sat, May 28, 2016 at 07:25:18AM +0200, Pavel Stehule wrote: > It looks like a spinlock issue. > Try looking at it with "perf top" First results look like: Samples: 64K of event 'cpu-clock', Event count (approx.): 2394094576

Re: [GENERAL] swarm of processes in BIND state?

2016-05-27 Thread Pavel Stehule
Hi 2016-05-28 7:19 GMT+02:00 hubert depesz lubaczewski: > hi, > we have the following situation: > pg 9.3.11 on ubuntu. > we have master and slave. > the db is large-ish, but we're removing *most* of its data from all > across the tables, and lots of tables too. > > while we're doing it, sometimes,

[GENERAL] swarm of processes in BIND state?

2016-05-27 Thread hubert depesz lubaczewski
hi, we have the following situation: pg 9.3.11 on ubuntu. we have master and slave. the db is large-ish, but we're removing *most* of its data from all across the tables, and lots of tables too. while we're doing it, sometimes, we get LOTS of processes, but only on slave, never on master, that spend l