Simon Riggs writes:
> On 11 October 2017 at 08:09, Christophe Pettus wrote:
>> While it's certainly true that this was an extreme case, it was a real-life
>> production situation. The concern here is that in the actual production
>> situation, the
On 11 October 2017 at 08:09, Christophe Pettus wrote:
>
>> On Oct 10, 2017, at 23:54, Simon Riggs wrote:
>>
>> The use case described seems incredibly
>> unreal and certainly amenable to being rewritten.
>
> While it's certainly true that this was an
> On Oct 10, 2017, at 23:54, Simon Riggs wrote:
>
> The use case described seems incredibly
> unreal and certainly amenable to being rewritten.
While it's certainly true that this was an extreme case, it was a real-life
production situation. The concern here is that in
On 10 October 2017 at 21:23, Tom Lane wrote:
> What I see is that, given this particular test case, the backend
> process on the master never holds more than a few locks at a time.
> Each time we abort a subtransaction, the AE lock it was holding
> on the temp table it
Christophe Pettus writes:
> I was able to reproduce this on 9.5.9 with the following:
Hmm ... so I still can't reproduce the specific symptoms Christophe
reports.
What I see is that, given this particular test case, the backend
process on the master never holds more than a
> On Oct 10, 2017, at 08:05, Tom Lane wrote:
>
> You're right, I was testing on HEAD, so that patch might've obscured
> the problem. But the code looks like it could still be O(N^2) in
> some cases. Will look again later.
I was able to reproduce this on 9.5.9 with the
Alvaro Herrera writes:
> Tom Lane wrote:
>> Hmm, I tried to reproduce this and could not. I experimented with
>> various permutations of this:
> This problem is probably related to commit 9b013dc238c, which AFAICS is
> only in pg10, not 9.5.
You're right, I was testing
Tom Lane wrote:
> Christophe Pettus writes:
> > The problem indeed appears to be a very large number of subtransactions,
> > each one creating a temp table, inside a single transaction. It's made
> > worse by one of those transactions finally getting replayed on the
> >
> On Oct 9, 2017, at 17:30, Tom Lane wrote:
>
> What am I missing to reproduce the problem?
Not sure. The actual client behavior here is a bit cryptic (not our code,
incomplete logs). They might be creating a savepoint before each temp table
creation, without a
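One plausible reading of that pattern (a sketch only; the client's actual code isn't available, and the table names here are invented):

```sql
-- Hypothetical shape of the workload: one long transaction, with a
-- savepoint opened before each temp table creation. Each savepoint is a
-- subtransaction, and each CREATE TEMP TABLE takes an AccessExclusiveLock
-- that is held until the (sub)transaction ends.
BEGIN;
SAVEPOINT s1;
CREATE TEMP TABLE scratch_1 (id int);
SAVEPOINT s2;
CREATE TEMP TABLE scratch_2 (id int);
-- ... repeated thousands of times, stacking subtransactions,
-- each holding its AccessExclusiveLock ...
COMMIT;
```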
> On Oct 9, 2017, at 18:21, Peter Geoghegan wrote:
> What's the hot_standby_feedback setting? How about
> max_standby_archive_delay/max_standby_streaming_delay?
On, 5m, 5m.
--
Christophe Pettus
x...@thebuild.com
--
Sent via pgsql-general mailing list
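For reference, those answers correspond to the following settings in the standby's postgresql.conf (values as reported; this fragment is a reconstruction, not pasted from the actual config):

```
hot_standby_feedback = on
max_standby_archive_delay = 5min
max_standby_streaming_delay = 5min
```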
On Mon, Oct 9, 2017 at 12:08 PM, Christophe Pettus wrote:
> Suggestions on further diagnosis?
What's the hot_standby_feedback setting? How about
max_standby_archive_delay/max_standby_streaming_delay?
--
Peter Geoghegan
Peter Geoghegan writes:
> Just a guess, but do you disable autovacuum on your dev machine? (I know I
> do.)
Nope.
regards, tom lane
On Mon, Oct 9, 2017 at 5:30 PM, Tom Lane wrote:
> and did not see any untoward behavior, at least not till I got to enough
> temp tables to overrun the master's shared lock table, and even then it
> cleaned up fine. At no point was the standby process consuming anywhere
>
Christophe Pettus writes:
> The problem indeed appears to be a very large number of subtransactions, each
> one creating a temp table, inside a single transaction. It's made worse by
> one of those transactions finally getting replayed on the secondary, only to
> have
On Mon, Oct 9, 2017 at 6:12 PM, Christophe Pettus wrote:
>
>> On Oct 9, 2017, at 14:29, Tom Lane wrote:
>> Hmm. Creating or dropping a temp table does take AccessExclusiveLock,
>> just as it does for a non-temp table. In principle we'd not have to
>>
> On Oct 9, 2017, at 14:29, Tom Lane wrote:
> Hmm. Creating or dropping a temp table does take AccessExclusiveLock,
> just as it does for a non-temp table. In principle we'd not have to
> transmit those locks to standbys, but I doubt that the WAL code has
> enough knowledge
Christophe Pettus writes:
>> On Oct 9, 2017, at 13:26, Tom Lane wrote:
>> My bet is that the source server did something that's provoking O(N^2)
>> behavior in the standby server's lock management. It's hard to say
>> exactly what, but I'm wondering about
> On Oct 9, 2017, at 13:26, Tom Lane wrote:
> My bet is that the source server did something that's provoking O(N^2)
> behavior in the standby server's lock management. It's hard to say
> exactly what, but I'm wondering about something like a plpgsql function
> taking an
> On Oct 9, 2017, at 13:26, Tom Lane wrote:
>
> Oh, that's really interesting. So it's not *just* releasing locks but
> also acquiring them, which says that it is making progress of some sort.
It seems to have leveled out now, and is still grinding away.
> Can you
Christophe Pettus writes:
>> On Oct 9, 2017, at 13:01, Tom Lane wrote:
>> Is that number changing at all?
> Increasing:
> AccessExclusiveLock | 8810
Oh, that's really interesting. So it's not *just* releasing locks but
also acquiring them, which says
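A per-mode tally like the `AccessExclusiveLock | 8810` figure quoted above can be produced with something along these lines (the exact query used in the thread isn't shown):

```sql
-- Count locks on the standby, grouped by mode; on a standby the
-- startup process is the one holding locks on behalf of replayed WAL.
SELECT mode, count(*)
FROM pg_locks
GROUP BY mode
ORDER BY count(*) DESC;
```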
> On Oct 9, 2017, at 13:01, Tom Lane wrote:
> Hmm. Is it possible that the process is replaying the abort of a
> transaction with a lot of subtransactions?
That's possible, although we're now talking about an hours-long delay.
> Is that number changing at
Christophe Pettus writes:
> The other observation is that the startup process is holding a *lot* of locks:
Hmm. Is it possible that the process is replaying the abort of a
transaction with a lot of subtransactions? It seems like maybe
you could be getting into an O(N^2)
On Oct 9, 2017, at 12:18, Christophe Pettus wrote:
>
> #0 0x558812f4f1da in ?? ()
> #1 0x558812f4f8cb in StandbyReleaseLockTree ()
> #2 0x558812d718ee in ?? ()
> #3 0x558812d75520 in xact_redo ()
> #4 0x558812d7f713 in StartupXLOG ()
> #5
> On Oct 9, 2017, at 12:10, Tom Lane wrote:
>
> Attach to startup process with gdb, and get a stack trace?
#0 0x558812f4f1da in ?? ()
#1 0x558812f4f8cb in StandbyReleaseLockTree ()
#2 0x558812d718ee in ?? ()
#3 0x558812d75520 in xact_redo ()
#4
Christophe Pettus writes:
> We're dealing with a 9.5.5 database with the symptom that, after a certain
> amount of time after restart, the startup process reaches a certain WAL
> segment, and stops. The startup process runs at 100% CPU, with no output
> from strace. There
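One additional data point worth collecting on a standby in this state (function name as of 9.5; it was renamed to pg_last_wal_replay_lsn() in v10):

```sql
-- If this value stops advancing while the startup process spins at
-- 100% CPU, replay is genuinely stalled rather than merely slow.
SELECT pg_last_xlog_replay_location();
```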
25 matches