Re: [HACKERS] 10RC1 crash testing MultiXact oddity

2017-09-30 Thread Alvaro Herrera
Jeff Janes wrote:
> On Fri, Sep 22, 2017 at 1:19 PM, Robert Haas wrote:
> 
> > On Fri, Sep 22, 2017 at 3:39 PM, Jeff Janes wrote:
> > > It turns out it is not new in pg10.  I spotted it in the log file only by
> > > accident while looking for something else.  Now that I am looking for it,
> > > I do see it in 9.6 as well.
> >
> > So I guess the next question is whether it also shows up if you initdb
> > with 9.4.latest and then run the same test.
> >
> 
> git bisect shows that it shows up in 9.5, at this commit:
> 
> commit bd7c348d83a4576163b635010e49dbcac7126f01
> Author: Andres Freund 
> Date:   Sat Sep 26 19:04:25 2015 +0200
> 
> Rework the way multixact truncations work.

Oh man.  And I thought we were done with that stuff :-(

> Not really sure what the next step is here.  I could promote the
> ereport(LOG...) to a PANIC to get a core dump, but I don't think that would
> help because presumably the problem occurred early, when the truncation was
> done, not when it was detected.

Probably the best way to track it down is to add some instrumentation
elog(LOG) to the multixact truncation mechanism.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 10RC1 crash testing MultiXact oddity

2017-09-30 Thread Jeff Janes
On Fri, Sep 22, 2017 at 1:19 PM, Robert Haas wrote:

> On Fri, Sep 22, 2017 at 3:39 PM, Jeff Janes wrote:
> > It turns out it is not new in pg10.  I spotted it in the log file only by
> > accident while looking for something else.  Now that I am looking for it,
> > I do see it in 9.6 as well.
>
> So I guess the next question is whether it also shows up if you initdb
> with 9.4.latest and then run the same test.
>

git bisect shows that it shows up in 9.5, at this commit:

commit bd7c348d83a4576163b635010e49dbcac7126f01
Author: Andres Freund 
Date:   Sat Sep 26 19:04:25 2015 +0200

Rework the way multixact truncations work.

The patches which enable the crashes and the rapid consumption of xid and
multixact both need a little adjustment from the 10rc1 versions, so I'm
attaching a combined patch that applies to bd7c348d83.

Not really sure what the next step is here.  I could promote the
ereport(LOG...) to a PANIC to get a core dump, but I don't think that would
help because presumably the problem occurred early, when the truncation was
done, not when it was detected.

Cheers,

Jeff


crash_instrument.patch
Description: Binary data



Re: [HACKERS] 10RC1 crash testing MultiXact oddity

2017-09-22 Thread Robert Haas
On Fri, Sep 22, 2017 at 3:39 PM, Jeff Janes wrote:
> It turns out it is not new in pg10.  I spotted it in the log file only by
> accident while looking for something else.  Now that I am looking for it, I
> do see it in 9.6 as well.

So I guess the next question is whether it also shows up if you initdb
with 9.4.latest and then run the same test.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 10RC1 crash testing MultiXact oddity

2017-09-22 Thread Jeff Janes
On Fri, Sep 22, 2017 at 8:47 AM, Alvaro Herrera wrote:

> Jeff Janes wrote:
> > I am running some crash recovery testing against 10rc1 by injecting torn
> > page writes, using a test case which generates a lot of multixact, some
> > naturally by doing a lot of fk updates, but most artificially by calling
> > the pg_burn_multixact function from one of the attached patches.
>
> Is this new in pg10, or do you also see it in 9.6?
>

It turns out it is not new in pg10.  I spotted it in the log file only by
accident while looking for something else.  Now that I am looking for it, I
do see it in 9.6 as well.

Cheers,

Jeff


Re: [HACKERS] 10RC1 crash testing MultiXact oddity

2017-09-22 Thread Robert Haas
On Fri, Sep 22, 2017 at 11:37 AM, Jeff Janes wrote:
> Is the presence of this log message something that needs to be investigated
> further?

I'd say yes.  It sounds like we have a race condition someplace that
previous fixes in this area failed to adequately understand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 10RC1 crash testing MultiXact oddity

2017-09-22 Thread Alvaro Herrera
Jeff Janes wrote:
> I am running some crash recovery testing against 10rc1 by injecting torn
> page writes, using a test case which generates a lot of multixact, some
> naturally by doing a lot of fk updates, but most artificially by calling
> the pg_burn_multixact function from one of the attached patches.

Is this new in pg10, or do you also see it in 9.6?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

