Re: What to do when dynamic shared memory control segment is corrupt

2018-06-19 Thread Sherrylyn Branchaw
Yeah, I'd like to know that too. The complaint about corrupt shared memory may be just an unrelated red herring, or it might be a separate effect of whatever the primary failure was ... but I think it was likely not the direct cause of the failure-to-restart. Anyway, I would not be afraid to try

Re: What to do when dynamic shared memory control segment is corrupt

2018-06-19 Thread Alban Hertroys
> On 18 Jun 2018, at 17:34, Sherrylyn Branchaw wrote:
>
> In the other case, the logs recorded
>
> LOG: all server processes terminated; reinitializing
> LOG: dynamic shared memory control segment is corrupt
> LOG: incomplete data in "postmaster.pid": found only 1 newlines while trying
>

Re: What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Tom Lane
Sherrylyn Branchaw writes:
>> Hm ... were these installations built with --enable-cassert? If not,
>> an abort trap seems pretty odd.
> The packages are installed directly from the yum repos for RHEL. I'm not
> aware that --enable-cassert is being used, and we're certainly not
> installing from

Re: What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Sherrylyn Branchaw
> Hm ... were these installations built with --enable-cassert? If not,
> an abort trap seems pretty odd.

The packages are installed directly from the yum repos for RHEL. I'm not aware that --enable-cassert is being used, and we're certainly not installing from source.

> Those "incomplete data"

Re: What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Peter Geoghegan
On Mon, Jun 18, 2018 at 1:03 PM, Tom Lane wrote:
> Hm, I supposed that Sherrylyn would've noticed any PANIC entries in
> the log. The TRAP message from an assertion failure could've escaped
> notice though, even assuming that her logging setup captured it.

Unhandled C++ exceptions end up

Re: What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Tom Lane
Andres Freund writes:
> On 2018-06-18 12:30:13 -0400, Tom Lane wrote:
>> Sherrylyn Branchaw writes:
>>> LOG: server process (PID 138529) was terminated by signal 6: Aborted
>> Hm ... were these installations built with --enable-cassert? If not,
>> an abort trap seems pretty odd.
> PANIC does

Re: What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Andres Freund
On 2018-06-18 12:30:13 -0400, Tom Lane wrote:
> Sherrylyn Branchaw writes:
> > We are using Postgres 9.6.8 (planning to upgrade to 9.6.9 soon) on RHEL 6.9.
> > We recently experienced two similar outages on two different prod
> > databases. The error messages from the logs were as follows:
> >

Re: What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Tom Lane
Sherrylyn Branchaw writes:
> We are using Postgres 9.6.8 (planning to upgrade to 9.6.9 soon) on RHEL 6.9.
> We recently experienced two similar outages on two different prod
> databases. The error messages from the logs were as follows:
> LOG: server process (PID 138529) was terminated by signal

What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Sherrylyn Branchaw
Greetings,

We are using Postgres 9.6.8 (planning to upgrade to 9.6.9 soon) on RHEL 6.9. We recently experienced two similar outages on two different prod databases. The error messages from the logs were as follows:

LOG: server process (PID 138529) was terminated by signal 6: Aborted
LOG: