Public bug reported:

Ubuntu 15.10

Postgresql 9.5+175.pgdg15.10+1

postgresql-common 175.pgdg15.10+1


# How to reproduce

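Run 'echo b > /proc/sysrq-trigger' while PostgreSQL is under load. This
reboots the machine immediately, without syncing or unmounting disks.

After the machine restarts, systemd tries to start the cluster through
pg_ctlcluster, and the start fails.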

Log messages:

2016-10-18 15:22:50 MSK [5513-1] LOG:  database system was interrupted; last known up at: 2016-10-18 15:08:50 MSK
2016-10-18 15:22:50 MSK [5513-2] LOG:  database system was not properly shut down; automatic recovery in progress
2016-10-18 15:22:50 MSK [5513-3] LOG:  redo starts at A/ED186BA0
2016-10-18 15:22:50 MSK [5530-1] [н/д]@[н/д] LOG:  incomplete startup packet
2016-10-18 15:22:51 MSK [5547-1] postgres@postgres FATAL:  the database system is starting up
2016-10-18 15:22:51 MSK [5550-1] postgres@postgres FATAL:  the database system is starting up
2016-10-18 15:22:52 MSK [5553-1] postgres@postgres FATAL:  the database system is starting up
2016-10-18 15:22:52 MSK [5556-1] postgres@postgres FATAL:  the database system is starting up
2016-10-18 15:22:53 MSK [5559-1] postgres@postgres FATAL:  the database system is starting up
2016-10-18 15:22:53 MSK [5562-1] postgres@postgres FATAL:  the database system is starting up
2016-10-18 15:22:54 MSK [5565-1] postgres@postgres FATAL:  the database system is starting up
2016-10-18 15:22:54 MSK [5570-1] postgres@postgres FATAL:  the database system is starting up
2016-10-18 15:22:55 MSK [5573-1] postgres@postgres FATAL:  the database system is starting up
2016-10-18 15:22:55 MSK [5576-1] postgres@postgres FATAL:  the database system is starting up
2016-10-18 15:22:56 MSK [5579-1] postgres@postgres FATAL:  the database system is starting up
2016-10-18 15:22:56 MSK [5508-1] LOG:  received smart shutdown request
2016-10-18 15:22:56 MSK [5580-1] LOG:  shutting down
2016-10-18 15:22:56 MSK [5580-2] LOG:  database system is shut down


# Why it happens

pg_ctlcluster checks whether the cluster is running by connecting with
psql.

The check is done in the pg_ctlcluster function cluster_port_ready:

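    # Poll until the same psql status repeats: 3 times in a row after a
    # success, 10 times in a row after a failure.
    while ($n < ($result ? 10 : 3)) {
        select undef, undef, undef, 0.5;    # sleep 0.5 s
        # stderr is captured in $out, stdout is discarded
        $out = `$psql -h '$sd' --port $p -l 2>&1 > /dev/null`;

        print STDERR "PSQL res: $out $?\n";

        if ($? == $result) {
            $n++;
        } else {
            $n = 0;
        }
        $result = $?;
    }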

The function checks the exit status of psql after each attempt. It
gives up once the same non-zero status has repeated 10 times in a row at
0.5 s intervals, so the postmaster gets only about 5 s to finish crash
recovery. After that, pg_ctlcluster returns exit code 1 and systemd
sends SIGTERM to postgres.
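
To make the timing concrete, here is a small Python model of the loop
above. It is only a sketch: the function name probes_until_verdict is
made up for illustration, FAILED is a stand-in for any non-zero psql
status, and the starting values ($n = 0, $result = 0) are an assumption,
since the excerpt does not show how they are initialised. Time spent
inside psql itself is ignored.

```python
from itertools import chain, repeat

FAILED = 1  # stand-in for any non-zero psql status


def probes_until_verdict(statuses, initial_result=0, interval=0.5):
    """Replay the cluster_port_ready loop over a stream of psql statuses.

    Returns (number of probes, seconds slept, final status).
    initial_result=0 is an assumption, not taken from pg_ctlcluster.
    """
    n = 0
    result = initial_result
    probes = 0
    stream = iter(statuses)
    # Same condition as the Perl loop: 10 repeats needed after a
    # failure, 3 repeats needed after a success.
    while n < (10 if result else 3):
        status = next(stream)
        probes += 1
        if status == result:
            n += 1
        else:
            n = 0
        result = status
    return probes, probes * interval, result


# Server stays in recovery: the loop gives up after 11 probes.
print(probes_until_verdict(repeat(FAILED)))                    # (11, 5.5, 1)
# Recovery finishes after 4 rejected probes: start is reported as OK.
print(probes_until_verdict(chain([FAILED] * 4, repeat(0))))    # (8, 4.0, 0)
```

Under these assumptions the failing case makes 11 probes, which agrees
with the 11 "the database system is starting up" lines in the log above.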


But the postmaster cannot accept any connections while crash recovery is
in progress:

postmaster.c:2164
                case CAC_STARTUP:
                        ereport(FATAL,
                                        (errcode(ERRCODE_CANNOT_CONNECT_NOW),
                                         errmsg("the database system is starting up")));
                        break;


# How to fix

Increase the timeout?

Check the error message returned on connect for "FATAL:  the database
system is starting up"?

Determine the recovery state and wait until recovery is done?
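
The second and third ideas could be combined roughly as in the Python
sketch below. This is not the pg_ctlcluster code and not a proposed
patch: wait_for_cluster, probe, recovery_timeout and max_failures are
hypothetical names, and the 300 s default is arbitrary. The idea is that
a "starting up" rejection means "keep waiting" (with a separate, longer
limit), while any other failure still counts towards the short limit.

```python
import time

STARTING_UP = "the database system is starting up"


def wait_for_cluster(probe, recovery_timeout=300.0, interval=0.5,
                     max_failures=10, sleep=time.sleep):
    """Poll until the server accepts connections.

    probe() must return (exit_status, stderr_text), for example by
    wrapping a psql call. Returns True once a probe succeeds, False if
    recovery outlasts recovery_timeout or if max_failures consecutive
    probes fail for some other reason.
    """
    waited = 0.0
    failures = 0
    while True:
        status, err = probe()
        if status == 0:
            return True
        if STARTING_UP in err:
            # Server is alive but still recovering: not a failure,
            # only bounded by the longer recovery timeout.
            failures = 0
            if waited >= recovery_timeout:
                return False
        else:
            failures += 1
            if failures >= max_failures:
                return False
        sleep(interval)
        waited += interval
```

Matching on the message text is fragile, because the server message may
be translated depending on the locale settings. pg_isready (shipped with
PostgreSQL 9.3 and later) reports the same condition through its exit
status: 1 means the server is rejecting connections, for example during
startup. A probe built on it would avoid parsing messages.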

** Affects: postgresql-common (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1634513

Title:
  Postgres cannot startup after crashing
