On Tue, Jun 29, 2010 at 10:03 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
> This is true. But what I'm concerned about is:
>
> 1. Backend writes and fsyncs the WAL to the disk
> 2. The WAL on the disk gets corrupted
> 3. Walsender reads and sends that corrupted WAL image
> 4. The master crashes because of the corruption of the disk
> 5. The standby attempts to replay the corrupted WAL... PANIC

That sounds like design behavior to me.

>> Well, if we want to leave it up to the user/clusterware, the current
>> code is possibly adequate, although there are many different log
>> messages that could signal this situation, so coding it up might not
>> be too trivial.
>
> So the current code + user-settable-retry-count seems good to me.
> If the retry-count is set to 0, we will not see the repeated log
> messages. And we might need to provide the parameter specifying
> how the standby should behave after exceeding the retry-count:
> PANIC or stay-alive-without-retries.
>
> Choosing PANIC and using the retry-count = 5 would cover your proposed
> patch.

I'm still having a hard time understanding why anyone would want to
configure this value as infinity.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
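
For illustration only, the retry-count proposal quoted above might be sketched roughly as follows. This is Python, not PostgreSQL code, and every name in it (replay_with_retries, OnExhausted, StandbyPanic, fetch_record) is hypothetical. It assumes a failed or corrupted fetch is signalled by None, and that a retry-count of N means one initial attempt plus up to N retries.

```python
from enum import Enum


class OnExhausted(Enum):
    """Hypothetical setting: what the standby does once retries run out."""
    PANIC = "panic"
    STAY_ALIVE = "stay_alive"  # stay up, but stop retrying


class StandbyPanic(Exception):
    """Stand-in for the standby giving up with a PANIC."""


def replay_with_retries(fetch_record, retry_count, on_exhausted):
    """Fetch one WAL record, retrying up to retry_count times.

    fetch_record is a callable returning a record, or None when the
    record could not be read or failed validation (an assumption of
    this sketch). Returns a (status, attempts) tuple.
    """
    attempts = 0
    while attempts <= retry_count:
        attempts += 1
        record = fetch_record()
        if record is not None:
            return ("replayed", attempts)

    # Retries exhausted: apply the configured policy.
    if on_exhausted is OnExhausted.PANIC:
        raise StandbyPanic(f"giving up after {attempts} attempts")
    return ("stopped_retrying", attempts)
```

With retry_count = 0 the sketch makes a single attempt and never repeats it, which matches the "we will not see the repeated log messages" case; retry_count = 5 with OnExhausted.PANIC corresponds to the "would cover your proposed patch" case. An infinite retry count would simply never reach the policy branch.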