On Fri, Jun 11, 2010 at 9:43 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On Thu, 2010-06-10 at 19:01 +0300, Heikki Linnakangas wrote:
>> >
>> > What "warning message" are we talking about? All the error cases I can
>> > think of in WAL-application are ERROR, or likely even PANIC.
>>
>> We're talking about a corrupt record (incorrect CRC, incorrect backlink
>> etc.), not errors within redo functions. During crash recovery, a
>> corrupt record means you've reached end of WAL. In standby mode, when
>> streaming WAL from master, that shouldn't happen, and it's not clear
>> what to do if it does. PANIC is not a good idea, at least if the server
>> uses hot standby, because that only makes the situation worse from
>> availability point of view. So we log the error as a WARNING, and keep
>> retrying. It's unlikely that the problem will just go away, but we keep
>> retrying anyway in the hope that it does. However, it seems that we're
>> too aggressive with the retries.
>
> If my streaming replication stops working, I want to know about it as
> soon as possible. WARNING just doesn't cut it.
>
> This needs some better thought.
>
> If we PANIC, then surely it will PANIC again when we restart unless we
> do something. So we can't do that. But we need to do something better
> than
>
> WARNING there is a bug that will likely cause major data loss
> HINT you'll be sacked if you miss this message

+1. I was making this same argument (less eloquently) upthread. I
particularly like the errhint().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company